# Policy & Exploration Guide — DualEA System

**ML policy gating, fallback modes, and exploration system**

---

## Table of Contents

1. [Policy System Overview](#policy-system-overview)
2. [Policy Structure](#policy-structure)
3. [Policy Gating Logic](#policy-gating-logic)
4. [Policy Fallback Modes](#policy-fallback-modes)
5. [Policy Scaling](#policy-scaling)
6. [Exploration Mode](#exploration-mode)
7. [Exploration Caps](#exploration-caps)
8. [Troubleshooting](#troubleshooting)

---

## Policy System Overview

### Purpose

The **policy system** allows ML models to influence trading decisions through:
- **Confidence gating**: Block low-confidence predictions
- **Parameter scaling**: Adjust SL/TP/lots based on model output
- **Risk management**: Dynamic position sizing based on predicted outcomes

### Files

**policy.json**: ML-generated trading policy
- Path: `Common/Files/DualEA/policy.json`
- Generated by: `ML/policy_export.py`
- Structure: Per-slice (strategy|symbol|timeframe) probabilities and scaling multipliers

**policy.backup.json**: Last-known-good policy
- Used for rollback on parse/plausibility failures
- Updated on successful policy load

**policy.reload**: Trigger file for hot-reload
- PaperEA_v2: Hot-reload supported via `policy.reload` + HTTP polling
- LiveEA: Load on init only (restart required for updates)

---

## Policy Structure

### JSON Schema

```json
{
  "version": "1.0",
  "generated_at": "2025-01-15T10:00:00Z",
  "model_hash": "abc123...",
  "train_window": {"start": "2024-01-01", "end": "2025-01-15"},
  "metrics": {
    "roc_auc": 0.72,
    "brier_score": 0.18,
    "expected_r": 0.65
  },
  "min_confidence": 0.55,
  "slices": [
    {
      "strategy": "ADXStrategy",
      "symbol": "EURUSD",
      "timeframe": 60,
      "probability": 0.75,
      "sl_mult": 1.0,
      "tp_mult": 1.2,
      "lot_mult": 1.0,
      "trail_mult": 1.0
    },
    ...
  ]
}
```

### Fields

**Global**:
- `version`: Schema version
- `generated_at`: Timestamp of policy generation
- `model_hash`: Trained model identifier for provenance
- `train_window`: Data range used for training
- `metrics`: Model performance metrics
- `min_confidence`: Global minimum confidence threshold

> **Note:** The per-slice names below (`p_win`, `sl_scale`, `tp_scale`, `trail_atr_mult`) differ from those in the schema example above (`probability`, `sl_mult`, `tp_mult`, `lot_mult`, `trail_mult`). Check which set your exporter and EA build actually use.

**Per-Slice** (as implemented in LiveEA; PaperEA_v2 uses minimal parsing):
- `strategy`: Strategy name (e.g., "ADXStrategy")
- `symbol`: Trading symbol (e.g., "EURUSD")
- `timeframe`: Timeframe in minutes (e.g., 60 for H1)
- `p_win`: ML model confidence (0.0-1.0)
- `sl_scale`: Stop loss scaling multiplier (optional)
- `tp_scale`: Take profit scaling multiplier (optional)
- `trail_atr_mult`: Trailing stop ATR multiplier (optional)
- `confidence`: Alternative confidence field (optional)

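
The plausibility check that decides between `policy.json` and `policy.backup.json` is not described in this guide. As a rough illustration only, the Python sketch below (Python being the language of `ML/policy_export.py`) checks a parsed policy against the field names in the schema example. The helper names and the bounds (probabilities in 0 to 1, strictly positive multipliers) are assumptions for illustration, not the EA's actual rules. `validate_policy` returns a list of problems, and an empty list means the policy passed every check.

```python
import math

# Hypothetical helper: field names follow the JSON schema example in this guide;
# the numeric bounds are illustrative assumptions, not the EA's real checks.
MULTIPLIER_FIELDS = ("sl_mult", "tp_mult", "lot_mult", "trail_mult")


def _is_number(value):
    """True for finite int/float values (bool is excluded)."""
    return (
        isinstance(value, (int, float))
        and not isinstance(value, bool)
        and math.isfinite(value)
    )


def validate_policy(policy):
    """Return a list of human-readable problems; an empty list means plausible."""
    errors = []

    min_conf = policy.get("min_confidence")
    if not _is_number(min_conf) or not 0.0 <= min_conf <= 1.0:
        errors.append("min_confidence must be a number in [0, 1]")

    slices = policy.get("slices")
    if not isinstance(slices, list) or not slices:
        errors.append("slices must be a non-empty list")
        return errors

    for i, s in enumerate(slices):
        if not isinstance(s, dict):
            errors.append(f"slice {i}: must be an object")
            continue
        for key in ("strategy", "symbol", "timeframe"):
            if key not in s:
                errors.append(f"slice {i}: missing {key}")
        prob = s.get("probability")
        if not _is_number(prob) or not 0.0 <= prob <= 1.0:
            errors.append(f"slice {i}: probability must be in [0, 1]")
        for key in MULTIPLIER_FIELDS:
            # Multipliers are treated as optional; a missing one counts as neutral (1.0).
            mult = s.get(key, 1.0)
            if not _is_number(mult) or mult <= 0:
                errors.append(f"slice {i}: {key} must be a positive number")
    return errors
```
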
---

## Policy Gating Logic

### Configuration

```cpp
input bool UsePolicyGating = true;           // Enable ML policy gating
input bool DefaultPolicyFallback = true;     // Enable fallback modes
input bool FallbackDemoOnly = true;          // Restrict fallback to demo
input bool FallbackWhenNoPolicy = true;      // Fallback if policy not loaded
input bool FallbackWhenSliceMissing = true;  // Fallback if slice missing
```

### Decision Flow

```
Is UsePolicyGating=true?
  No  → PASS (no policy gating)
  Yes → Continue

Is policy loaded (slices > 0)?
  No  → Check FallbackWhenNoPolicy
    Yes → Check FallbackDemoOnly
      false              → FALLBACK (neutral scaling, any account)
      true, demo account → FALLBACK (neutral scaling)
      true, live account → BLOCK
    No  → BLOCK (reason: "no policy loaded")
  Yes → Continue

Does exact slice exist (strategy|symbol|TF)?
  No  → Check FallbackWhenSliceMissing
    Yes → Check FallbackDemoOnly
      false              → FALLBACK (neutral scaling, any account)
      true, demo account → FALLBACK (neutral scaling)
      true, live account → BLOCK
    No  → BLOCK (reason: "policy slice missing")
  Yes → Continue

Is slice.probability >= policy.min_confidence?
  No  → BLOCK (reason: "confidence too low")
  Yes → PASS + Apply scaling
```

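
To experiment with the decision flow outside the terminal, it can be mirrored in a few lines of Python. This is a sketch under assumptions, not the EA's code: `GateConfig`, `gate`, and the returned reason strings are illustrative names, the policy is a plain dictionary shaped like the schema example, and `DefaultPolicyFallback` is left out because the flow above does not say how it participates.

```python
from dataclasses import dataclass


@dataclass
class GateConfig:
    """Mirrors the EA inputs listed under Configuration (same defaults)."""
    use_policy_gating: bool = True
    fallback_demo_only: bool = True
    fallback_when_no_policy: bool = True
    fallback_when_slice_missing: bool = True


def _fallback(enabled, cfg, is_demo, reason):
    """Shared fallback branch: allow with neutral scaling, or block."""
    if not enabled:
        return ("BLOCK", reason)
    if cfg.fallback_demo_only and not is_demo:
        return ("BLOCK", reason)  # demo-only fallback on a live account
    return ("FALLBACK", reason)


def gate(policy, strategy, symbol, timeframe, is_demo, cfg=None):
    """Return (decision, reason); decision is PASS, FALLBACK, or BLOCK."""
    cfg = cfg or GateConfig()
    if not cfg.use_policy_gating:
        return ("PASS", "policy gating disabled")

    slices = (policy or {}).get("slices", [])
    if not slices:
        return _fallback(cfg.fallback_when_no_policy, cfg, is_demo, "no policy loaded")

    wanted = (strategy, symbol, timeframe)
    match = next(
        (s for s in slices if (s["strategy"], s["symbol"], s["timeframe"]) == wanted),
        None,
    )
    if match is None:
        return _fallback(cfg.fallback_when_slice_missing, cfg, is_demo, "policy slice missing")

    if match["probability"] < policy["min_confidence"]:
        return ("BLOCK", "confidence too low")
    return ("PASS", "apply scaling")
```

With the default configuration, a missing slice yields `FALLBACK` on a demo account and `BLOCK` on a live one, which matches the two `FallbackDemoOnly` branches in the flow.
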
### Code Example

```cpp
bool ApplyPolicyGating(SignalData &signal, string &reason) {
    if(!UsePolicyGating) return true;

    // Check if policy loaded
    if(policyData.slices == 0) {
        return HandleFallback("no_policy", signal, reason);
    }

    // Find policy slice
    PolicySlice slice = policyData.Find(
        signal.strategy_name,
        symbol,
        timeframe
    );

    if(slice.probability < 0) {  // Slice not found
        return HandleFallback("slice_missing", signal, reason);
    }

    // Check confidence threshold
    if(slice.probability < policyData.min_confidence) {
        reason = StringFormat("confidence %.2f < %.2f",
                              slice.probability, policyData.min_confidence);
        return false;
    }

    // Apply policy scaling
    ApplyPolicyScaling(signal, slice);

    return true;
}
```

---

## Policy Fallback Modes

### Fallback: No Policy

**Triggers when**:
- `UsePolicyGating=true`
- Policy file not loaded OR `policy.slices == 0`
- `FallbackWhenNoPolicy=true`

**Behavior**:
- Check `FallbackDemoOnly`:
  - If `true` and demo account → ALLOW with neutral scaling
  - If `true` and live account → BLOCK
  - If `false` → ALLOW with neutral scaling (any account)
- Bypasses insights thresholds
- Bypasses exploration caps
- No SL/TP/lot/trail multipliers applied

**Log Example**:
```
FALLBACK: no policy loaded -> neutral scaling used for ADXStrategy on EURUSD/H1 demo=true
```

**Telemetry**:
```csv
policy_fallback,EURUSD,60,ADXStrategy,no_policy,true,true
```

### Fallback: Slice Missing

**Triggers when**:
- `UsePolicyGating=true`
- Policy IS loaded (`policy.slices > 0`)
- Exact strategy|symbol|timeframe slice not found in policy
- `FallbackWhenSliceMissing=true`

**Behavior**:
- Same as "No Policy" fallback
- Allows trade with neutral scaling
- Demo-only restriction if `FallbackDemoOnly=true`

**Log Example**:
```
FALLBACK: policy slice missing -> neutral scaling used for BollAverages on GBPUSD/H1 demo=true
```

### Neutral Scaling

**Definition**: No adjustments are applied; the trade uses its base parameters.

```cpp
sl_mult = 1.0
tp_mult = 1.0
lot_mult = 1.0
trail_mult = 1.0
```

Trade proceeds with:
- Original SL/TP from strategy
- Original lot size from risk calculation
- Original trailing stop settings (if enabled)

### Fallback Safety

**Why demo-only by default?**
- Fallback is permissive (allows trades without ML validation)
- Intended for data collection, not production live trading
- In live trading, you want explicit policy coverage for all slices

**When to disable FallbackDemoOnly?**
- After verifying policy coverage is comprehensive
- When transitioning from demo to live with the same strategy set
- With explicit risk acceptance of trading without ML guidance

**Best Practice**:
- Keep `FallbackDemoOnly=true` until the policy is proven
- Monitor `policy_fallback` telemetry events
- Aim to eliminate fallbacks by improving policy coverage

---

## Policy Scaling

### Policy Scaling (LiveEA Only)

> **Note:** Full policy scaling with per-slice multipliers is implemented in LiveEA. PaperEA_v2 has minimal policy parsing (checks `min_confidence` only).

**Scaling Application (LiveEA):**
```cpp
void ApplyPolicyScaling(SignalData &signal, PolicySlice &slice) {
    // SL scaling
    double slDistance = MathAbs(signal.entry_price - signal.stop_loss);
    if(signal.direction == 1) {  // Buy
        signal.stop_loss = signal.entry_price - (slDistance * slice.sl_scale);
    } else {  // Sell
        signal.stop_loss = signal.entry_price + (slDistance * slice.sl_scale);
    }

    // TP scaling
    double tpDistance = MathAbs(signal.take_profit - signal.entry_price);
    if(signal.direction == 1) {  // Buy
        signal.take_profit = signal.entry_price + (tpDistance * slice.tp_scale);
    } else {  // Sell
        signal.take_profit = signal.entry_price - (tpDistance * slice.tp_scale);
    }

    // Trailing scaling (if enabled)
    if(TrailEnabled && slice.trail_atr_mult > 0) {
        // Apply trail ATR multiplier
    }
}
```

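
A worked example makes the distance arithmetic concrete. The Python sketch below reproduces only the SL/TP part of the snippet above. The function name `scale_levels` and the sample prices are illustrative assumptions.

```python
def scale_levels(entry, stop_loss, take_profit, direction, sl_scale=1.0, tp_scale=1.0):
    """Scale SL/TP distances around the entry price.

    direction: 1 for buy, anything else for sell (as in the LiveEA snippet).
    Returns (new_stop_loss, new_take_profit).
    """
    sl_distance = abs(entry - stop_loss) * sl_scale
    tp_distance = abs(take_profit - entry) * tp_scale
    if direction == 1:  # buy: stop below entry, target above
        return entry - sl_distance, entry + tp_distance
    return entry + sl_distance, entry - tp_distance  # sell: mirrored


# Buy at 1.1000 with a 50-pip stop and a 100-pip target,
# scaled by 0.8 (tighter stop) and 1.5 (wider target).
sl, tp = scale_levels(1.1000, 1.0950, 1.1100, direction=1, sl_scale=0.8, tp_scale=1.5)
print(f"SL={sl:.4f} TP={tp:.4f}")  # SL=1.0960 TP=1.1150
```

The stop distance shrinks from 0.0050 to 0.0040 and the target distance grows from 0.0100 to 0.0150, so the reward-to-risk ratio moves from 2.0 to 3.75.
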
### Scaling Examples

**Conservative Scaling** (low confidence):
```json
{
  "probability": 0.60,
  "sl_mult": 0.8,     // Tighter stop
  "tp_mult": 1.5,     // Wider target (better RR)
  "lot_mult": 0.5,    // Smaller position
  "trail_mult": 0.9   // Tighter trail
}
```

**Aggressive Scaling** (high confidence):
```json
{
  "probability": 0.85,
  "sl_mult": 1.2,     // Wider stop (more breathing room)
  "tp_mult": 0.8,     // Closer target (take profits faster)
  "lot_mult": 1.5,    // Larger position
  "trail_mult": 1.0   // Normal trail
}
```

**Neutral Scaling** (fallback):
```json
{
  "probability": 0.0,
  "sl_mult": 1.0,
  "tp_mult": 1.0,
  "lot_mult": 1.0,
  "trail_mult": 1.0
}
```

---

## Exploration Mode

### Purpose

**Bootstrap insights** for strategy|symbol|timeframe slices that lack sufficient historical data.

Without exploration:
- New slices are blocked by insights gating (no data → can't pass thresholds)
- Chicken-and-egg problem: can't trade → can't collect data → can't build insights

With exploration:
- Limited trades allowed for no-data slices
- Caps prevent excessive exposure
- Data collected → insights built → insights gating takes over

### When Exploration Triggers

**Conditions**:
1. `ExploreOnNoSlice=true`
2. Insights gating is enabled (`UseInsightsGating=true`)
3. No slice exists in insights.json for this strategy|symbol|timeframe
4. Exploration caps not exceeded

**Important**: The exploration bypass applies ONLY when the slice is truly missing. If the slice exists but fails thresholds (e.g., low win rate), the trade is BLOCKED (no bypass).

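
The no-slice-only rule can be stated compactly in code. The Python sketch below assumes insights gating is already enabled and models a slice as a dictionary with a `win_rate` entry. That layout, the single win-rate threshold, and the function name `insights_gate` are assumptions for illustration; the real `insights.json` structure is not described in this guide.

```python
def insights_gate(slice_stats, min_win_rate, explore_on_no_slice, caps_ok):
    """Decide how insights gating treats a signal.

    slice_stats: dict with a "win_rate" entry, or None when insights.json
                 has no entry for this strategy|symbol|timeframe.
    caps_ok:     True while the daily and weekly exploration caps allow a trade.
    Returns "PASS", "EXPLORE", or "BLOCK".
    """
    if slice_stats is None:
        # Slice truly missing: the only case where exploration may bypass gating.
        if explore_on_no_slice and caps_ok:
            return "EXPLORE"
        return "BLOCK"
    if slice_stats["win_rate"] < min_win_rate:
        # Slice exists but fails the threshold: blocked, no exploration bypass.
        return "BLOCK"
    return "PASS"
```

For instance, `insights_gate({"win_rate": 0.42}, 0.50, True, True)` evaluates to `"BLOCK"` even though exploration is enabled and the caps have room.
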
### Configuration

```cpp
input bool ExploreOnNoSlice = true;          // Enable exploration when no slice exists
input int  ExploreMaxPerSlicePerDay = 100;   // Daily cap per slice (default 100)
input int  ExploreMaxPerSlice = 100;         // Weekly cap per slice (default 100)
```

> **Note:** Previous documentation listed defaults of 2/3. Actual code defaults are 100/100, effectively unlimited for most practical purposes. The `UseExploration` input does not exist; exploration is controlled via `ExploreOnNoSlice`.

---

## Exploration Caps

### Cap Types

**Daily Cap**: `ExploreMaxPerSlicePerDay`
- Resets at midnight (00:00 server time)
- Per-slice basis (each strategy|symbol|TF tracked separately)
- Default: 100 trades/day/slice

**Weekly Cap**: `ExploreMaxPerSlice`
- Resets on Monday
- Week bucket = Monday of the week (yyyymmdd format)
- Default: 100 trades/week/slice

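
The bucket keys are easy to reproduce when inspecting counters by hand. A minimal Python sketch follows, with hypothetical helper names; it assumes the date passed in has already been taken from server time, since that is the clock the resets follow.

```python
from datetime import date, timedelta


def day_bucket(d):
    """Daily bucket key in yyyymmdd form."""
    return d.strftime("%Y%m%d")


def week_bucket(d):
    """Weekly bucket key: the Monday of d's week, in yyyymmdd form."""
    monday = d - timedelta(days=d.weekday())  # weekday(): Monday == 0
    return monday.strftime("%Y%m%d")


print(day_bucket(date(2025, 1, 15)))   # 20250115
print(week_bucket(date(2025, 1, 15)))  # 20250113
```

Every date from Monday 2025-01-13 through Sunday 2025-01-19 maps to the same weekly bucket, `20250113`.
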
### Counter Persistence

**Files**:
- `Common/Files/DualEA/explore_counts_day.csv`
- `Common/Files/DualEA/explore_counts.csv`

**Format**:
```csv
key,date_yyyymmdd,count
ADXStrategy|EURUSD|60,20250115,2
BollAverages|GBPUSD|60,20250115,1
```

For weekly:
```csv
key,week_monday_yyyymmdd,count
ADXStrategy|EURUSD|60,20250113,3
```

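
For offline inspection, the counter files can be read with Python's standard `csv` module. The sketch below is an assumption-laden illustration rather than the EA's loader: `load_counts` and `cap_reached` are hypothetical names, and treating rows from an older bucket as absent is a guess at how a reset might look from the reader's side.

```python
import csv
import io


def load_counts(csv_text, bucket):
    """Map slice key -> count for rows whose bucket column equals `bucket`.

    The bucket is read from the second column, whatever its header is called
    (date_yyyymmdd in the daily file, week_monday_yyyymmdd in the weekly one).
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    counts = {}
    for row in reader:
        if len(row) != 3:
            continue  # ignore blank or malformed lines
        key, row_bucket, count = row
        if row_bucket == bucket:
            counts[key] = int(count)
    return counts


def cap_reached(count, cap):
    """A cap of 0 or less disables the check, as in CheckExplorationCaps."""
    return cap > 0 and count >= cap
```

Calling `load_counts` on the daily sample above with bucket `"20250115"` gives a count of 2 for `ADXStrategy|EURUSD|60`, so `cap_reached(2, 2)` is true while `cap_reached(2, 100)` is false.
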
### Cap Checking

```cpp
bool CheckExplorationCaps(string strategy, string symbol, int tf, string &reason) {
    string sliceKey = strategy + "|" + symbol + "|" + IntegerToString(tf);

    // Load counters
    int dayCount = LoadExploreCountDay(sliceKey);
    int weekCount = LoadExploreCountWeek(sliceKey);

    // Check daily cap
    if(ExploreMaxPerSlicePerDay > 0 && dayCount >= ExploreMaxPerSlicePerDay) {
        reason = StringFormat("explore_cap_day (day=%d/%d, week=%d/%d)",
                              dayCount, ExploreMaxPerSlicePerDay,
                              weekCount, ExploreMaxPerSlice);
        telemetry.Event("explore_block_day", sliceKey, dayCount, weekCount);
        return false;
    }

    // Check weekly cap
    if(ExploreMaxPerSlice > 0 && weekCount >= ExploreMaxPerSlice) {
        reason = StringFormat("explore_cap_week (day=%d/%d, week=%d/%d)",
                              dayCount, ExploreMaxPerSlicePerDay,
                              weekCount, ExploreMaxPerSlice);
        telemetry.Event("explore_block_week", sliceKey, dayCount, weekCount);
        return false;
    }

    // Increment counters
    IncrementExploreCountDay(sliceKey);
    IncrementExploreCountWeek(sliceKey);

    // Allow exploration
    Log(StringFormat("GATE: explore allow %s on %s/%d (day=%d/%d, week=%d/%d)",
                     strategy, symbol, tf,
                     dayCount+1, ExploreMaxPerSlicePerDay,
                     weekCount+1, ExploreMaxPerSlice));

    telemetry.Event("explore_allow", sliceKey, dayCount+1, weekCount+1);

    return true;
}
```

### Resetting Caps

**Manual Reset**:
Delete counter files:
```powershell
Remove-Item "C:\Users\<you>\AppData\Roaming\MetaQuotes\Terminal\Common\Files\DualEA\explore_counts*.csv"
```

**Automatic Reset**:
- Daily: Midnight (00:00 server time)
- Weekly: Monday 00:00

### Interaction with NoConstraintsMode

When `NoConstraintsMode=true`:
- **Insights gating**: BYPASSED
- **Exploration caps**: BYPASSED (trades allowed regardless of caps)
- **Exploration counters**: Still incremented for telemetry

---

## Troubleshooting

### Issue: Policy fallback always triggering

**Symptoms**:
```
FALLBACK: no policy loaded -> neutral scaling for ADXStrategy on EURUSD/H1 demo=true
```

**Diagnosis**:
1. Check if `policy.json` exists:
   ```powershell
   Test-Path "C:\Users\<you>\AppData\Roaming\MetaQuotes\Terminal\Common\Files\DualEA\policy.json"
   ```

2. Check policy load logs (enable `DebugPolicy=true`):
   ```
   Policy loaded: 0 slices, min_confidence=0.00
   ```

3. Verify policy.json content (the command counts the slices):
   ```powershell
   Get-Content "...\DualEA\policy.json" -Raw | ConvertFrom-Json | Select-Object -ExpandProperty slices | Measure-Object
   ```

**Solutions**:
- Run `ML/policy_export.py` to generate policy
- Copy policy.json to Common Files
- Verify JSON is well-formed (no parse errors)

### Issue: Policy slice missing fallback

**Symptoms**:
```
FALLBACK: policy slice missing -> neutral scaling for BollAverages on GBPUSD/H1 demo=true
```

**Diagnosis**:
1. Check policy slices:
   ```powershell
   Get-Content "...\policy.json" -Raw | ConvertFrom-Json | Select-Object -ExpandProperty slices |
       Where-Object {$_.strategy -eq "BollAverages" -and $_.symbol -eq "GBPUSD" -and $_.timeframe -eq 60}
   ```

2. No result → slice missing from policy

**Solutions**:
- Collect more training data for this slice
- Re-train model with broader coverage
- Accept fallback for new slices (data collection mode)

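
The same two checks (slice count and slice lookup) can be run from Python, which may be convenient next to `ML/policy_export.py`. The helpers below are hypothetical and assume the slice keys shown in the schema example (`strategy`, `symbol`, `timeframe`). Pass them the text of `policy.json`.

```python
import json


def summarize_policy(policy_text):
    """Return (slice_count, min_confidence) from raw policy.json text."""
    policy = json.loads(policy_text)  # raises json.JSONDecodeError on malformed JSON
    return len(policy.get("slices", [])), policy.get("min_confidence")


def find_slice(policy_text, strategy, symbol, timeframe):
    """Return the matching slice dict, or None when the policy has no such slice."""
    policy = json.loads(policy_text)
    wanted = (strategy, symbol, timeframe)
    for s in policy.get("slices", []):
        if (s.get("strategy"), s.get("symbol"), s.get("timeframe")) == wanted:
            return s
    return None
```

A slice count of 0 points to the "no policy loaded" fallback, while a `None` result from `find_slice` corresponds to the "policy slice missing" fallback.
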
### Issue: Exploration caps reached immediately

**Symptoms**:
```
GATE: blocked ADXStrategy on EURUSD/H1 reason=explore_cap_day (day=2/2, week=2/3)
```

**Diagnosis**:
1. Check counters:
   ```powershell
   Import-Csv "...\explore_counts_day.csv" | Where-Object {$_.key -like "*ADXStrategy|EURUSD|60*"}
   ```

2. Verify caps not too restrictive:
   ```cpp
   ExploreMaxPerSlicePerDay = 2  // Very conservative
   ```

**Solutions**:
- Adjust caps to suit the environment (e.g., 5/10 for PaperEA, 1/2 for LiveEA)
- Reset counters manually (delete CSV files)
- Use `NoConstraintsMode=true` for initial data collection (bypasses all caps)

### Issue: Exploration not triggering

**Symptoms**:
```
GATE: blocked ADXStrategy on EURUSD/H1 reason=below_winrate (WR=0.42 < 0.50)
```
(Slice exists but fails threshold, no exploration bypass)

**Diagnosis**:
Exploration only triggers when **slice missing entirely**, not when slice exists but fails thresholds.

**Solutions**:
- This is correct behavior (no-slice-only bypass)
- To allow trading despite low performance:
  - Lower insights thresholds temporarily
  - Use `NoConstraintsMode=true`
  - Delete slice from insights.json to force exploration
  - Wait for performance to improve

### Issue: NoConstraintsMode not working

**Symptoms**:
Still seeing gate blocks despite `NoConstraintsMode=true`

**Diagnosis**:
1. Verify setting applied:
   ```cpp
   if(NoConstraintsMode) Print("NoConstraintsMode is TRUE");
   ```

2. Check if using unified system:
   ```cpp
   input bool UseUnifiedSystem = true;  // Required for NoConstraintsMode
   ```

**Solutions**:
- Ensure `UseUnifiedSystem=true`
- Restart EA after changing `NoConstraintsMode`
- Check logs for shadow decisions: `[SHADOW] gate=... result=block (but allowing)`

---

**See Also:**
- [Configuration-Reference.md](Configuration-Reference.md) - Policy and exploration parameters
- [Execution-Pipeline.md](Execution-Pipeline.md) - When policy/exploration evaluated
- [Observability-Guide.md](Observability-Guide.md) - Policy/exploration telemetry events