# Policy & Exploration Guide: DualEA System

**ML policy gating, fallback modes, and exploration system**

---

## Table of Contents

1. [Policy System Overview](#policy-system-overview)
2. [Policy Structure](#policy-structure)
3. [Policy Gating Logic](#policy-gating-logic)
4. [Policy Fallback Modes](#policy-fallback-modes)
5. [Policy Scaling](#policy-scaling)
6. [Exploration Mode](#exploration-mode)
7. [Exploration Caps](#exploration-caps)
8. [Troubleshooting](#troubleshooting)

---

## Policy System Overview

### Purpose

The **policy system** allows ML models to influence trading decisions through:

- **Confidence gating**: Block low-confidence predictions
- **Parameter scaling**: Adjust SL/TP/lots based on model output
- **Risk management**: Dynamic position sizing based on predicted outcomes

### Files

**policy.json**: ML-generated trading policy

- Path: `Common/Files/DualEA/policy.json`
- Generated by: `ML/policy_export.py`
- Structure: Per-slice (strategy|symbol|timeframe) probabilities and scaling multipliers

**policy.backup.json**: Last-known-good policy

- Used for rollback on parse/plausibility failures
- Updated on successful policy load

**policy.reload**: Trigger file for hot-reload

- PaperEA_v2: Hot-reload supported via `policy.reload` + HTTP polling
- LiveEA: Load on init only (restart required for updates)

---

## Policy Structure

### JSON Schema

```json
{
  "version": "1.0",
  "generated_at": "2025-01-15T10:00:00Z",
  "model_hash": "abc123...",
  "train_window": {"start": "2024-01-01", "end": "2025-01-15"},
  "metrics": {
    "roc_auc": 0.72,
    "brier_score": 0.18,
    "expected_r": 0.65
  },
  "min_confidence": 0.55,
  "slices": [
    {
      "strategy": "ADXStrategy",
      "symbol": "EURUSD",
      "timeframe": 60,
      "probability": 0.75,
      "sl_mult": 1.0,
      "tp_mult": 1.2,
      "lot_mult": 1.0,
      "trail_mult": 1.0
    },
    ...
  ]
}
```

### Fields

**Global**:

- `version`: Schema version
- `generated_at`: Timestamp of policy generation
- `model_hash`: Trained model identifier for provenance
- `train_window`: Data range used for training
- `metrics`: Model performance metrics
- `min_confidence`: Global minimum confidence threshold

**Per-Slice** (as implemented in LiveEA; PaperEA_v2 uses minimal parsing):

- `strategy`: Strategy name (e.g., "ADXStrategy")
- `symbol`: Trading symbol (e.g., "EURUSD")
- `timeframe`: Timeframe in minutes (e.g., 60 for H1)
- `p_win`: ML model confidence (0.0-1.0)
- `sl_scale`: Stop loss scaling multiplier (optional)
- `tp_scale`: Take profit scaling multiplier (optional)
- `trail_atr_mult`: Trailing stop ATR multiplier (optional)
- `confidence`: Alternative confidence field (optional)

> **Note:** The schema example above uses `probability`, `sl_mult`, `tp_mult`, `lot_mult`, and `trail_mult`, while the per-slice names listed here differ. Check the output of `ML/policy_export.py` to confirm which names your policy file uses.

---

## Policy Gating Logic

### Configuration

```cpp
input bool UsePolicyGating = true;          // Enable ML policy gating
input bool DefaultPolicyFallback = true;    // Enable fallback modes
input bool FallbackDemoOnly = true;         // Restrict fallback to demo
input bool FallbackWhenNoPolicy = true;     // Fallback if policy not loaded
input bool FallbackWhenSliceMissing = true; // Fallback if slice missing
```

### Decision Flow

```
Is UsePolicyGating=true?
  No  → PASS (no policy gating)
  Yes → Continue

Is policy loaded (slices > 0)?
  No  → Check FallbackWhenNoPolicy
          Yes → Check FallbackDemoOnly
                  Demo account → FALLBACK (neutral scaling)
                  Live account → BLOCK
          No  → BLOCK (reason: "no policy loaded")
  Yes → Continue

Does exact slice exist (strategy|symbol|TF)?
  No  → Check FallbackWhenSliceMissing
          Yes → Check FallbackDemoOnly
                  Demo account → FALLBACK (neutral scaling)
                  Live account → BLOCK
          No  → BLOCK (reason: "policy slice missing")
  Yes → Continue

Is slice.probability >= policy.min_confidence?
  No  → BLOCK (reason: "confidence too low")
  Yes → PASS + Apply scaling
```

### Code Example

```cpp
bool ApplyPolicyGating(SignalData &signal, string &reason) {
    if(!UsePolicyGating) return true;

    // Check if policy loaded
    if(policyData.slices == 0) {
        return HandleFallback("no_policy", signal, reason);
    }

    // Find policy slice
    PolicySlice slice = policyData.Find(
        signal.strategy_name,
        symbol,
        timeframe
    );

    if(slice.probability < 0) { // Slice not found
        return HandleFallback("slice_missing", signal, reason);
    }

    // Check confidence threshold
    if(slice.probability < policyData.min_confidence) {
        reason = StringFormat("confidence %.2f < %.2f",
                              slice.probability, policyData.min_confidence);
        return false;
    }

    // Apply policy scaling
    ApplyPolicyScaling(signal, slice);
    return true;
}
```

---

## Policy Fallback Modes

### Fallback: No Policy

**Triggers when**:

- `UsePolicyGating=true`
- Policy file not loaded OR `policy.slices == 0`
- `FallbackWhenNoPolicy=true`

**Behavior**:

- Check `FallbackDemoOnly`:
  - If `true` and demo account → ALLOW with neutral scaling
  - If `true` and live account → BLOCK
  - If `false` → ALLOW with neutral scaling (any account)
- Bypasses insights thresholds
- Bypasses exploration caps
- No SL/TP/lot/trail multipliers applied

**Log Example**:

```
FALLBACK: no policy loaded -> neutral scaling used for ADXStrategy on EURUSD/H1 demo=true
```

**Telemetry**:

```csv
policy_fallback,EURUSD,60,ADXStrategy,no_policy,true,true
```

### Fallback: Slice Missing

**Triggers when**:

- `UsePolicyGating=true`
- Policy IS loaded (`policy.slices > 0`)
- Exact strategy|symbol|timeframe slice not found in policy
- `FallbackWhenSliceMissing=true`

**Behavior**:

- Same as "No Policy" fallback
- Allows trade with neutral scaling
- Demo-only restriction if `FallbackDemoOnly=true`

**Log Example**:

```
FALLBACK: policy slice missing -> neutral scaling used for BollAverages on GBPUSD/H1 demo=true
```

### Neutral Scaling

**Definition**: No adjustments are applied; the base parameters are used.
```cpp
sl_mult    = 1.0;
tp_mult    = 1.0;
lot_mult   = 1.0;
trail_mult = 1.0;
```

Trade proceeds with:

- Original SL/TP from strategy
- Original lot size from risk calculation
- Original trailing stop settings (if enabled)

### Fallback Safety

**Why demo-only by default?**

- Fallback is permissive (allows trades without ML validation)
- Intended for data collection, not production live trading
- In live, you want explicit policy coverage for all slices

**When to disable FallbackDemoOnly?**

- After verifying policy coverage is comprehensive
- When transitioning from demo to live with the same strategy set
- With explicit risk acceptance of trading without ML guidance

**Best Practice**:

- Keep `FallbackDemoOnly=true` until the policy is proven
- Monitor `policy_fallback` telemetry events
- Aim to eliminate fallbacks by improving policy coverage

---

## Policy Scaling

### Policy Scaling (LiveEA Only)

> **Note:** Full policy scaling with per-slice multipliers is implemented in LiveEA. PaperEA_v2 has minimal policy parsing (checks `min_confidence` only).
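As a rough illustration of how the fallback rules and the multipliers fit together, the sketch below models the demo-only fallback decision and the distance-based scaling arithmetic in plain C++. It is not the EA's code: `Multipliers`, `FallbackAllowed`, and `ScaleLevel` are hypothetical names introduced here, and `fallbackEnabled` is assumed to stand for whichever switch applies (`FallbackWhenNoPolicy` or `FallbackWhenSliceMissing`).

```cpp
#include <cassert>

// Hypothetical container; field names follow the JSON scaling examples in this guide.
struct Multipliers {
    double sl_mult    = 1.0;
    double tp_mult    = 1.0;
    double lot_mult   = 1.0;
    double trail_mult = 1.0;  // all defaults together form neutral scaling
};

// Fallback rule as described in this guide (assumed reading of the flags).
bool FallbackAllowed(bool fallbackEnabled, bool demoOnly, bool isDemoAccount) {
    if (!fallbackEnabled) return false;  // fallback switched off: block
    return !demoOnly || isDemoAccount;   // demo-only restriction blocks live accounts
}

// Scales the signed distance between the entry price and a price level
// (SL or TP), so the level stays on the same side of the entry.
double ScaleLevel(double entry, double level, double mult) {
    assert(mult > 0.0);  // a non-positive multiplier would collapse or flip the level
    return entry + (level - entry) * mult;
}
```

For a buy at 1.1000 with a stop at 1.0950, a multiplier of 0.8 moves the stop to about 1.0960, and a multiplier of 1.0 (neutral scaling) leaves it where the strategy placed it.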
**Scaling Application (LiveEA):**

```cpp
void ApplyPolicyScaling(SignalData &signal, PolicySlice &slice) {
    // SL scaling
    double slDistance = MathAbs(signal.entry_price - signal.stop_loss);
    if(signal.direction == 1) { // Buy
        signal.stop_loss = signal.entry_price - (slDistance * slice.sl_scale);
    } else { // Sell
        signal.stop_loss = signal.entry_price + (slDistance * slice.sl_scale);
    }

    // TP scaling
    double tpDistance = MathAbs(signal.take_profit - signal.entry_price);
    if(signal.direction == 1) { // Buy
        signal.take_profit = signal.entry_price + (tpDistance * slice.tp_scale);
    } else { // Sell
        signal.take_profit = signal.entry_price - (tpDistance * slice.tp_scale);
    }

    // Trailing scaling (if enabled)
    if(TrailEnabled && slice.trail_atr_mult > 0) {
        // Apply trail ATR multiplier
    }
}
```

### Scaling Examples

**Conservative Scaling** (low confidence):

```json
{
  "probability": 0.60,
  "sl_mult": 0.8,    // Tighter stop
  "tp_mult": 1.5,    // Wider target (better RR)
  "lot_mult": 0.5,   // Smaller position
  "trail_mult": 0.9  // Tighter trail
}
```

**Aggressive Scaling** (high confidence):

```json
{
  "probability": 0.85,
  "sl_mult": 1.2,    // Wider stop (more breathing room)
  "tp_mult": 0.8,    // Closer target (take profits faster)
  "lot_mult": 1.5,   // Larger position
  "trail_mult": 1.0  // Normal trail
}
```

**Neutral Scaling** (fallback):

```json
{
  "probability": 0.0,
  "sl_mult": 1.0,
  "tp_mult": 1.0,
  "lot_mult": 1.0,
  "trail_mult": 1.0
}
```

---

## Exploration Mode

### Purpose

**Bootstrap insights** for strategy|symbol|timeframe slices that lack sufficient historical data.

Without exploration:

- New slices are blocked by insights gating (no data → can't pass thresholds)
- Chicken-and-egg problem: can't trade → can't collect data → can't build insights

With exploration:

- Limited trades are allowed for no-data slices
- Caps prevent excessive exposure
- Data collected → insights built → insights gating takes over

### When Exploration Triggers

**Conditions**:

1. `ExploreOnNoSlice=true`
2. Insights gating is enabled (`UseInsightsGating=true`)
3. No slice exists in insights.json for this strategy|symbol|timeframe
4. Exploration caps not exceeded

**Important**: The exploration bypass applies ONLY when the slice is truly missing. If the slice exists but fails thresholds (e.g., low win rate), the trade is BLOCKED (no bypass).

### Configuration

```cpp
input bool ExploreOnNoSlice = true;         // Enable exploration when no slice exists
input int  ExploreMaxPerSlicePerDay = 100;  // Daily cap per slice (default 100)
input int  ExploreMaxPerSlice = 100;        // Weekly cap per slice (default 100)
```

> **Note:** Previous documentation listed defaults of 2/3. Actual code defaults are 100/100, effectively unlimited for most practical purposes. The `UseExploration` input does not exist; exploration is controlled via `ExploreOnNoSlice`.

---

## Exploration Caps

### Cap Types

**Daily Cap**: `ExploreMaxPerSlicePerDay`

- Resets at midnight (00:00 server time)
- Per-slice basis (each strategy|symbol|TF tracked separately)
- Default: 100 trades/day/slice

**Weekly Cap**: `ExploreMaxPerSlice`

- Resets on Monday
- Week bucket = Monday of the week (yyyymmdd format)
- Default: 100 trades/week/slice

### Counter Persistence

**Files**:

- `Common/Files/DualEA/explore_counts_day.csv`
- `Common/Files/DualEA/explore_counts.csv`

**Format**:

```csv
key,date_yyyymmdd,count
ADXStrategy|EURUSD|60,20250115,2
BollAverages|GBPUSD|60,20250115,1
```

For weekly:

```csv
key,week_monday_yyyymmdd,count
ADXStrategy|EURUSD|60,20250113,3
```

### Cap Checking

```cpp
bool CheckExplorationCaps(string strategy, string symbol, int tf, string &reason) {
    string sliceKey = strategy + "|" + symbol + "|" + IntegerToString(tf);

    // Load counters
    int dayCount = LoadExploreCountDay(sliceKey);
    int weekCount = LoadExploreCountWeek(sliceKey);

    // Check daily cap
    if(ExploreMaxPerSlicePerDay > 0 && dayCount >= ExploreMaxPerSlicePerDay) {
        reason = StringFormat("explore_cap_day (day=%d/%d, week=%d/%d)",
                              dayCount, ExploreMaxPerSlicePerDay,
                              weekCount, ExploreMaxPerSlice);
        telemetry.Event("explore_block_day", sliceKey, dayCount, weekCount);
        return false;
    }

    // Check weekly cap
    if(ExploreMaxPerSlice > 0 && weekCount >= ExploreMaxPerSlice) {
        reason = StringFormat("explore_cap_week (day=%d/%d, week=%d/%d)",
                              dayCount, ExploreMaxPerSlicePerDay,
                              weekCount, ExploreMaxPerSlice);
        telemetry.Event("explore_block_week", sliceKey, dayCount, weekCount);
        return false;
    }

    // Increment counters
    IncrementExploreCountDay(sliceKey);
    IncrementExploreCountWeek(sliceKey);

    // Allow exploration
    Log(StringFormat("GATE: explore allow %s on %s/%d (day=%d/%d, week=%d/%d)",
                     strategy, symbol, tf,
                     dayCount+1, ExploreMaxPerSlicePerDay,
                     weekCount+1, ExploreMaxPerSlice));
    telemetry.Event("explore_allow", sliceKey, dayCount+1, weekCount+1);
    return true;
}
```

### Resetting Caps

**Manual Reset**: Delete counter files:

```powershell
Remove-Item "C:\Users\\AppData\Roaming\MetaQuotes\Terminal\Common\Files\DualEA\explore_counts*.csv"
```

**Automatic Reset**:

- Daily: Midnight (00:00 server time)
- Weekly: Monday 00:00

### Interaction with NoConstraintsMode

When `NoConstraintsMode=true`:

- **Insights gating**: BYPASSED
- **Exploration caps**: BYPASSED (trades allowed regardless of caps)
- **Exploration counters**: Still incremented for telemetry

---

## Troubleshooting

### Issue: Policy fallback always triggering

**Symptoms**:

```
FALLBACK: no policy loaded -> neutral scaling for ADXStrategy on EURUSD/H1 demo=true
```

**Diagnosis**:

1. Check if `policy.json` exists:

   ```powershell
   Test-Path "C:\Users\\AppData\Roaming\MetaQuotes\Terminal\Common\Files\DualEA\policy.json"
   ```

2. Check policy load logs (enable `DebugPolicy=true`):

   ```
   Policy loaded: 0 slices, min_confidence=0.00
   ```

3. Verify policy.json content:

   ```powershell
   Get-Content "...\DualEA\policy.json" | ConvertFrom-Json | Select-Object -ExpandProperty slices | Measure-Object
   ```

**Solutions**:

- Run `ML/policy_export.py` to generate policy
- Copy policy.json to Common Files
- Verify JSON is well-formed (no parse errors)

### Issue: Policy slice missing fallback

**Symptoms**:

```
FALLBACK: policy slice missing -> neutral scaling for BollAverages on GBPUSD/H1 demo=true
```

**Diagnosis**:

1. Check policy slices:

   ```powershell
   Get-Content "...\policy.json" | ConvertFrom-Json | Select-Object -ExpandProperty slices | Where-Object {$_.strategy -eq "BollAverages" -and $_.symbol -eq "GBPUSD" -and $_.timeframe -eq 60}
   ```

2. No result → slice missing from policy

**Solutions**:

- Collect more training data for this slice
- Re-train model with broader coverage
- Accept fallback for new slices (data collection mode)

### Issue: Exploration caps reached immediately

**Symptoms**:

```
GATE: blocked ADXStrategy on EURUSD/H1 reason=explore_cap_day (day=2/2, week=2/3)
```

**Diagnosis**:

1. Check counters:

   ```powershell
   Import-Csv "...\explore_counts_day.csv" | Where-Object {$_.key -like "*ADXStrategy|EURUSD|60*"}
   ```

2. Verify caps not too restrictive:

   ```cpp
   ExploreMaxPerSlicePerDay = 2 // Very conservative
   ```

**Solutions**:

- Increase caps (e.g., 5/10 for PaperEA, 1/2 for LiveEA)
- Reset counters manually (delete CSV files)
- Use `NoConstraintsMode=true` for initial data collection (bypasses all caps)

### Issue: Exploration not triggering

**Symptoms**:

```
GATE: blocked ADXStrategy on EURUSD/H1 reason=below_winrate (WR=0.42 < 0.50)
```

(Slice exists but fails threshold, so there is no exploration bypass.)

**Diagnosis**: Exploration only triggers when the **slice is missing entirely**, not when the slice exists but fails thresholds.
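The no-slice-only rule can be sketched as a small decision function. This is an illustrative C++ sketch, not the EA's implementation: `GateResult`, `SliceState`, and `InsightsGate` are hypothetical names, cap checking is reduced to a single flag, and `NoConstraintsMode` is left out.

```cpp
#include <cassert>

enum class GateResult { Pass, Explore, Block };

// Hypothetical view of one strategy|symbol|timeframe slice.
struct SliceState {
    bool sliceExists;       // slice is present in insights.json
    bool passesThresholds;  // only meaningful when sliceExists is true
    bool capsExceeded;      // daily or weekly exploration cap reached
};

GateResult InsightsGate(const SliceState &s, bool exploreOnNoSlice) {
    // A missing slice cannot have passed any thresholds.
    assert(s.sliceExists || !s.passesThresholds);

    if (s.sliceExists) {
        // Existing slices are judged on thresholds alone; no exploration bypass.
        return s.passesThresholds ? GateResult::Pass : GateResult::Block;
    }
    // Missing slice: exploration is the only way through, subject to caps.
    if (exploreOnNoSlice && !s.capsExceeded) return GateResult::Explore;
    return GateResult::Block;
}
```

In this sketch, the symptom above corresponds to `sliceExists = true` with `passesThresholds = false`, which returns `Block` regardless of the exploration settings.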
**Solutions**:

- This is correct behavior (no-slice-only bypass)
- To allow trading despite low performance:
  - Lower insights thresholds temporarily
  - Use `NoConstraintsMode=true`
  - Delete slice from insights.json to force exploration
- Wait for performance to improve

### Issue: NoConstraintsMode not working

**Symptoms**: Still seeing gate blocks despite `NoConstraintsMode=true`

**Diagnosis**:

1. Verify setting applied:

   ```cpp
   if(NoConstraintsMode) Print("NoConstraintsMode is TRUE");
   ```

2. Check if using unified system:

   ```cpp
   input bool UseUnifiedSystem = true; // Required for NoConstraintsMode
   ```

**Solutions**:

- Ensure `UseUnifiedSystem=true`
- Restart EA after changing `NoConstraintsMode`
- Check logs for shadow decisions: `[SHADOW] gate=... result=block (but allowing)`

---

**See Also:**

- [Configuration-Reference.md](Configuration-Reference.md) - Policy and exploration parameters
- [Execution-Pipeline.md](Execution-Pipeline.md) - When policy/exploration evaluated
- [Observability-Guide.md](Observability-Guide.md) - Policy/exploration telemetry events