Policy & Exploration Guide — DualEA System
ML policy gating, fallback modes, and exploration system
Table of Contents
- Policy System Overview
- Policy Structure
- Policy Gating Logic
- Policy Fallback Modes
- Policy Scaling
- Exploration Mode
- Exploration Caps
- Troubleshooting
Policy System Overview
Purpose
The policy system allows ML models to influence trading decisions through:
- Confidence gating: Block low-confidence predictions
- Parameter scaling: Adjust SL/TP/lots based on model output
- Risk management: Dynamic position sizing based on predicted outcomes
Files
policy.json: ML-generated trading policy
- Path: Common/Files/DualEA/policy.json
- Generated by: ML/policy_export.py
- Structure: Per-slice (strategy|symbol|timeframe) probabilities and scaling multipliers
policy.backup.json: Last-known-good policy
- Used for rollback on parse/plausibility failures
- Updated on successful policy load
policy.reload: Trigger file for hot-reload
- PaperEA_v2: Hot-reload supported via policy.reload + HTTP polling
- LiveEA: Load on init only (restart required for updates)
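The rollback role of policy.backup.json can be illustrated with a minimal Python sketch (Python being the language of ML/policy_export.py). This is not the EA's loader: the function names and the plausibility rule (a non-empty slices list and a min_confidence within 0.0-1.0) are assumptions made for illustration.

```python
import json
import shutil
from pathlib import Path


def is_plausible(policy):
    """Hypothetical plausibility rule: a slice list and a sane threshold."""
    slices = policy.get("slices")
    conf = policy.get("min_confidence")
    return (
        isinstance(slices, list)
        and len(slices) > 0
        and isinstance(conf, (int, float))
        and 0.0 <= conf <= 1.0
    )


def load_policy_with_rollback(policy_path, backup_path):
    """Return (policy, source); source is 'primary', 'backup', or 'none'."""
    policy_path, backup_path = Path(policy_path), Path(backup_path)
    try:
        policy = json.loads(policy_path.read_text(encoding="utf-8"))
        if not isinstance(policy, dict) or not is_plausible(policy):
            raise ValueError("policy failed plausibility check")
    except (OSError, ValueError):
        # Parse or plausibility failure: use the last-known-good copy.
        try:
            backup = json.loads(backup_path.read_text(encoding="utf-8"))
            return backup, "backup"
        except (OSError, ValueError):
            return None, "none"
    # Successful load: refresh the last-known-good copy.
    shutil.copyfile(policy_path, backup_path)
    return policy, "primary"
```

The backup is only overwritten after the primary file has passed both checks, so a bad export cannot replace the last-known-good copy.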
Policy Structure
JSON Schema
{
"version": "1.0",
"generated_at": "2025-01-15T10:00:00Z",
"model_hash": "abc123...",
"train_window": {"start": "2024-01-01", "end": "2025-01-15"},
"metrics": {
"roc_auc": 0.72,
"brier_score": 0.18,
"expected_r": 0.65
},
"min_confidence": 0.55,
"slices": [
{
"strategy": "ADXStrategy",
"symbol": "EURUSD",
"timeframe": 60,
"probability": 0.75,
"sl_mult": 1.0,
"tp_mult": 1.2,
"lot_mult": 1.0,
"trail_mult": 1.0
},
...
]
}
Fields
Global:
- version: Schema version
- generated_at: Timestamp of policy generation
- model_hash: Trained model identifier for provenance
- train_window: Data range used for training
- metrics: Model performance metrics
- min_confidence: Global minimum confidence threshold
Per-Slice (as implemented in LiveEA; PaperEA_v2 uses minimal parsing):
- strategy: Strategy name (e.g., "ADXStrategy")
- symbol: Trading symbol (e.g., "EURUSD")
- timeframe: Timeframe in minutes (e.g., 60 for H1)
- p_win: ML model confidence (0.0-1.0)
- sl_scale: Stop loss scaling multiplier (optional)
- tp_scale: Take profit scaling multiplier (optional)
- trail_atr_mult: Trailing stop ATR multiplier (optional)
- confidence: Alternative confidence field (optional)
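Slices are identified by the strategy|symbol|timeframe triple, so a lookup table keyed on that triple is a natural way to consume the file. The Python sketch below is illustrative only. The helper names are hypothetical, and because the schema example uses probability while the LiveEA field list names p_win and confidence, the sketch assumes any of the three may carry the confidence value.

```python
def slice_key(strategy, symbol, timeframe):
    """Build the 'strategy|symbol|timeframe' key, e.g. 'ADXStrategy|EURUSD|60'."""
    return f"{strategy}|{symbol}|{int(timeframe)}"


def slice_confidence(s):
    """Return the first confidence-like field present, or None.

    Assumption: 'probability', 'p_win', and 'confidence' are treated as
    interchangeable here; the real parsers may be stricter.
    """
    for name in ("probability", "p_win", "confidence"):
        if name in s:
            return float(s[name])
    return None


def index_slices(policy):
    """Map slice key -> slice dict for direct lookups."""
    return {
        slice_key(s["strategy"], s["symbol"], s["timeframe"]): s
        for s in policy.get("slices", [])
    }


policy = {
    "min_confidence": 0.55,
    "slices": [
        {"strategy": "ADXStrategy", "symbol": "EURUSD", "timeframe": 60,
         "probability": 0.75, "sl_mult": 1.0, "tp_mult": 1.2},
    ],
}
index = index_slices(policy)
hit = index.get(slice_key("ADXStrategy", "EURUSD", 60))
miss = index.get(slice_key("BollAverages", "GBPUSD", 60))
print(slice_confidence(hit), miss)  # 0.75 None
```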
Policy Gating Logic
Configuration
input bool UsePolicyGating = true; // Enable ML policy gating
input bool DefaultPolicyFallback = true; // Enable fallback modes
input bool FallbackDemoOnly = true; // Restrict fallback to demo
input bool FallbackWhenNoPolicy = true; // Fallback if policy not loaded
input bool FallbackWhenSliceMissing = true; // Fallback if slice missing
Decision Flow
Is UsePolicyGating=true?
  No → PASS (no policy gating)
  Yes → Continue
Is policy loaded (slices > 0)?
  No → Check FallbackWhenNoPolicy
    Yes → Check FallbackDemoOnly
      Demo account → FALLBACK (neutral scaling)
      Live account → BLOCK if FallbackDemoOnly=true, otherwise FALLBACK (neutral scaling)
    No → BLOCK (reason: "no policy loaded")
  Yes → Continue
Does exact slice exist (strategy|symbol|TF)?
  No → Check FallbackWhenSliceMissing
    Yes → Check FallbackDemoOnly
      Demo account → FALLBACK (neutral scaling)
      Live account → BLOCK if FallbackDemoOnly=true, otherwise FALLBACK (neutral scaling)
    No → BLOCK (reason: "policy slice missing")
  Yes → Continue
Is slice.probability >= policy.min_confidence?
  No → BLOCK (reason: "confidence too low")
  Yes → PASS + Apply scaling
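The flow above can be expressed as a pure function, which makes the branch outcomes easy to check in isolation. This is a Python sketch and not the EA code. The function name, argument names, and return labels are assumptions, and the DefaultPolicyFallback input is not modelled because the flow does not show where it applies.

```python
def gate(use_policy_gating, slice_count, slice_probability, min_confidence,
         is_demo, fallback_no_policy=True, fallback_slice_missing=True,
         fallback_demo_only=True):
    """Return (decision, reason); decision is 'PASS', 'FALLBACK', or 'BLOCK'.

    slice_probability is None when no exact slice matches.
    """
    def fallback(enabled, reason):
        # Fallback must be enabled, and demo-only mode requires a demo account.
        if enabled and (is_demo or not fallback_demo_only):
            return "FALLBACK", reason
        return "BLOCK", reason

    if not use_policy_gating:
        return "PASS", "policy gating disabled"
    if slice_count <= 0:
        return fallback(fallback_no_policy, "no policy loaded")
    if slice_probability is None:
        return fallback(fallback_slice_missing, "policy slice missing")
    if slice_probability < min_confidence:
        return "BLOCK", "confidence too low"
    return "PASS", "apply scaling"


print(gate(True, 0, None, 0.55, is_demo=True))   # ('FALLBACK', 'no policy loaded')
print(gate(True, 0, None, 0.55, is_demo=False))  # ('BLOCK', 'no policy loaded')
print(gate(True, 3, 0.50, 0.55, is_demo=True))   # ('BLOCK', 'confidence too low')
```

With the default settings, a live account never reaches the fallback branch, which matches the demo-only intent of FallbackDemoOnly.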
Code Example
bool ApplyPolicyGating(SignalData &signal, string &reason) {
if(!UsePolicyGating) return true;
// Check if policy loaded
if(policyData.slices == 0) {
return HandleFallback("no_policy", signal, reason);
}
// Find policy slice
PolicySlice slice = policyData.Find(
signal.strategy_name,
symbol,
timeframe
);
if(slice.probability < 0) { // Slice not found
return HandleFallback("slice_missing", signal, reason);
}
// Check confidence threshold
if(slice.probability < policyData.min_confidence) {
reason = StringFormat("confidence %.2f < %.2f",
slice.probability, policyData.min_confidence);
return false;
}
// Apply policy scaling
ApplyPolicyScaling(signal, slice);
return true;
}
Policy Fallback Modes
Fallback: No Policy
Triggers when:
- UsePolicyGating=true
- Policy file not loaded OR policy.slices == 0
- FallbackWhenNoPolicy=true
Behavior:
- Check FallbackDemoOnly:
  - If true and demo account → ALLOW with neutral scaling
  - If true and live account → BLOCK
  - If false → ALLOW with neutral scaling (any account)
- Bypasses insights thresholds
- Bypasses exploration caps
- No SL/TP/lot/trail multipliers applied
Log Example:
FALLBACK: no policy loaded -> neutral scaling used for ADXStrategy on EURUSD/H1 demo=true
Telemetry:
policy_fallback,EURUSD,60,ADXStrategy,no_policy,true,true
Fallback: Slice Missing
Triggers when:
- UsePolicyGating=true
- Policy IS loaded (policy.slices > 0)
- Exact strategy|symbol|timeframe slice not found in policy
- FallbackWhenSliceMissing=true
Behavior:
- Same as "No Policy" fallback
- Allows trade with neutral scaling
- Demo-only restriction if FallbackDemoOnly=true
Log Example:
FALLBACK: policy slice missing -> neutral scaling used for BollAverages on GBPUSD/H1 demo=true
Neutral Scaling
Definition: No adjustments applied, use base parameters.
sl_mult = 1.0
tp_mult = 1.0
lot_mult = 1.0
trail_mult = 1.0
Trade proceeds with:
- Original SL/TP from strategy
- Original lot size from risk calculation
- Original trailing stop settings (if enabled)
Fallback Safety
Why demo-only by default?
- Fallback is permissive (allows trades without ML validation)
- Intended for data collection, not production live trading
- In live, you want explicit policy coverage for all slices
When to disable FallbackDemoOnly?
- After verifying policy coverage is comprehensive
- When transitioning from demo to live with same strategy set
- With explicit risk acceptance of trading without ML guidance
Best Practice:
- Keep FallbackDemoOnly=true until the policy is proven
- Monitor policy_fallback telemetry events
- Aim to eliminate fallbacks by improving policy coverage
Policy Scaling
Policy Scaling (LiveEA Only)
Note: Full policy scaling with per-slice multipliers is implemented in LiveEA. PaperEA_v2 has minimal policy parsing (checks min_confidence only).
Scaling Application (LiveEA):
void ApplyPolicyScaling(SignalData &signal, PolicySlice &slice) {
// SL scaling
double slDistance = MathAbs(signal.entry_price - signal.stop_loss);
if(signal.direction == 1) { // Buy
signal.stop_loss = signal.entry_price - (slDistance * slice.sl_scale);
} else { // Sell
signal.stop_loss = signal.entry_price + (slDistance * slice.sl_scale);
}
// TP scaling
double tpDistance = MathAbs(signal.take_profit - signal.entry_price);
if(signal.direction == 1) { // Buy
signal.take_profit = signal.entry_price + (tpDistance * slice.tp_scale);
} else { // Sell
signal.take_profit = signal.entry_price - (tpDistance * slice.tp_scale);
}
// Trailing scaling (if enabled)
if(TrailEnabled && slice.trail_atr_mult > 0) {
// Apply trail ATR multiplier
}
}
Scaling Examples
Conservative Scaling (low confidence):
{
"probability": 0.60,
"sl_mult": 0.8, // Tighter stop
"tp_mult": 1.5, // Wider target (better RR)
"lot_mult": 0.5, // Smaller position
"trail_mult": 0.9 // Tighter trail
}
Aggressive Scaling (high confidence):
{
"probability": 0.85,
"sl_mult": 1.2, // Wider stop (more breathing room)
"tp_mult": 0.8, // Closer target (take profits faster)
"lot_mult": 1.5, // Larger position
"trail_mult": 1.0 // Normal trail
}
Neutral Scaling (fallback):
{
"probability": 0.0,
"sl_mult": 1.0,
"tp_mult": 1.0,
"lot_mult": 1.0,
"trail_mult": 1.0
}
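To make the arithmetic concrete, the Python sketch below applies the conservative multipliers (sl_mult 0.8, tp_mult 1.5) to a hypothetical EURUSD buy. The prices and the function name are invented for illustration; the distance-times-multiplier rule mirrors ApplyPolicyScaling, which names the same multipliers sl_scale and tp_scale.

```python
def scale_levels(entry, stop_loss, take_profit, direction, sl_mult, tp_mult):
    """Scale SL/TP distances around the entry price.

    direction: 1 for buy, anything else for sell (as in ApplyPolicyScaling).
    Returns (new_stop_loss, new_take_profit) rounded to 5 decimals.
    """
    sl_distance = abs(entry - stop_loss) * sl_mult
    tp_distance = abs(take_profit - entry) * tp_mult
    if direction == 1:  # buy: stop below entry, target above
        return round(entry - sl_distance, 5), round(entry + tp_distance, 5)
    # sell: stop above entry, target below
    return round(entry + sl_distance, 5), round(entry - tp_distance, 5)


# Buy at 1.1000 with a 50-pip stop and a 100-pip target.
print(scale_levels(1.1000, 1.0950, 1.1100, 1, 0.8, 1.5))  # (1.096, 1.115)
```

The stop tightens from 50 to 40 pips and the target widens from 100 to 150 pips. With neutral multipliers of 1.0 the levels come back unchanged.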
Exploration Mode
Purpose
Bootstrap insights for strategy|symbol|timeframe slices that lack sufficient historical data.
Without exploration:
- New slices blocked by insights gating (no data → can't pass thresholds)
- Chicken-and-egg problem: can't trade → can't collect data → can't build insights
With exploration:
- Limited trades allowed for no-data slices
- Caps prevent excessive exposure
- Data collected → insights built → insights gating takes over
When Exploration Triggers
Conditions:
- ExploreOnNoSlice=true
- Insights gating is enabled (UseInsightsGating=true)
- No slice exists in insights.json for this strategy|symbol|timeframe
- Exploration caps not exceeded
Important: The exploration bypass applies ONLY when the slice is truly missing. If the slice exists but fails thresholds (e.g., low win rate), the trade is BLOCKED (no bypass).
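The trigger conditions reduce to a small predicate. The Python sketch below is an illustration with hypothetical names, not the EA implementation. It assumes the set of slice keys present in insights.json has already been loaded and that the cap check has been evaluated separately.

```python
def exploration_allowed(explore_on_no_slice, use_insights_gating,
                        insights_keys, key, caps_ok):
    """True only when the slice is absent from insights and caps permit."""
    if not (explore_on_no_slice and use_insights_gating):
        return False
    if key in insights_keys:
        # Slice exists: the normal insights thresholds decide, no bypass.
        return False
    return caps_ok


known = {"ADXStrategy|EURUSD|60"}
print(exploration_allowed(True, True, known, "BollAverages|GBPUSD|60", True))  # True
print(exploration_allowed(True, True, known, "ADXStrategy|EURUSD|60", True))   # False
```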
Configuration
input bool ExploreOnNoSlice = true; // Enable exploration when no slice exists
input int ExploreMaxPerSlicePerDay = 100; // Daily cap per slice (default 100)
input int ExploreMaxPerSlice = 100; // Weekly cap per slice (default 100)
Note: Previous documentation listed defaults of 2/3. Actual code defaults are 100/100, effectively unlimited for most practical purposes. The UseExploration input does not exist; exploration is controlled via ExploreOnNoSlice.
Exploration Caps
Cap Types
Daily Cap: ExploreMaxPerSlicePerDay
- Resets at midnight (00:00 server time)
- Per-slice basis (each strategy|symbol|TF tracked separately)
- Default: 100 trades/day/slice
Weekly Cap: ExploreMaxPerSlice
- Resets on Monday
- Week bucket = Monday of the week (yyyymmdd format)
- Default: 100 trades/week/slice
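The two bucket keys can be derived from a date as in the Python sketch below. The function names are hypothetical, and the caller is assumed to pass the server-time date, since the caps reset on server time.

```python
from datetime import date, timedelta


def day_bucket(d):
    """Daily bucket in yyyymmdd format."""
    return d.strftime("%Y%m%d")


def week_bucket(d):
    """Weekly bucket: the Monday of d's week, in yyyymmdd format."""
    monday = d - timedelta(days=d.weekday())  # weekday(): Monday == 0
    return monday.strftime("%Y%m%d")


d = date(2025, 1, 15)  # a Wednesday
print(day_bucket(d), week_bucket(d))  # 20250115 20250113
```

Every date from Monday through Sunday maps to the same weekly key, so the weekly counter changes bucket only when a new Monday begins.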
Counter Persistence
Files:
- Common/Files/DualEA/explore_counts_day.csv
- Common/Files/DualEA/explore_counts.csv
Format:
key,date_yyyymmdd,count
ADXStrategy|EURUSD|60,20250115,2
BollAverages|GBPUSD|60,20250115,1
For weekly:
key,week_monday_yyyymmdd,count
ADXStrategy|EURUSD|60,20250113,3
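A counter lookup against this layout might look like the Python sketch below. It assumes the file starts with the header row shown above and that a slice with no row for the requested bucket counts as zero. Both are assumptions about the file, not confirmed behavior of the EA, and the function name is hypothetical.

```python
import csv
import io


def read_count(csv_text, key, bucket):
    """Return the count for (key, bucket), or 0 if no row matches.

    Assumes a header row followed by key,bucket,count rows.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) == 3 and row[0] == key and row[1] == bucket:
            return int(row[2])
    return 0


sample = """key,date_yyyymmdd,count
ADXStrategy|EURUSD|60,20250115,2
BollAverages|GBPUSD|60,20250115,1
"""
print(read_count(sample, "ADXStrategy|EURUSD|60", "20250115"))  # 2
print(read_count(sample, "ADXStrategy|EURUSD|60", "20250116"))  # 0
```

The slice key contains "|" but no commas, so the standard CSV reader splits each row into exactly three fields.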
Cap Checking
bool CheckExplorationCaps(string strategy, string symbol, int tf, string &reason) {
string sliceKey = strategy + "|" + symbol + "|" + IntegerToString(tf);
// Load counters
int dayCount = LoadExploreCountDay(sliceKey);
int weekCount = LoadExploreCountWeek(sliceKey);
// Check daily cap
if(ExploreMaxPerSlicePerDay > 0 && dayCount >= ExploreMaxPerSlicePerDay) {
reason = StringFormat("explore_cap_day (day=%d/%d, week=%d/%d)",
dayCount, ExploreMaxPerSlicePerDay,
weekCount, ExploreMaxPerSlice);
telemetry.Event("explore_block_day", sliceKey, dayCount, weekCount);
return false;
}
// Check weekly cap
if(ExploreMaxPerSlice > 0 && weekCount >= ExploreMaxPerSlice) {
reason = StringFormat("explore_cap_week (day=%d/%d, week=%d/%d)",
dayCount, ExploreMaxPerSlicePerDay,
weekCount, ExploreMaxPerSlice);
telemetry.Event("explore_block_week", sliceKey, dayCount, weekCount);
return false;
}
// Increment counters
IncrementExploreCountDay(sliceKey);
IncrementExploreCountWeek(sliceKey);
// Allow exploration
Log(StringFormat("GATE: explore allow %s on %s/%d (day=%d/%d, week=%d/%d)",
strategy, symbol, tf,
dayCount+1, ExploreMaxPerSlicePerDay,
weekCount+1, ExploreMaxPerSlice));
telemetry.Event("explore_allow", sliceKey, dayCount+1, weekCount+1);
return true;
}
Resetting Caps
Manual Reset: Delete counter files:
Remove-Item "C:\Users\<you>\AppData\Roaming\MetaQuotes\Terminal\Common\Files\DualEA\explore_counts*.csv"
Automatic Reset:
- Daily: Midnight (00:00 server time)
- Weekly: Monday 00:00
Interaction with NoConstraintsMode
When NoConstraintsMode=true:
- Insights gating: BYPASSED
- Exploration caps: BYPASSED (trades allowed regardless of caps)
- Exploration counters: Still incremented for telemetry
Troubleshooting
Issue: Policy fallback always triggering
Symptoms:
FALLBACK: no policy loaded -> neutral scaling for ADXStrategy on EURUSD/H1 demo=true
Diagnosis:
- Check if policy.json exists:
  Test-Path "C:\Users\<you>\AppData\Roaming\MetaQuotes\Terminal\Common\Files\DualEA\policy.json"
- Check policy load logs (enable DebugPolicy=true):
  Policy loaded: 0 slices, min_confidence=0.00
- Verify policy.json content:
  Get-Content "...\DualEA\policy.json" | ConvertFrom-Json | Select-Object -ExpandProperty slices | Measure-Object
Solutions:
- Run ML/policy_export.py to generate policy
- Copy policy.json to Common Files
- Verify JSON is well-formed (no parse errors)
Issue: Policy slice missing fallback
Symptoms:
FALLBACK: policy slice missing -> neutral scaling for BollAverages on GBPUSD/H1 demo=true
Diagnosis:
- Check policy slices:
  Get-Content "...\policy.json" | ConvertFrom-Json | Select-Object -ExpandProperty slices | Where-Object {$_.strategy -eq "BollAverages" -and $_.symbol -eq "GBPUSD" -and $_.timeframe -eq 60}
- No result → slice missing from policy
Solutions:
- Collect more training data for this slice
- Re-train model with broader coverage
- Accept fallback for new slices (data collection mode)
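Coverage gaps can also be found before deployment by comparing the policy against the slices you expect to trade. The Python sketch below is illustrative: the function name is hypothetical and the expected list would come from your own strategy configuration.

```python
def missing_slices(policy, expected):
    """Return expected (strategy, symbol, timeframe) tuples absent from policy."""
    present = {
        (s.get("strategy"), s.get("symbol"), s.get("timeframe"))
        for s in policy.get("slices", [])
    }
    return [e for e in expected if e not in present]


policy = {
    "slices": [
        {"strategy": "ADXStrategy", "symbol": "EURUSD", "timeframe": 60},
    ]
}
expected = [("ADXStrategy", "EURUSD", 60), ("BollAverages", "GBPUSD", 60)]
print(missing_slices(policy, expected))  # [('BollAverages', 'GBPUSD', 60)]
```

An empty result means every expected slice has explicit policy coverage, so the slice-missing fallback should not fire for those slices.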
Issue: Exploration caps reached immediately
Symptoms:
GATE: blocked ADXStrategy on EURUSD/H1 reason=explore_cap_day (day=2/2, week=2/3)
Diagnosis:
- Check counters:
  Import-Csv "...\explore_counts_day.csv" | Where-Object {$_.key -like "*ADXStrategy|EURUSD|60*"}
- Verify caps not too restrictive:
  ExploreMaxPerSlicePerDay = 2 // Very conservative
Solutions:
- Increase caps (e.g., 5/10 for PaperEA, 1/2 for LiveEA)
- Reset counters manually (delete CSV files)
- Use NoConstraintsMode=true for initial data collection (bypasses all caps)
Issue: Exploration not triggering
Symptoms:
GATE: blocked ADXStrategy on EURUSD/H1 reason=below_winrate (WR=0.42 < 0.50)
(Slice exists but fails threshold, no exploration bypass)
Diagnosis: Exploration only triggers when slice missing entirely, not when slice exists but fails thresholds.
Solutions:
- This is correct behavior (no-slice-only bypass)
- To allow trading despite low performance:
- Lower insights thresholds temporarily
- Use NoConstraintsMode=true
- Delete slice from insights.json to force exploration
- Wait for performance to improve
Issue: NoConstraintsMode not working
Symptoms:
Still seeing gate blocks despite NoConstraintsMode=true
Diagnosis:
- Verify setting applied:
  if(NoConstraintsMode) Print("NoConstraintsMode is TRUE");
- Check if using unified system:
  input bool UseUnifiedSystem = true; // Required for NoConstraintsMode
Solutions:
- Ensure UseUnifiedSystem=true
- Restart EA after changing NoConstraintsMode
- Check logs for shadow decisions:
  [SHADOW] gate=... result=block (but allowing)
See Also:
- Configuration-Reference.md - Policy and exploration parameters
- Execution-Pipeline.md - When policy/exploration evaluated
- Observability-Guide.md - Policy/exploration telemetry events