# Article-22063-Alternative-Bars-For-Market-Intent

This repository is an article-derived reference project based on the original MQL5 article. It does not claim to reproduce the full original source code unless files are explicitly attached.

## Overview

Reference repository for the MQL5 article on alternative market data sampling methods inspired by Chapter 1 of *Advances in Financial Machine Learning*. The article presents a dual implementation:

- a Python batch-processing pipeline for multi-year tick histories
- an MQL5 object-oriented library for live tick-by-tick bar construction inside an Expert Advisor

The covered bar families include standard bars and imbalance-based information bars, with emphasis on data cleaning, scalable storage/loading, adaptive threshold calibration, and parity verification between Python and MQL5 outputs.

## Original Article

- **Article ID:** 22063
- **Author:** Patrick Murimi Njoroge
- **Publication date:** 2026.05.08
- **Category:** Machine learning
- **URL:** https://www.mql5.com/en/articles/22063

## Repository Purpose

This repository should be treated as a technical reference/reconstruction of the article’s described architecture for alternative bar construction.

Its purpose is to document and organize:

- standard bar construction: time, tick, volume, dollar
- information bar construction: tick imbalance, volume imbalance, dollar imbalance
- preprocessing of tick streams before bar generation
- scalable Python-side storage/loading using Parquet and Dask
- MQL5 live bar construction with restart-safe state persistence
- Python/MQL5 parity validation concepts
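
As an illustration of the activity-based standard bars listed above (tick, volume, dollar), the following is a minimal sketch and not the article's code. The function names `threshold_bar_ids` and `build_bars`, the column names `price` and `volume`, and the choice to close a bar once the running sum reaches the threshold (with the closing tick included in that bar) are all assumptions made for this example.

```python
import numpy as np
import pandas as pd


def threshold_bar_ids(metric: np.ndarray, threshold: float) -> np.ndarray:
    """Label each tick with a bar id; a bar closes once the running sum reaches `threshold`."""
    bar_ids = np.empty(len(metric), dtype=np.int64)
    running, bar = 0.0, 0
    for i, value in enumerate(metric):
        running += value
        bar_ids[i] = bar           # the closing tick belongs to the bar it closes (assumption)
        if running >= threshold:
            bar += 1
            running = 0.0
    return bar_ids


def build_bars(ticks: pd.DataFrame, kind: str, threshold: float) -> pd.DataFrame:
    """Aggregate ticks into tick, volume, or dollar bars (hypothetical helper)."""
    if kind == "tick":
        metric = np.ones(len(ticks))                      # every tick counts as 1
    elif kind == "volume":
        metric = ticks["volume"].to_numpy(dtype=float)    # traded volume per tick
    elif kind == "dollar":
        metric = (ticks["price"] * ticks["volume"]).to_numpy(dtype=float)
    else:
        raise ValueError(f"unknown bar kind: {kind}")

    grouped = ticks.groupby(threshold_bar_ids(metric, threshold))
    bars = grouped["price"].agg(open="first", high="max", low="min", close="last")
    bars["volume"] = grouped["volume"].sum()
    bars["tick_count"] = grouped.size()
    return bars
```

In this sketch the final bar is emitted even if its threshold was never reached, so a caller that needs only completed bars would have to drop it.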

## Key Concepts

- Clock-based sampling can inject heteroscedasticity by forcing equal time windows with unequal information content.
- Activity-based bars close on fixed market activity instead of elapsed time.
- Information bars close on directional imbalance, not only raw activity.
- Tick-rule classification assigns direction using price changes and carry-forward logic on unchanged prices.
- Imbalance thresholds are updated using incremental exponentially weighted means.
- Time bars may generate phantom zero-tick bars across inactive periods and require explicit filtering.
- Live MQL5 implementations must preserve state across terminal restarts to avoid threshold resets.
- Parquet partitioning and Dask loading are used to avoid loading full multi-year tick datasets into memory.
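
The tick-rule and incremental EWM concepts above can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function names, the `initial_sign` used for the very first tick, and the smoothing factor `alpha` are assumptions, and the article's own seeding and weighting may differ.

```python
from typing import Iterable, List


def tick_rule(prices: Iterable[float], initial_sign: int = 1) -> List[int]:
    """Return +1/-1 per tick: the sign of the price change, carried forward when unchanged."""
    signs: List[int] = []
    last_sign = initial_sign       # assumed seed for the first tick, which has no predecessor
    prev_price = None
    for price in prices:
        if prev_price is not None:
            if price > prev_price:
                last_sign = 1
            elif price < prev_price:
                last_sign = -1
            # unchanged price: keep last_sign (carry-forward)
        signs.append(last_sign)
        prev_price = price
    return signs


def ewm_update(mean: float, value: float, alpha: float) -> float:
    """One incremental step of an exponentially weighted mean with smoothing factor alpha."""
    return alpha * value + (1.0 - alpha) * mean
```

For example, `tick_rule([1.0, 1.1, 1.1, 1.0, 1.0, 1.2])` returns `[1, 1, 1, -1, -1, 1]`, and `ewm_update(10.0, 20.0, 0.5)` returns `15.0`. Because the update needs only the previous mean, the running estimate is a single number that can be saved and restored, which is what makes restart-safe persistence practical.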

## Algorithm / Architecture Summary

The article describes the following processing flow:

1. **Tick storage**
   - Raw data is stored in partitioned Parquet files by symbol/year/month.
   - Compression is performed with PyArrow using `zstd`.

2. **Data loading**
   - Python uses Dask `read_parquet()` with date filters.
   - Only relevant partitions are materialized into pandas.

3. **Tick cleaning**
   - Ensure the index is a `DatetimeIndex`
   - Normalize timezone
   - Remove invalid prices and non-positive spreads
   - Drop rows with NaN values or NaT timestamps
   - Remove duplicate timestamps with `keep="last"`
   - Sort chronologically

4. **Standard bars**
   - Time bars via resampling
   - Tick bars via fixed tick counts
   - Volume bars via cumulative volume thresholds
   - Dollar bars via cumulative price×volume thresholds
   - Time bars additionally drop rows where `tick_volume` is zero

5. **Information bars**
   - Tick rule computes directional sign
   - Signed metric is accumulated as imbalance
   - A bar closes when absolute imbalance exceeds an adaptive threshold
   - The threshold depends on EWM estimates of:
     - expected bar length
     - expected absolute imbalance per bar

6. **Unified Python API**
   - `make_bars()` dispatches to the requested bar type
   - Information bars can be auto-calibrated using `target_timeframe`
   - Explicit seeds (`exp_ticks_init`, `exp_imbalance_init`) remain available

7. **MQL5 runtime design**
   - Abstract base class for common OHLC/volume/spread accumulation
   - Derived classes implement close semantics for each bar family
   - Example EA processes ticks in `OnTick()`
   - CSV append output is used for live logging
   - State persistence is used for restart recovery, especially for imbalance bars

8. **Parity testing**
   - Python and MQL5 bars are aligned by `tick_num`
   - Comparison checks OHLC and volume fields for exact or near-exact equality
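
The tick-cleaning stage (step 3) can be sketched in pandas as follows. This is a hedged reconstruction from the list above, not the article's implementation: the function name `clean_ticks`, the `bid` and `ask` column names, the choice of UTC as the normalized timezone, and the definition of spread as `ask - bid` are assumptions.

```python
import pandas as pd


def clean_ticks(ticks: pd.DataFrame) -> pd.DataFrame:
    """Apply the cleaning steps listed above to a tick frame with bid/ask columns (assumed)."""
    df = ticks.copy()

    # Ensure a DatetimeIndex and normalize to UTC (naive stamps are assumed to be UTC).
    df.index = pd.to_datetime(df.index, utc=True, errors="coerce")

    # Drop rows whose timestamp could not be parsed (NaT) or whose quotes are missing.
    df = df[df.index.notna()]
    df = df.dropna(subset=["bid", "ask"])

    # Remove invalid prices and non-positive spreads.
    valid = (df["bid"] > 0) & (df["ask"] > 0) & (df["ask"] - df["bid"] > 0)
    df = df[valid]

    # Keep the last row for each duplicated timestamp, then sort chronologically.
    df = df[~df.index.duplicated(keep="last")]
    return df.sort_index()
```

Deduplicating before sorting means `keep="last"` refers to the last occurrence in the original file order, which is one reasonable reading of the step list.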

## Mentioned or Attached Files

### Explicitly attached files

- `AlternativeBars\CBarConstructor.mqh`: abstract base class, `SBar` struct, persistence interface
- `AlternativeBars\CStandardBars.mqh`: standard bar classes (time, tick, volume, dollar)
- `AlternativeBars\CImbalanceBars.mqh`: tick rule and imbalance bar implementation with EWM threshold state
- `BarBuilderEA.mq5`: example Expert Advisor for live bar construction and CSV/state handling

### Files mentioned in the article text

- `afml/data_structures/bars.py`: unified Python `make_bars()` entry point
- `afml/data_structures/information_bars.py`: JIT-compiled information-bar boundary detection
- `afml/data_structures/calibration.py`: automatic calibration for imbalance-bar initialization

## Statistics

- **Bar types discussed:** 10 total in Python
- **Bar types mirrored in MQL5:** 7
- **MQL5 header files described:** 3
- **Example Expert Advisor described:** 1
- **Main pipeline stages highlighted:** storage, loading, cleaning, bar construction, calibration, persistence, parity verification

## Tags

`mql5` `python` `machine-learning` `tick-data` `alternative-bars` `time-bars` `tick-bars` `volume-bars` `dollar-bars` `imbalance-bars` `parquet` `dask` `numba`

## Difficulty

Advanced

## Limitations

- This repository is based on article analysis and attached-file descriptions; the original source tree is not reproduced here unless the listed files are actually present in the repository.
- The article describes both Python and MQL5 implementations, but only some file paths are explicitly listed as attached.
- Python package structure, auxiliary utilities, and configuration files may be incomplete or absent.
- Installation, build, and execution steps should not be assumed beyond what the article explicitly states.
- If the repository does not contain the attached files listed above, treat it as a documentation-only reconstruction.

## Reference

- Patrick Murimi Njoroge, [“Beyond the Clock (Part 1): Building Activity and Imbalance Bars in Python and MQL5”](https://www.mql5.com/en/articles/22063), MQL5 article 22063, 2026.05.08.