PatrickNjoroge743 bffab07776 new files added
2026-06-09 22:26:00 +03:00

Article-22063-Alternative-Bars-For-Market-Intent

This repository is an article-derived reference project based on the original MQL5 article. It does not claim to reproduce the full original source code unless files are explicitly attached.

Overview

Reference repository for the MQL5 article on alternative market data sampling methods inspired by Chapter 2 of Advances in Financial Machine Learning. The article presents a dual implementation:

  • a Python batch-processing pipeline for multi-year tick histories
  • an MQL5 object-oriented library for live tick-by-tick bar construction inside an Expert Advisor

The covered bar families include standard bars and imbalance-based information bars, with emphasis on data cleaning, scalable storage/loading, adaptive threshold calibration, and parity verification between Python and MQL5 outputs.

Original Article

Repository Purpose

Treat this repository as a technical reference and reconstruction of the architecture the article describes for alternative bar construction.

Its purpose is to document and organize:

  • standard bar construction: time, tick, volume, dollar
  • information bar construction: tick imbalance, volume imbalance, dollar imbalance
  • preprocessing of tick streams before bar generation
  • scalable Python-side storage/loading using Parquet and Dask
  • MQL5 live bar construction with restart-safe state persistence
  • Python/MQL5 parity validation concepts
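
Of the bar families listed above, the activity-based standard bars (tick, volume, dollar) share one mechanism: accumulate a measure of activity and close the bar when it reaches a fixed threshold. The following is a minimal pure-Python sketch of that idea, not the article's code. The names `Bar` and `activity_bars`, the `(price, volume)` tick layout, and the `>=` close rule are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

Tick = Tuple[float, float]  # (price, volume); hypothetical minimal tick layout


@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float
    volume: float
    tick_count: int


def activity_bars(ticks: Iterable[Tick], kind: str, threshold: float) -> List[Bar]:
    """Close a bar once the accumulated activity measure reaches `threshold`.

    kind: "tick" counts ticks, "volume" sums volume, "dollar" sums price * volume.
    """
    measures = {
        "tick": lambda price, volume: 1.0,
        "volume": lambda price, volume: volume,
        "dollar": lambda price, volume: price * volume,
    }
    measure = measures[kind]
    bars: List[Bar] = []
    current: Optional[Bar] = None
    accumulated = 0.0
    for price, volume in ticks:
        if current is None:
            # First tick of a new bar sets all four OHLC fields.
            current = Bar(price, price, price, price, 0.0, 0)
        current.high = max(current.high, price)
        current.low = min(current.low, price)
        current.close = price
        current.volume += volume
        current.tick_count += 1
        accumulated += measure(price, volume)
        if accumulated >= threshold:  # assumed close rule: reach or exceed
            bars.append(current)
            current = None
            accumulated = 0.0
    return bars  # a trailing partial bar is dropped in this sketch
```

Calling `activity_bars(ticks, "dollar", 300.0)` on a short list of `(price, volume)` pairs returns only completed bars. A live implementation would keep the partial bar as state rather than discard it.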

Key Concepts

  • Clock-based sampling can inject heteroscedasticity by forcing equal time windows with unequal information content.
  • Activity-based bars close on fixed market activity instead of elapsed time.
  • Information bars close on directional imbalance, not only raw activity.
  • Tick-rule classification assigns direction using price changes and carry-forward logic on unchanged prices.
  • Imbalance thresholds are updated using incremental exponentially weighted means.
  • Time bars may generate phantom zero-tick bars across inactive periods and require explicit filtering.
  • Live MQL5 implementations must preserve state across terminal restarts to avoid threshold resets.
  • Parquet partitioning and Dask loading are used to avoid loading full multi-year tick datasets into memory.
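
The tick rule, the incremental exponentially weighted mean, and the imbalance close condition described above can be sketched together for tick imbalance bars. This is an illustration under stated assumptions rather than the article's implementation: the first tick is seeded with sign +1, the threshold is taken as expected ticks per bar times expected absolute imbalance per tick (the usual formulation in Advances in Financial Machine Learning), both estimates are updated only when a bar closes, and `alpha` is a hypothetical smoothing parameter. The seed names `exp_ticks_init` and `exp_imbalance_init` follow the article's API.

```python
from typing import List, Optional, Sequence


def tick_rule(prices: Sequence[float], initial_sign: int = 1) -> List[int]:
    """+1 on an uptick, -1 on a downtick, previous sign carried forward if unchanged."""
    signs: List[int] = []
    prev_sign = initial_sign  # assumed seed for the very first tick
    prev_price: Optional[float] = None
    for price in prices:
        if prev_price is not None and price != prev_price:
            prev_sign = 1 if price > prev_price else -1
        signs.append(prev_sign)
        prev_price = price
    return signs


def ewm_update(mean: float, value: float, alpha: float) -> float:
    """One incremental step of an exponentially weighted mean."""
    return (1.0 - alpha) * mean + alpha * value


def tick_imbalance_bar_ends(
    prices: Sequence[float],
    exp_ticks_init: float,
    exp_imbalance_init: float,
    alpha: float = 0.1,
) -> List[int]:
    """Return the index of the last tick of every completed tick imbalance bar."""
    exp_ticks = float(exp_ticks_init)          # expected bar length in ticks
    exp_imbalance = float(exp_imbalance_init)  # expected |imbalance| per tick
    imbalance = 0.0
    ticks_in_bar = 0
    ends: List[int] = []
    for i, sign in enumerate(tick_rule(prices)):
        imbalance += sign
        ticks_in_bar += 1
        threshold = exp_ticks * exp_imbalance
        if abs(imbalance) > threshold:
            ends.append(i)
            # Update both estimates from the bar that just closed.
            exp_ticks = ewm_update(exp_ticks, ticks_in_bar, alpha)
            exp_imbalance = ewm_update(
                exp_imbalance, abs(imbalance) / ticks_in_bar, alpha
            )
            imbalance = 0.0
            ticks_in_bar = 0
    return ends
```

Volume and dollar imbalance bars would follow the same loop with `sign * volume` or `sign * price * volume` accumulated instead of the bare sign. Because the two estimates are the only state that survives a bar close, they are also what a live implementation would need to persist across restarts.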

Algorithm / Architecture Summary

The article describes the following processing flow:

  1. Tick storage
  • Raw data is stored in partitioned Parquet files by symbol/year/month.
  • Compression is performed with PyArrow using zstd.
  2. Data loading
  • Python uses Dask read_parquet() with date filters.
  • Only relevant partitions are materialized into pandas.
  3. Tick cleaning
  • Ensure the index is a DatetimeIndex
  • Normalize the timezone
  • Remove invalid prices and non-positive spreads
  • Drop rows containing NaN values or NaT timestamps
  • Remove duplicate timestamps with keep="last"
  • Sort chronologically
  4. Standard bars
  • Time bars via resampling
  • Tick bars via fixed tick counts
  • Volume bars via cumulative volume thresholds
  • Dollar bars via cumulative price×volume thresholds
  • Time bars additionally drop rows where tick_volume is zero
  5. Information bars
  • The tick rule computes a directional sign for each tick
  • The signed metric is accumulated as imbalance
  • A bar closes when the absolute imbalance exceeds an adaptive threshold
  • The threshold depends on EWM estimates of the expected bar length and the expected absolute imbalance per bar
  6. Unified Python API
  • make_bars() dispatches to the requested bar type
  • Information bars can be auto-calibrated using target_timeframe
  • Explicit seeds (exp_ticks_init, exp_imbalance_init) remain available
  7. MQL5 runtime design
  • Abstract base class for common OHLC/volume/spread accumulation
  • Derived classes implement close semantics for each bar family
  • Example EA processes ticks in OnTick()
  • CSV append output is used for live logging
  • State persistence is used for restart recovery, especially for imbalance bars
  8. Parity testing
  • Python and MQL5 bars are aligned by tick_num
  • Comparison checks OHLC and volume fields for exact or near-exact equality
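
The parity-testing step above can be illustrated with a small comparison routine. This is a sketch under assumptions, not the article's test harness: bars are represented as plain dictionaries, the field names `open`, `high`, `low`, `close`, and `volume` are hypothetical column names, and `parity_report` is a name introduced here. Only `tick_num` as the alignment key comes from the article's description.

```python
import math
from typing import Dict, Iterable, List

FIELDS = ("open", "high", "low", "close", "volume")  # assumed column names


def parity_report(
    python_bars: Iterable[Dict[str, float]],
    mql5_bars: Iterable[Dict[str, float]],
    rel_tol: float = 0.0,
    abs_tol: float = 1e-8,
) -> List[tuple]:
    """Return (tick_num, field, python_value, mql5_value) for every mismatch.

    A bar present on only one side is reported with field "missing" and two
    booleans saying which side has it.
    """
    py = {bar["tick_num"]: bar for bar in python_bars}
    mq = {bar["tick_num"]: bar for bar in mql5_bars}
    mismatches: List[tuple] = []
    for tick_num in sorted(py.keys() | mq.keys()):
        a, b = py.get(tick_num), mq.get(tick_num)
        if a is None or b is None:
            mismatches.append((tick_num, "missing", a is not None, b is not None))
            continue
        for field in FIELDS:
            # Near-exact equality: tolerate tiny floating-point differences.
            if not math.isclose(a[field], b[field], rel_tol=rel_tol, abs_tol=abs_tol):
                mismatches.append((tick_num, field, a[field], b[field]))
    return mismatches
```

An empty list means the two outputs agree within tolerance. Setting `abs_tol=0.0` turns the check into exact equality. The MQL5 side could be loaded from the EA's CSV log with `csv.DictReader` after converting each field to `float`.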

Mentioned or Attached Files

Explicitly attached files

  • AlternativeBars\CBarConstructor.mqh — abstract base class, SBar struct, persistence interface
  • AlternativeBars\CStandardBars.mqh — standard bar classes: time, tick, volume, dollar
  • AlternativeBars\CImbalanceBars.mqh — tick rule and imbalance bar implementation with EWM threshold state
  • BarBuilderEA.mq5 — example Expert Advisor for live bar construction and CSV/state handling

Files mentioned in the article text

  • afml/data_structures/bars.py — unified Python make_bars() entry point
  • afml/data_structures/information_bars.py — JIT-compiled information-bar boundary detection
  • afml/data_structures/calibration.py — automatic calibration for imbalance-bar initialization

Statistics

  • Bar types discussed: 10 total in Python
  • Bar types mirrored in MQL5: 7
  • MQL5 header files described: 3
  • Example Expert Advisor described: 1
  • Main pipeline stages highlighted: storage, loading, cleaning, bar construction, calibration, persistence, parity verification

Tags

mql5 python machine-learning tick-data alternative-bars time-bars tick-bars volume-bars dollar-bars imbalance-bars parquet dask numba

Difficulty

Advanced

Limitations

  • This repository is based on article analysis and attached-file descriptions; the original source tree is not reproduced in full unless the listed files are actually present in the repository.
  • The article describes both Python and MQL5 implementations, but only some file paths are explicitly listed as attached.
  • Python package structure, auxiliary utilities, and configuration files may be incomplete or absent.
  • Installation, build, and execution steps should not be assumed beyond what the article explicitly states.
  • If the repository does not contain the attached files listed above, treat it as a documentation-only reconstruction.

Reference