
Bolt's Journal

This journal is for CRITICAL, non-routine performance learnings ONLY.

  • Codebase-specific bottlenecks
  • Failed optimizations (and why)
  • Surprising performance patterns
  • Rejected changes with valuable lessons

2024-07-25 - MQL5 Native Functions vs. Scripted Loops

Learning: My assumption that a manual MQL5 loop over a pre-cached array would be faster than built-in functions like iHighest() and iLowest() was incorrect. The code review pointed out that MQL5's native built-in functions are implemented in highly optimized C++ and are significantly faster than loops executed in the MQL5 scripting layer; the original code comment making this claim was correct. Action: Always prefer MQL5's built-in, native functions for calculations like finding highs/lows over manual loops, even if the data is already in a local array. The performance gain from the native implementation outweighs the overhead of the function call.
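MQL5 can't be run here, but the same native-versus-interpreted tradeoff can be illustrated by analogy in Python, where C-implemented builtins like max() beat an equivalent interpreted loop (a sketch of the general principle, not of the MQL5 code):

```python
# Analogy in Python: a C-implemented builtin vs. a hand-rolled loop.
# (Illustrative only -- the journal entry is about MQL5's iHighest()/iLowest().)
import timeit

prices = [float(i % 997) for i in range(100_000)]

def manual_highest(values):
    """Find the maximum with an interpreted loop."""
    best = values[0]
    for v in values[1:]:
        if v > best:
            best = v
    return best

# Both approaches agree on the answer...
assert manual_highest(prices) == max(prices)

# ...but the native implementation is typically several times faster.
loop_t = timeit.timeit(lambda: manual_highest(prices), number=20)
native_t = timeit.timeit(lambda: max(prices), number=20)
print(f"loop: {loop_t:.3f}s  builtin max(): {native_t:.3f}s")
```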

2026-01-23 - Python File System Checks

Learning: Checking for file existence (os.path.exists) before getting metadata (os.path.getmtime) introduces a redundant syscall. os.stat() provides both pieces of information in a single syscall, and wrapping it in try...except follows the EAFP (Easier to Ask for Forgiveness than Permission) pattern, which is more Pythonic and slightly faster, especially in high-frequency loops or handlers. Action: Use os.stat() when both existence and metadata are needed, wrapping it in a try...except OSError block.
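A minimal sketch of the pattern (get_mtime_if_exists is a hypothetical helper name, not from the codebase):

```python
# One os.stat() call replaces an exists() check plus a getmtime() call.
import os

def get_mtime_if_exists(path):
    """Return the file's mtime, or None if it doesn't exist (EAFP)."""
    try:
        return os.stat(path).st_mtime  # existence + metadata in one syscall
    except OSError:                    # covers FileNotFoundError, PermissionError, ...
        return None
```

Catching OSError rather than the narrower FileNotFoundError also handles races where the file disappears or becomes unreadable between calls.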

2026-01-26 - yfinance Bulk Download

Learning: Calling yfinance's Ticker.history in a loop is significantly slower than a single yf.download call with a list of tickers, because each Ticker.history call issues its own sequential HTTP request. yf.download with group_by='ticker' also provides a consistent MultiIndex structure even for single tickers, simplifying bulk processing. Action: Always prefer yf.download(tickers) over iterating yf.Ticker(t) when fetching data for multiple symbols.
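A sketch of the bulk pattern; the network call is shown commented out so this runs offline, and the stand-in frame assumes the (ticker, field) column MultiIndex shape that group_by='ticker' is described above as returning:

```python
# Bulk download vs. per-ticker loop.  `frame` mimics the MultiIndex layout
# of yf.download(..., group_by='ticker') so the slicing pattern is runnable
# without network access (requires pandas; yfinance only for the real call).
import pandas as pd

tickers = ["AAPL", "MSFT"]

# One HTTP round-trip for all symbols (requires the yfinance package):
# import yfinance as yf
# frame = yf.download(tickers, period="5d", group_by="ticker")

# Offline stand-in with the same (ticker, field) column MultiIndex:
idx = pd.date_range("2026-01-05", periods=2)
cols = pd.MultiIndex.from_product([tickers, ["Open", "Close"]])
frame = pd.DataFrame([[10.0, 11.0, 20.0, 21.0],
                      [10.5, 11.5, 20.5, 21.5]], index=idx, columns=cols)

# Per-ticker slices come out uniformly, even for a single symbol:
for t in tickers:
    closes = frame[t]["Close"]
    print(t, closes.iloc[-1])
```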

2026-02-09 - Git Command Performance

Learning: git for-each-ref is a powerful tool for batch data retrieval, but without filtering it processes all refs, including thousands of stale merged branches in older repositories. Computing ahead-behind counts for every ref is O(N) in the total number of branches; filtering first reduces this to O(M) in the number of active branches, which can be dramatically smaller. Action: Always filter git for-each-ref with --no-merged (or --merged, depending on the use case) when only a subset of branches is of interest, especially when expensive formatting options like ahead-behind are used.
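A self-contained demo of the filter, assuming git >= 2.28 for `git init -b` (branch names and paths are illustrative; it builds a throwaway repo so the command has refs to act on):

```shell
# Demo: --no-merged prunes stale branches before any expensive per-ref work.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b main
git -c user.name=demo -c user.email=d@example.com commit -q --allow-empty -m base

git branch stale                 # tip already reachable from main -> filtered out
git checkout -q -b active
git -c user.name=demo -c user.email=d@example.com commit -q --allow-empty -m work
git checkout -q main

# Only branches that still carry unmerged work reach the formatter:
git for-each-ref refs/heads --no-merged=main --format='%(refname:short)'
```

This prints only `active`. With git >= 2.41 the expensive part can be added to the same filtered call via the `%(ahead-behind:main)` format atom.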

2026-02-14 - Markdown Parser Reuse

Learning: markdown.markdown() is a convenience function that re-initializes the Markdown parser class on every call. In high-frequency or repetitive rendering scenarios (like a dashboard), this adds significant overhead (~10% in this case). Reusing a markdown.Markdown instance via threading.local() (for thread safety in WSGI apps) eliminates this initialization cost. Action: When using the markdown library in a web server or loop, instantiate markdown.Markdown once per thread/process and reuse it with .reset() instead of calling the top-level markdown.markdown() shortcut.
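A minimal sketch of the per-thread reuse pattern (render is a hypothetical helper name; requires the Python-Markdown package):

```python
# One Markdown instance per thread, reused with .reset(), instead of the
# markdown.markdown() shortcut that rebuilds the parser on every call.
import threading

import markdown

_local = threading.local()

def render(text):
    md = getattr(_local, "md", None)
    if md is None:
        # Built once per thread; extension list is illustrative.
        md = _local.md = markdown.Markdown(extensions=["tables"])
    md.reset()                # clear per-document state before reuse
    return md.convert(text)
```

threading.local() keeps one parser per worker thread, so concurrent WSGI requests never share mutable parser state.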

2026-02-15 - Repository-Wide File Scanning

Learning: For repository-wide file traversal in Python, os.walk() is consistently faster than Path.rglob("*") as it avoids the overhead of instantiating Path objects for every filesystem entry. Furthermore, modifying the dirs list in-place during os.walk() is the most efficient way to skip large or irrelevant directories (like .git or node_modules). For pattern matching (like secret scanning) across many lines, combining multiple regexes into a single pattern with named groups (?P<name>pattern) allows for single-pass scanning per line, further reducing CPU cycles. Action: Use os.walk() with in-place dirs modification for repo-wide traversal. Combine multiple regex patterns into a single named-group regex for efficient multi-pattern matching.
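A sketch combining both techniques; the directory list and the two toy patterns are illustrative, not the validation script's real rules:

```python
# os.walk with in-place pruning plus one combined, named-group regex
# scanned line by line (low memory, single pass per line).
import os
import re

SKIP_DIRS = {".git", "node_modules", "__pycache__"}

# One compiled pattern; the name of the matching group identifies the rule.
SECRET_RE = re.compile(
    r"(?P<aws_key>AKIA[0-9A-Z]{16})"
    r"|(?P<generic_token>token\s*=\s*['\"][^'\"]{12,}['\"])"
)

def scan_repo(root):
    findings = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped trees.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    for lineno, line in enumerate(fh, 1):  # line-by-line read
                        m = SECRET_RE.search(line)
                        if m:
                            findings.append((path, lineno, m.lastgroup))
            except OSError:
                continue
    return findings
```

Assigning to dirnames[:] (rather than rebinding dirnames) is what actually stops os.walk from descending, since os.walk re-reads the mutated list.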