NUNA/.jules/bolt.md
genxapitrading 54269492eb Initial commit
2026-02-03 06:17:08 +00:00

2 KiB

2026-01-29 - Efficient Directory Cleanup with os.walk

Learning: Using pathlib.Path.rglob() to list all files and then sorting the list to clean up empty directories is a performance anti-pattern. It loads the entire directory structure into memory and incurs an O(N log N) sorting cost, which can be very slow for large directories. Action: For tasks requiring a bottom-up traversal of a directory tree (like removing empty directories), always use os.walk(path, topdown=False). This method processes directories from the deepest level up, uses minimal memory, and operates in O(N) time. It's the idiomatic and far more performant solution.

2026-01-29 - Bottlenecks in Pathlib: relative_to and resolve

Learning: pathlib.Path.relative_to() and pathlib.Path.resolve() are surprisingly slow as they perform complex lexical analysis or system calls. In high-frequency loops (e.g., scanning thousands of files), these can become significant bottlenecks. Action: Accumulate relative paths manually during recursion (using the / operator) instead of calling relative_to(root) for every file. Also, avoid redundant resolve() calls by pre-resolving base paths once outside the loop.

2026-01-31 - Field Masking for Google Drive API

Learning: Using the fields parameter in Google Drive API files().list() calls (field masking) significantly reduces the JSON payload size. This reduces network latency and client-side parsing time, especially for large accounts with thousands of files. Action: Always specify the minimum required fields when scanning or listing items from cloud APIs.

2026-01-31 - Safety vs. Performance in mkdirp Caching

Learning: A global cache for mkdirp using a set is unsafe because it doesn't account for external deletions of directories. While it provides a massive speedup (~40x), it sacrifices correctness. Action: Avoid global state for filesystem operations. Use if not p.is_dir(): as a safe way to speed up Path.mkdir(parents=True, exist_ok=True) by ~2x without sacrificing robustness.