Replaces `rglob` with `os.walk` using `topdown=True` in `scripts/ci_validate_repo.py`.
This allows pruning large ignored directories (like `node_modules` or `.git`) *before*
traversal, significantly reducing the number of files scanned.
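
A minimal sketch of the pruning technique, not the script's actual code. The `IGNORED_DIRS` set and the `iter_files` helper are hypothetical names used only for illustration:

```python
import os
from pathlib import Path
from typing import Iterator

# Hypothetical ignore list; the real script's set may differ.
IGNORED_DIRS = {".git", "node_modules"}


def iter_files(root: Path) -> Iterator[Path]:
    """Yield files under root without descending into ignored directories."""
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        # Slice assignment mutates the list os.walk holds, so the pruned
        # directories are never visited. Rebinding `dirnames` to a new
        # list would have no effect on the traversal.
        dirnames[:] = [d for d in dirnames if d not in IGNORED_DIRS]
        for name in filenames:
            yield Path(dirpath) / name
```

Pruning only works with `topdown=True`, because in bottom-up mode the subdirectories have already been walked by the time `dirnames` is yielded.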

Performance:
- Benchmarked ~2.4x faster in the current environment (0.045s -> 0.019s).
- Impact scales with the size of ignored directories.

Verification:
- Added `scripts/verify_secret_scanning.py` to ensure secret scanning logic
  correctly detects secrets in tracked files and ignores them in excluded directories.

Improved memory efficiency in `scripts/ci_validate_repo.py` by replacing
`read_bytes()` with chunked binary reads (64 KB per chunk) for NUL byte detection.
This avoids loading entire source files into memory during CI runs.
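
The chunked NUL check could look roughly like the sketch below. The function name `contains_nul` and the `CHUNK_SIZE` constant are assumptions for illustration, not names taken from the script:

```python
from pathlib import Path

CHUNK_SIZE = 64 * 1024  # 64 KB, the chunk size mentioned in this description


def contains_nul(path: Path, chunk_size: int = CHUNK_SIZE) -> bool:
    """Return True if the file contains a NUL byte, reading one chunk at a time."""
    with path.open("rb") as fh:
        # read() returns b"" at end of file, which ends the loop.
        while chunk := fh.read(chunk_size):
            if b"\0" in chunk:
                return True  # stop early; the rest of the file is never read
    return False
```

Because the search target is a single byte, a match cannot straddle a chunk boundary, so no overlap between chunks is needed.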
- Consolidate file validation logic into a single loop
- Check file size before reading content to prevent loading large files into memory
- Add error handling for file reading
- Reduce I/O operations and loop iterations

This improves the efficiency of the CI validation script, especially for repositories with many files or when large files are accidentally introduced.
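
As an illustration of the consolidated loop, here is a hedged sketch that checks the size first, reads only files under the limit, and reports read failures instead of crashing. `MAX_FILE_BYTES`, `validate_files`, and the 1 MiB threshold are hypothetical; the script's real limit and message format are not stated here:

```python
from pathlib import Path
from typing import Iterable, List

MAX_FILE_BYTES = 1024 * 1024  # hypothetical 1 MiB limit
CHUNK_SIZE = 64 * 1024


def validate_files(paths: Iterable[Path], max_bytes: int = MAX_FILE_BYTES) -> List[str]:
    """Run size and NUL-byte checks in one pass, collecting error messages."""
    errors: List[str] = []
    for path in paths:
        try:
            # stat() is cheap and lets oversized files be rejected unread.
            size = path.stat().st_size
            if size > max_bytes:
                errors.append(f"{path}: file too large ({size} bytes)")
                continue
            with path.open("rb") as fh:
                while chunk := fh.read(CHUNK_SIZE):
                    if b"\0" in chunk:
                        errors.append(f"{path}: contains NUL byte")
                        break
        except OSError as exc:
            # Covers missing files, permission errors, and similar failures.
            errors.append(f"{path}: could not read ({exc})")
    return errors
```

Collecting errors in a list, instead of raising on the first one, lets a single CI run report every offending file.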