Optimize `scripts/merge_best_prs.py` by using `concurrent.futures.ThreadPoolExecutor` to close duplicate PRs in parallel. This significantly reduces execution time by masking network latency for independent API calls.
Benchmarks show a reduction from ~1.31s to ~0.25s for closing 13 PRs with simulated 0.1s latency.