New from Together Research: LLMs can fix the query plans your database optimizer gets wrong, delivering up to 4.78x faster execution.

Cost-based optimizers fail when they miss semantic correlations. A filter that prunes 15M rows down to 2.9M gets applied after a join instead of before it, because the optimizer assumed independence between columns where none existed.

DBPlanBench exposes DataFusion's physical operator graph to an LLM, which applies targeted JSON patch edits to fix join ordering without regenerating the full plan.

On TPC-H and TPC-DS workloads:
→ Up to 4.78x speedup on complex multi-join queries
→ 60.8% of queries improved by more than 5%
→ Build memory reduced from 3.3 GB to 411 MB on a single benchmark query
→ Plans optimized at small scale transfer directly to larger databases

Paper and code are open-source. Link in comments.
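The "targeted JSON patch" idea can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the plan shape, field names, and patch format below are hypothetical stand-ins for DataFusion's actual physical operator graph, and only RFC 6902-style "replace" ops are shown. The point is that a join-ordering fix touches two paths in the plan tree and leaves everything else untouched.

```python
import copy

# Hypothetical physical plan (toy shape, not DataFusion's real format):
# a hash join whose build/probe sides the optimizer got wrong.
plan = {
    "op": "Filter",
    "predicate": "l_shipdate < '1995-01-01'",
    "input": {
        "op": "HashJoin",
        "left": {"op": "Scan", "table": "lineitem", "rows": 15_000_000},
        "right": {"op": "Scan", "table": "orders", "rows": 1_500_000},
    },
}

def resolve(node, pointer):
    """Walk a JSON-pointer-like path ('/input/left') to the parent
    node and the final key."""
    parts = pointer.strip("/").split("/")
    for key in parts[:-1]:
        node = node[key]
    return node, parts[-1]

def apply_patch(plan, ops):
    """Apply a list of RFC 6902-style 'replace' ops to a deep copy of
    the plan, so the original plan is never regenerated or mutated."""
    patched = copy.deepcopy(plan)
    for op in ops:
        if op["op"] == "replace":
            parent, key = resolve(patched, op["path"])
            parent[key] = op["value"]
    return patched

# A targeted edit swapping the join's two inputs -- two small ops
# against specific paths, rather than emitting a whole new plan.
ops = [
    {"op": "replace", "path": "/input/left", "value": plan["input"]["right"]},
    {"op": "replace", "path": "/input/right", "value": plan["input"]["left"]},
]
fixed = apply_patch(plan, ops)
```

After applying the patch, `fixed` has the smaller `orders` scan on the join's left side while `plan` is unchanged, which is the property that makes small LLM-proposed edits cheap to generate and validate.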
This introduces a new layer in the stack. Not replacing the optimizer, but correcting it post-hoc. That hybrid approach is where things get interesting.
Blog: https://www.together.ai/blog/using-llms-to-optimize-database-query-execution
Paper: https://arxiv.org/pdf/2602.10387
Code: https://github.com/BauplanLabs/Making-Databases-Faster-with-LLM-Evolutionary-Sampling