arXiv | Lakshya A Agrawal, Donghyun Lee, Shangyin Tan, Wenjie Ma, Karim Elmaaroufi, Rohit Sandadi, Sanjit A. Seshia, Koushik Sen, Dan Klein, Ion Stoica, Joseph E. Gonzalez, Omar Khattab, Alexandros G. Dimakis, Matei Zaharia | Tue, May 19, 2026, 3:18 AM PDT

One AI System Solves Scheduling, Code, and Design Problems Better

Original: optimize_anything: A Universal API for Optimizing any Text Parameter

Source: arxiv.org

Who: Submitted to arXiv by lead author Lakshya A. Agrawal alongside 13 co-authors from UC Berkeley, including Matei Zaharia, Ion Stoica, Dan Klein, and Koushik Sen — a cluster of researchers behind several foundational open-source systems — plus Omar Khattab, known for work on language-model-driven programming frameworks.

What's new: The team introduces optimize_anything, a single LLM-based system that can improve almost any solution expressed as text, without needing a specialized tool built for each problem. The core idea is that if you can express what "good" looks like as a score, the system can search for a better version of your solution by repeatedly drafting revisions and checking the score, much as a student might rewrite an essay after seeing a rubric grade.

How it works: The system treats every problem the same way: a candidate solution is a piece of text, and a scoring function says how good it is. The model proposes a revision, sees the score plus any explanatory feedback, and iterates. Crucially, the system supports multi-task search: it can work on several related problems at once and transfer lessons learned on one problem to speed up progress on another, similar to how studying for one history exam helps you on a related one. The researchers found that giving the model a written explanation of why a score is what it is, rather than just the number, leads to much faster improvement.
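The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the optimize_anything API: the names `optimize`, `evaluate`, and `propose` are hypothetical, the toy task (matching a target string) is invented, and `propose` is a deterministic stand-in for the language model so the example runs without one. It covers a single task only and does not attempt the multi-task transfer.

```python
from typing import Callable, Tuple

# Hypothetical type aliases for illustration; not taken from the project.
Evaluator = Callable[[str], Tuple[float, str]]   # text -> (score, written feedback)
Proposer = Callable[[str, float, str], str]      # (text, score, feedback) -> revised text

TARGET = "hello"  # invented toy objective


def evaluate(candidate: str) -> Tuple[float, str]:
    """Score a candidate and explain, in words, what is still wrong."""
    padded = candidate.ljust(len(TARGET))[:len(TARGET)]
    matches = sum(a == b for a, b in zip(padded, TARGET))
    score = matches / len(TARGET)
    for i, (a, b) in enumerate(zip(padded, TARGET)):
        if a != b:
            return score, f"position {i} should be {b!r}"
    return score, "all positions match"


def propose(current: str, score: float, feedback: str) -> str:
    """Stand-in for the language model: read the feedback and apply the fix."""
    if not feedback.startswith("position"):
        return current
    index = int(feedback.split()[1])
    char = feedback[-2]  # the character inside the quotes at the end
    padded = current.ljust(len(TARGET))[:len(TARGET)]
    return padded[:index] + char + padded[index + 1:]


def optimize(seed: str, evaluate: Evaluator, propose: Proposer,
             steps: int = 20) -> Tuple[str, float]:
    """Keep the best candidate seen; revise it using its score and feedback."""
    best = seed
    best_score, feedback = evaluate(best)
    for _ in range(steps):
        candidate = propose(best, best_score, feedback)
        score, candidate_feedback = evaluate(candidate)
        if score > best_score:  # accept only strict improvements
            best, best_score, feedback = candidate, score, candidate_feedback
    return best, best_score


best, best_score = optimize("xxxxx", evaluate, propose, steps=10)
print(best, best_score)
```

The stand-in proposer only makes progress because the evaluator returns a written hint alongside the number, which is the same effect the researchers report for explanatory feedback.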

The numbers: Results across six tasks are striking. On one benchmark, the system raised a model's accuracy from 32.5% to 89.5%. It found scheduling algorithms that cut cloud computing costs by 40%. It generated low-level GPU code where 87% of outputs matched or beat the baseline's performance. It also surpassed the previously best-known solution on a classical geometry puzzle involving fitting circles into a fixed space.

Why it matters: The practical implication is that a developer who can write a scoring function — "does this scheduler waste less money?" — no longer needs to design a custom optimization algorithm from scratch. The same engine handles code generation, scheduling, puzzle-solving, and potentially much else. The project is open-sourced as part of the GEPA project at UC Berkeley, which means outside researchers can plug it into their own problems immediately.
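To make "a developer who can write a scoring function" concrete, here is a hedged sketch of what such a function might look like for a scheduling problem. Everything in it is an assumption made up for illustration: the workload, the prices, the rule that long jobs on spot instances are interrupted, the plan format (one `job: instance_type` line per job), and the name `evaluate_plan`. It returns both a score and a written explanation, since the article reports that explanations help the search.

```python
from typing import Dict, Tuple

# Hypothetical workload and prices, invented for illustration only.
JOB_HOURS: Dict[str, float] = {"etl": 2.0, "train": 8.0, "report": 1.0}
PRICE_PER_HOUR: Dict[str, float] = {"spot": 1.0, "on_demand": 3.0}
# Assumption: a spot job longer than this is interrupted and rerun on demand.
SPOT_LIMIT_HOURS = 3.0


def evaluate_plan(plan_text: str) -> Tuple[float, str]:
    """Score a plan written as 'job: instance_type' lines; higher is better."""
    plan: Dict[str, str] = {}
    for line in plan_text.strip().splitlines():
        job, _, kind = line.partition(":")
        plan[job.strip()] = kind.strip()

    missing = sorted(set(JOB_HOURS) - set(plan))
    if missing:
        return float("-inf"), f"plan does not place jobs: {', '.join(missing)}"

    total = 0.0
    notes = []
    for job, hours in JOB_HOURS.items():
        kind = plan[job]
        if kind not in PRICE_PER_HOUR:
            return float("-inf"), f"unknown instance type {kind!r} for job {job}"
        cost = hours * PRICE_PER_HOUR[kind]
        if kind == "spot" and hours > SPOT_LIMIT_HOURS:
            # Pay for the wasted spot run plus the on-demand rerun.
            cost += hours * PRICE_PER_HOUR["on_demand"]
            notes.append(f"{job} was interrupted on spot and rerun on demand")
        elif kind == "on_demand" and hours <= SPOT_LIMIT_HOURS:
            notes.append(f"{job} is short enough to run on spot")
        total += cost

    feedback = "; ".join(notes) if notes else "no obvious waste found"
    # Negate the cost so that cheaper plans receive higher scores.
    return -total, f"total cost ${total:.2f}; {feedback}"


score, feedback = evaluate_plan("etl: on_demand\ntrain: on_demand\nreport: on_demand")
print(score, feedback)
```

In this sketch the candidate is a placement plan rather than a scheduling algorithm, which keeps the evaluator short; scoring candidate code would follow the same shape, with the evaluator running the code against a workload before computing the cost.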