arXiv | Chao Wen, Jacqueline Staub, Adish Singla | Tue, Jun 2, 2026, 6:25 AM PDT

Benchmark reveals AI struggles with visual geometry coding tasks

Original: TurtleAI: Benchmarking Multimodal Models for Visual Programming in Turtle Graphics

Source: arxiv.org
