arXiv | Jinnuo Liu, Yue Peng, Jinhan Niu, Hongyi Wen | Tue, Jun 2, 2026, 6:46 AM PDT
Benchmark reveals how AI learns to use unfamiliar code libraries
Original: Diagnosing Knowledge Gaps in LLM Tool Use: An Agentic Benchmark for Novel API Acquisition
Source: arxiv.org