arXiv
Gueter Josmy Faure, Min-Hung Chen, Jia-Fong Yeh, Hung-Ting Su, Winston H. Hsu
Tue, May 19, 2026, 6:40 AM PDT

New benchmark tests video AI on detailed human actions

Original: FineBench: Benchmarking and Enhancing Vision-Language Models for Fine-grained Human Activity Understanding

Source: arxiv.org
