arXiv
Jongoh Jeong, Hoyong Kwon, Minseok Kim, Kuk-Jin Yoon
Fri, May 22, 2026, 3:41 AM PDT

A faster way to compress vision-language training data into smaller datasets

Original: Multimodal Distribution Matching for Vision-Language Dataset Distillation

Source: arxiv.org
