arXiv | Maryam Haghifam, Zifan He, Jason Cong, Yizhou Sun | Sun, May 24, 2026, 1:22 AM PDT

Hierarchical memory helps transformers handle long documents faster

Original: H$^{2}$MT: Semantic Hierarchy-Aware Hierarchical Memory Transformer

Source: arxiv.org
