arXiv · Congrui Du, Yang Zhang, Kaizhi Qian, Shiyu Chang · Thu, Jul 2, 2026, 7:22 AM PDT
Method combines text LLM weights to create an instruction-following speech language model without costly instruction tuning
Original: Unlocking Speech-Text Compositional Powers: Instruction-Following Speech Language Models without Instruction Tuning
Source: arxiv.org ↗