New AI model runs more tests and costs more

Original: We ran Sonnet 5 in Ramp SWE-Bench, and observed that compared to its predecessor it:
