Rethinking Scaling: Robustness Tradeoffs in Parameter-Matched NLP Models
Abstract
“Benchmarking should no longer reward the model that climbs highest. It should reveal the one that falls slowest. What matters is not peak performance — but how gracefully we fa...