This small AI re-reads the hard parts — and keeps up with models five times its size
Mira Castellanos, Dawei Lin, Jonas Brekke et al. · ~60s read · arXiv:2605.12473
Read a tricky sentence once and you might miss it. Read it three times and it clicks. Most AI language models never get that option — every word passes through their circuitry exactly once, whether the question is trivial or brutal. This paper changes that: it lets a small model loop back and re-run its own layers on the hard parts, deciding on the fly how many passes each word deserves.
The researchers trained a model with 1.3 billion parameters — small enough to run on a gaming laptop — that routes easy words through one pass and knotty reasoning steps through up to eight. On math and logic tests it matched models five times its size, while spending about a third less computing power than those bigger models need to answer.
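For readers who like to see the idea in code, here is a minimal toy sketch of per-word looping. It is not the paper's implementation: the names `shared_block`, `router_passes`, and `run_adaptive` are made up for illustration, the "difficulty" scores are supplied by hand rather than learned, and the only detail taken from the summary is the cap of eight passes.

```python
import math

MAX_PASSES = 8  # cap on passes per word, as described in the summary above


def shared_block(state):
    """Toy stand-in for the model's reused layers (hypothetical transform)."""
    return [math.tanh(0.9 * x + 0.1) for x in state]


def router_passes(difficulty):
    """Hypothetical router: map a difficulty score in [0, 1] to 1..MAX_PASSES.

    A real router would be a small trained network; here it is a fixed rule.
    """
    clipped = min(max(difficulty, 0.0), 1.0)
    return 1 + round(clipped * (MAX_PASSES - 1))


def run_adaptive(token_states, difficulties):
    """Re-run the same block on each word as many times as the router asks."""
    outputs, passes_used = [], []
    for state, difficulty in zip(token_states, difficulties):
        n = router_passes(difficulty)
        for _ in range(n):
            state = shared_block(state)  # same weights on every pass
        outputs.append(state)
        passes_used.append(n)
    return outputs, passes_used


# Three toy "words": one easy, one medium, one hard (scores are invented).
states = [[0.2, -0.5], [0.7, 0.1], [-0.3, 0.9]]
difficulties = [0.0, 0.4, 1.0]
outputs, passes_used = run_adaptive(states, difficulties)

print("passes per word:", passes_used)
print("total passes:", sum(passes_used), "vs fixed-depth:", MAX_PASSES * len(states))
```

The point of the sketch is the shape of the loop: the same block is reused on every pass, so extra thinking costs time but no extra parameters. The pass counts here come from invented scores and say nothing about the savings reported in the paper.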
The catch: looping takes time, so answers on hard problems arrive a bit slower. The little router that decides what counts as hard needs careful training — get it wrong and the model wastes effort on easy words while rushing the hard ones. And the team only tested up to 1.3 billion parameters; nobody knows yet whether the trick keeps paying off at the scale of the biggest models.
Why you should care: the price of AI is mostly the price of big models. If small models can think harder instead of just growing bigger, a genuinely smart assistant stops costing data-center money and starts costing laptop money, and that decides which of the products you use can afford to be intelligent.