QuasiMoTTo: Quasi-Monte Carlo Test-Time Scaling
Scaling inference compute, by generating many parallel attempts per problem, is a costly but reliable lever for improving language model capabilities. By default these attempts are generated independently, wasting inference compute on redundant solutions. This waste seems unavoidable. After all, independence is what makes parallel sampling trivial to scale. However, this tradeoff is not fundamental: there is a rich design space of samplers that generate correlated but exact samples entirely in parallel. We explore this design space as an avenue for improving sample efficiency in scaling infere
Authors (via arXiv): Michael Y. Li, Anthony Zhan, Kanishk Gandhi, Noah D. Goodman, Emily B. Fox
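To make the abstract's notion of "correlated but exact" parallel samples concrete, here is a minimal sketch using systematic sampling: one random shift is applied to an evenly spaced grid of uniforms, and each uniform is mapped to a category by inverse-CDF lookup. This is a standard variance-reduction construction chosen for illustration. It is an assumption, not necessarily the sampler the paper proposes. The function names `systematic_uniforms` and `sample_categorical` and the toy distribution `probs` are hypothetical.

```python
import bisect
import itertools
import random
from collections import Counter


def systematic_uniforms(n, rng):
    """Return n uniforms that share one random shift (hypothetical helper).

    Each value is marginally Uniform(0, 1), but jointly the values form an
    evenly spaced grid, so they are negatively correlated.
    """
    shift = rng.random()  # single shared source of randomness
    return [(shift + i / n) % 1.0 for i in range(n)]


def sample_categorical(probs, uniforms):
    """Inverse-CDF sampling: map each uniform to a category index."""
    cdf = list(itertools.accumulate(probs))
    last = len(probs) - 1
    # min(...) guards against a cumulative sum that rounds to slightly below 1.
    return [min(bisect.bisect_right(cdf, u), last) for u in uniforms]


# Toy distribution standing in for a model's output distribution (assumption).
probs = [0.5, 0.25, 0.25]
rng = random.Random(0)

uniforms = systematic_uniforms(8, rng)
correlated = sample_categorical(probs, uniforms)
independent = sample_categorical(probs, [rng.random() for _ in range(8)])

correlated_counts = Counter(correlated)
independent_counts = Counter(independent)
print("correlated :", sorted(correlated_counts.items()))
print("independent:", sorted(independent_counts.items()))
```

Each uniform is marginally Uniform(0, 1), so every draw follows `probs` exactly, and once the shift is shared each draw can be computed in parallel. Because the grid is evenly spaced, the eight correlated draws split 4/2/2 across the three categories, whereas independent draws generally do not track the target proportions this closely.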
