Paper · arXiv · Trust 82 · Primary · Published 3d ago · Live · 2d ago

ECHO: Prune to act, trace to learn with selective turn memory in agentic RL

Long-horizon language agents must repeatedly interact with tools, accumulate evidence, and make decisions under bounded context windows. Existing context-management methods make such rollouts feasible by truncating distant history, folding past turns into summaries, or selecting compact memory states. However, these methods introduce two coupled limitations. First, as the number of turns grows, historical observations are progressively removed or collapsed into compressed states, making it harder for the policy to reuse fine-grained evidence. Second, once the original turns are no longer
