paper | arXiv | Trust 82 · Primary | Published 3d ago | Live · 2d ago
ECHO: Prune to act, trace to learn with selective turn memory in agentic RL
Long-horizon language agents must repeatedly interact with tools, accumulate evidence, and make decisions under bounded context windows. Existing context-management methods make such rollouts feasible by truncating distant history, folding past turns into summaries, or selecting compact memory states. However, these methods introduce two coupled limitations. First, as the number of turns grows, historical observations are progressively removed or collapsed into compressed states, making it harder for the policy to reuse fine-grained evidence. Second, once the original turns are no longer
Lineage graph
Paper → model → repo connections mined from source citations (Tier-1 exact match).
Covers
news: Evaluating long-term memory limits in stateless LLM chatbots — feedback needed [D]
news: I built an open-source memory governance layer for AI assistants - looking for technical feedback [P]
news: Agents collapse "observed", "concluded", and "generated" into one confidence level. Is anyone modeling epistemic status directly instead of just improving retrieval? [D]
Implements
Related across the graph
news: I built an open-source memory governance layer for AI assistants - looking for technical feedback [P]
news: Evaluating long-term memory limits in stateless LLM chatbots — feedback needed [D]
news: Agents collapse "observed", "concluded", and "generated" into one confidence level. Is anyone modeling epistemic status directly instead of just improving retrieval? [D]
repo: Noshkoto/Noshy
