paper · arXiv · Trust 82 · Primary · Published 3d ago · Live · 2d ago

ECHO: Prune to act, trace to learn with selective turn memory in agentic RL

Long-horizon language agents must repeatedly interact with tools, accumulate evidence, and make decisions under bounded context windows. Existing context-management methods make such rollouts feasible by truncating distant history, folding past turns into summaries, or selecting compact memory states. However, these methods introduce two coupled limitations. First, as the number of turns grows, historical observations are progressively removed or collapsed into compressed states, making it harder for the policy to reuse fine-grained evidence. Second, once the original turns are no longer

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Covers

news · Evaluating long-term memory limits in stateless LLM chatbots — feedback needed [D]
news · I built an open-source memory governance layer for AI assistants - looking for technical feedback [P]
news · Agents collapse "observed", "concluded", and "generated" into one confidence level. Is anyone modeling epistemic status directly instead of just improving retrieval? [D]

Implements

repo · Noshkoto/Noshy

Related across the graph

news · I built an open-source memory governance layer for AI assistants - looking for technical feedback [P]
news · Evaluating long-term memory limits in stateless LLM chatbots — feedback needed [D]
news · Agents collapse "observed", "concluded", and "generated" into one confidence level. Is anyone modeling epistemic status directly instead of just improving retrieval? [D]
repo · Noshkoto/Noshy

Topics

cs.AI