paper · arXiv · Trust 82 · Primary · Published 4d ago · Live · 3d ago

Forensic Trajectory Signatures for Agent Memory Poisoning Detection

We discover a behavioral invariant in LLM agents under persistent memory poisoning: in architectures where routing information is retrieved through observable memory-tool invocations, successful attacks require calling memory_recall_fact before email_send_email, a transition that non-exfiltrating sessions rarely exhibit. Under the evaluated architecture, this invariant follows from the attack's information-retrieval dependency rather than being merely an empirical correlation, and suppressing it breaks the attack. A simple rule exploiting this invariant alone achieves AUC = 0.9563. A Random Fo
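The transition rule described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a session is available as an ordered list of tool-call names, and the function name `has_recall_before_send`, the `max_gap` parameter, and the `web_search` tool in the usage lines are hypothetical. Only the two tool names `memory_recall_fact` and `email_send_email` come from the abstract.

```python
from typing import Optional, Sequence

# Tool names taken from the abstract.
RECALL = "memory_recall_fact"
SEND = "email_send_email"


def has_recall_before_send(
    tool_calls: Sequence[str], max_gap: Optional[int] = None
) -> bool:
    """Flag a session whose trajectory contains RECALL followed by SEND.

    max_gap is a hypothetical knob (not from the paper): if set, SEND must
    occur within max_gap calls of the most recent RECALL, so max_gap=1
    requires the two calls to be adjacent.
    """
    last_recall = None  # index of the most recent RECALL seen so far
    for i, name in enumerate(tool_calls):
        if name == RECALL:
            last_recall = i
        elif name == SEND and last_recall is not None:
            if max_gap is None or i - last_recall <= max_gap:
                return True
    return False


# Usage on toy trajectories (illustrative data, not from the paper).
suspicious = [RECALL, SEND]
benign = [SEND, RECALL]
print(has_recall_before_send(suspicious))  # True
print(has_recall_before_send(benign))      # False
```

Whether the rule should require strict adjacency or only ordering depends on how the paper defines the transition; the `max_gap` parameter leaves that choice open rather than asserting one.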

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Implements

repo: Noshkoto/Noshy
repo: las7/memharness

Covers

news: Evaluating long-term memory limits in stateless LLM chatbots — feedback needed [D]
news: I built an open-source memory governance layer for AI assistants - looking for technical feedback [P]

Covers (incoming)

news: Structured memory filtering with metadata in AgentCore Memory

Related across the graph

news: I built an open-source memory governance layer for AI assistants - looking for technical feedback [P]
news: Structured memory filtering with metadata in AgentCore Memory
news: Evaluating long-term memory limits in stateless LLM chatbots — feedback needed [D]
repo: las7/memharness
repo: Noshkoto/Noshy

Topics

cs.LG