newsReddit r/MachineLearningTrust 72 · CommunityPublished 2d agoLive · 2d ago

A system-level approach to prompt injection: separating instruction and data channels in LLM agents [P]

Prompt injection has emerged as one of the most persistent failure modes in tool-using LLM systems, particularly in agentic workflows where models interact with external data sources. Most mitigation strategies focus on input filtering or model-side alignment, but these approaches struggle because the core issue is structural: Approach I explored a system-level mitigation strategy by introducing a middleware laye

Covers

repoagent-tools paperPolicyGuard: A Dialogue-Grounded Sub-Agent Verifier for Policy Adherence in LLM Agents paperA Lifecycle and Application-Stack Survey of Large Language Model Vulnerabilities: Attacks, Risks, Defenses, and Open Problems paperWhen the Database Fails: Prompting LLM Dialogue Agents for Safe Recovery in Task-Oriented Dialogue paperLinguistic Firewall: Geometry as Defense in Multi-Agent Systems Routing

Covers (incoming)

paperAutomating Cause-Effect Specification with Knowledge Graphs and Large Language Models paperA Tutorial on Autonomous Fault-Tolerant Control Using Knowledge-Grounded LLM Agents paperEntity Binding Failures in Tool-Augmented Agents paperWords Speak Louder Than Code: Investigating Cognitive Heuristics in LLM-Based Code Vulnerability Detection paperMESA: Prioritizing Vulnerable Communication Channels for Securing Multi-Agent Systems paperMCP Server Architecture Patterns for LLM-Integrated Applications paperSWE-Doctor: Guiding Software Engineering Agents with Runtime Diagnosis from Multi-Faceted Bug Reproduction Tests paperDistill to Detect: Exposing Stealth Biases in LLMs through Cartridge Distillation repoaallan/vera repopromptfoo/promptfoo repolanggenius/dify repoPipelex/pipelex repolotus-data/lotus

Related across the graph

paperMCP Server Architecture Patterns for LLM-Integrated Applications paperWords Speak Louder Than Code: Investigating Cognitive Heuristics in LLM-Based Code Vulnerability Detection paperA Tutorial on Autonomous Fault-Tolerant Control Using Knowledge-Grounded LLM Agents repolanggenius/dify paperLinguistic Firewall: Geometry as Defense in Multi-Agent Systems Routing repoPipelex/pipelex paperPolicyGuard: A Dialogue-Grounded Sub-Agent Verifier for Policy Adherence in LLM Agents paperMESA: Prioritizing Vulnerable Communication Channels for Securing Multi-Agent Systems paperDistill to Detect: Exposing Stealth Biases in LLMs through Cartridge Distillation paperEntity Binding Failures in Tool-Augmented Agents repolotus-data/lotus paperSWE-Doctor: Guiding Software Engineering Agents with Runtime Diagnosis from Multi-Faceted Bug Reproduction Tests repopromptfoo/promptfoo paperAutomating Cause-Effect Specification with Knowledge Graphs and Large Language Models paperA Lifecycle and Application-Stack Survey of Large Language Model Vulnerabilities: Attacks, Risks, Defenses, and Open Problems paperWhen the Database Fails: Prompting LLM Dialogue Agents for Safe Recovery in Task-Oriented Dialogue repoagent-tools repoaallan/vera