newsReddit r/MachineLearningTrust 72 · CommunityPublished 2d agoLive · 2d ago
A system-level approach to prompt injection: separating instruction and data channels in LLM agents [P]
Prompt injection has emerged as one of the most persistent failure modes in tool-using LLM systems, particularly in agentic workflows where models interact with external data sources. Most mitigation strategies focus on input filtering or model-side alignment, but these approaches struggle because the core issue is structural: Approach I explored a system-level mitigation strategy by introducing a middleware laye
Covers
repoagent-toolspaperPolicyGuard: A Dialogue-Grounded Sub-Agent Verifier for Policy Adherence in LLM AgentspaperA Lifecycle and Application-Stack Survey of Large Language Model Vulnerabilities: Attacks, Risks, Defenses, and Open ProblemspaperWhen the Database Fails: Prompting LLM Dialogue Agents for Safe Recovery in Task-Oriented DialoguepaperLinguistic Firewall: Geometry as Defense in Multi-Agent Systems Routing
Covers (incoming)
paperAutomating Cause-Effect Specification with Knowledge Graphs and Large Language ModelspaperA Tutorial on Autonomous Fault-Tolerant Control Using Knowledge-Grounded LLM AgentspaperEntity Binding Failures in Tool-Augmented AgentspaperWords Speak Louder Than Code: Investigating Cognitive Heuristics in LLM-Based Code Vulnerability DetectionpaperMESA: Prioritizing Vulnerable Communication Channels for Securing Multi-Agent SystemspaperMCP Server Architecture Patterns for LLM-Integrated ApplicationspaperSWE-Doctor: Guiding Software Engineering Agents with Runtime Diagnosis from Multi-Faceted Bug Reproduction TestspaperDistill to Detect: Exposing Stealth Biases in LLMs through Cartridge Distillationrepoaallan/verarepopromptfoo/promptfoorepolanggenius/difyrepoPipelex/pipelexrepolotus-data/lotus
Related across the graph
paperMCP Server Architecture Patterns for LLM-Integrated ApplicationspaperWords Speak Louder Than Code: Investigating Cognitive Heuristics in LLM-Based Code Vulnerability DetectionpaperA Tutorial on Autonomous Fault-Tolerant Control Using Knowledge-Grounded LLM Agentsrepolanggenius/difypaperLinguistic Firewall: Geometry as Defense in Multi-Agent Systems RoutingrepoPipelex/pipelexpaperPolicyGuard: A Dialogue-Grounded Sub-Agent Verifier for Policy Adherence in LLM AgentspaperMESA: Prioritizing Vulnerable Communication Channels for Securing Multi-Agent SystemspaperDistill to Detect: Exposing Stealth Biases in LLMs through Cartridge DistillationpaperEntity Binding Failures in Tool-Augmented Agentsrepolotus-data/lotuspaperSWE-Doctor: Guiding Software Engineering Agents with Runtime Diagnosis from Multi-Faceted Bug Reproduction Testsrepopromptfoo/promptfoopaperAutomating Cause-Effect Specification with Knowledge Graphs and Large Language ModelspaperA Lifecycle and Application-Stack Survey of Large Language Model Vulnerabilities: Attacks, Risks, Defenses, and Open ProblemspaperWhen the Database Fails: Prompting LLM Dialogue Agents for Safe Recovery in Task-Oriented Dialoguerepoagent-toolsrepoaallan/vera
