Ars Technica AI · Published 3d ago
AI browsers can be lulled into a dream world where guardrails no longer apply
Telling an LLM that 2 + 2 = 5 is enough to make it follow forbidden instructions.