paper · arXiv · Trust 82 · Primary · Published yesterday · Live · 5h ago

Distributed Attacks in Persistent-State AI Control

As AI coding agents become more autonomous, they increasingly ship code iteratively, with the codebase persisting across sessions. This persistence creates a new attack surface: a misaligned or prompt-injected agent can distribute attacks across pull requests (PRs) and time its payload for the PR with the best natural cover. To study the resulting dynamics, we introduce Iterative VibeCoding, a setting for AI control, the study of safely deploying capable but potentially untrusted AI. In Iterative VibeCoding, a coding agent builds software over a sequence of PRs in a persistent codebase while p

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

Linked via arXiv author: Josh Hills → Distributed Attacks in Persistent-State AI Control
Linked via arXiv author: Ida Caspary → Distributed Attacks in Persistent-State AI Control
Linked via arXiv author: Asa Cooper Stickland → Distributed Attacks in Persistent-State AI Control

Implements

repo: opensandbox-group/OpenSandbox · repo: Th0rgal/sandboxed.sh · repo: JKHeadley/instar · repo: tenuo-ai/tenuo

Covers

news: Securing the future of AI agents

Authored (incoming)

person: Josh Hills · person: Ida Caspary · person: Asa Cooper Stickland

Implements (incoming)

repo: Tencent/AI-Infra-Guard · repo: OmYarewar/PHANTOM · repo: apache/hertzbeat

Related across the graph

repo: Th0rgal/sandboxed.sh · repo: JKHeadley/instar · repo: OmYarewar/PHANTOM · repo: Tencent/AI-Infra-Guard · person: Asa Cooper Stickland · repo: tenuo-ai/tenuo · news: Securing the future of AI agents · repo: opensandbox-group/OpenSandbox · person: Ida Caspary · person: Josh Hills · repo: apache/hertzbeat

Topics

cs.AI