Distributed Attacks in Persistent-State AI Control
As AI coding agents become more autonomous, they increasingly ship code iteratively, with the codebase persisting across sessions. This persistence creates a new attack surface: a misaligned or prompt-injected agent can distribute attacks across pull requests (PRs) and time its payload for the PR with the best natural cover. To study the resulting dynamics, we introduce Iterative VibeCoding, a setting for AI control, the study of safely deploying capable but potentially untrusted AI. In Iterative VibeCoding, a coding agent builds software over a sequence of PRs in a persistent codebase while p
Lineage graph
Paper → model → repo connections mined from source citations (Tier-1 exact match).
Why these links exist
- Linked via arXiv author Josh Hills → Distributed Attacks in Persistent-State AI Control
- Linked via arXiv author Ida Caspary → Distributed Attacks in Persistent-State AI Control
- Linked via arXiv author Asa Cooper Stickland → Distributed Attacks in Persistent-State AI Control
