Paper · arXiv · Trust 82 · Primary · Published yesterday · Live · 5h ago

Distributed Attacks in Persistent-State AI Control

As AI coding agents become more autonomous, they increasingly ship code iteratively, with the codebase persisting across sessions. This persistence creates a new attack surface: a misaligned or prompt-injected agent can distribute attacks across pull requests (PRs) and time its payload for the PR with the best natural cover. To study the resulting dynamics, we introduce Iterative VibeCoding, a setting for AI control, the study of safely deploying capable but potentially untrusted AI. In Iterative VibeCoding, a coding agent builds software over a sequence of PRs in a persistent codebase while p

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

  • Linked via arXiv author: Josh Hills

    Distributed Attacks in Persistent-State AI Control

  • Linked via arXiv author: Ida Caspary

    Distributed Attacks in Persistent-State AI Control

  • Linked via arXiv author: Asa Cooper Stickland

    Distributed Attacks in Persistent-State AI Control
