paper · arXiv · Trust 82 · Primary · Published 2d ago · Live · 21h ago

Logit-Contribution Scoring Identifies Non-Literal Retrieval Heads

In long-context use, large language models frequently synthesize answers from the meaning of a relevant context span rather than copying that span verbatim. Identifying which attention heads perform this synthesis matters for interpreting long-context model behavior. Yet existing detectors miss these heads by construction: they reward heads whose attended token matches the generated token. This literal-copy criterion captures where a head reads but not what it writes through its output-value (OV) circuit, the very mechanism that carries non-literal retrieval. We introduce Logit-Contribution
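To make the contrast in the abstract concrete, here is a minimal toy sketch in plain Python. The first function follows the literal-copy criterion as described above: a head scores only when its most-attended context token equals the generated token. The second is a generic attention-weighted logit attribution through a head's OV path; it is an illustrative assumption, not the paper's scoring rule, whose definition is cut off in this excerpt. All names (`literal_copy_score`, `ov_logit_contribution`, `ov_logits`) and the toy numbers are hypothetical.

```python
from typing import List, Sequence


def literal_copy_score(
    attn: Sequence[Sequence[float]],
    context_ids: Sequence[int],
    generated_ids: Sequence[int],
) -> float:
    """Fraction of decoding steps where the head's most-attended
    context token is identical to the token that was generated.

    attn[t][j] is the head's attention weight from decoding step t
    to context position j.
    """
    hits = 0
    for t, generated in enumerate(generated_ids):
        top = max(range(len(context_ids)), key=lambda j: attn[t][j])
        hits += int(context_ids[top] == generated)
    return hits / len(generated_ids)


def ov_logit_contribution(
    attn_row: Sequence[float],
    ov_logits: Sequence[Sequence[float]],
    target_id: int,
) -> float:
    """Attention-weighted amount the head adds to the target token's logit.

    ov_logits[j][v] is assumed to be precomputed: the logit the head would
    write for vocabulary item v if it attended only to context position j
    (the position's value vector pushed through the output projection and
    the unembedding). This is an assumption of the sketch.
    """
    return sum(a * row[target_id] for a, row in zip(attn_row, ov_logits))


# Toy setup: three context tokens, one generated token that does NOT
# appear in the context (a paraphrase-style, non-literal answer).
context_ids: List[int] = [5, 6, 7]
generated_ids: List[int] = [9]
attn: List[List[float]] = [[0.05, 0.9, 0.05]]  # head reads position 1

# Hypothetical OV behaviour: attending to position 1 boosts token 9.
vocab_size = 10
ov_logits: List[List[float]] = [[0.0] * vocab_size for _ in context_ids]
ov_logits[1][9] = 2.0

copy_score = literal_copy_score(attn, context_ids, generated_ids)
logit_score = ov_logit_contribution(attn[0], ov_logits, generated_ids[0])
print(copy_score, logit_score)
```

In this toy case the head gets no credit under the literal-copy criterion, because the attended token (6) differs from the generated one (9), yet its OV path contributes positively to the generated token's logit. That is the kind of head the abstract says match-based detectors miss.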

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

  • Linked via arXiv author: Aryo Pradipta Gema

  • Linked via arXiv author: Beatrice Alex

  • Linked via arXiv author: Pasquale Minervini
