Paper · arXiv · Trust 82 · Primary · Published 3d ago · Live · 2d ago

Improving Certified Robustness via Adversarial Distillation

Certified training aims to produce models whose predictions can be formally verified against adversarial perturbations, typically by optimising upper bounds on the worst-case loss over an allowed perturbation set. For neural networks, certified training methods based purely on tight relaxation bounds produce networks that are amenable to certification, but sacrifice standard accuracy. Conversely, adversarial training often yields stronger empirical robustness and standard accuracy, but the resulting models are generally difficult to certify with neural network verifiers. Recently, the literatu

Topics

cs.AI