paper · arXiv · Trust 82 · Primary · Published 4d ago · Live · 3d ago
OLIVE: View-Augmented Latent Prediction with Waveform Reconstruction for Speech SSL
We propose Online Latent prediction with Invariant Views and rEconstruction (OLIVE), a self-supervised speech representation learning framework that jointly optimizes analysis and synthesis objectives. OLIVE combines view-augmented masked latent prediction with waveform reconstruction under a unified objective. Reconstruction constrains early encoder features to retain signal-level information, while masked latent prediction shapes later contextual representations toward invariance for robust downstream performance. We show that these objectives enable representations that support a broad rang
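As a rough illustration of how a masked latent prediction term and a waveform reconstruction term could be combined into one training objective, here is a minimal NumPy sketch. The abstract does not state the loss forms or how the terms are weighted, so everything here is an assumption made for illustration: mean squared error on masked frames, an L1 waveform loss, the `recon_weight` parameter, and all function names are hypothetical and not taken from the paper.

```python
import numpy as np


def masked_latent_prediction_loss(pred, target, mask):
    """Mean squared error computed over masked frames only (assumed loss form).

    pred, target: arrays of shape (T, D), predicted and target latents.
    mask: boolean array of shape (T,), True where the input frame was masked.
    """
    if not mask.any():
        return 0.0
    diff = pred[mask] - target[mask]
    return float(np.mean(diff ** 2))


def waveform_reconstruction_loss(recon, wav):
    """Mean absolute error between reconstructed and original waveform (assumed loss form)."""
    return float(np.mean(np.abs(recon - wav)))


def joint_loss(pred, target, mask, recon, wav, recon_weight=1.0):
    """Weighted sum of the two terms; recon_weight is a hypothetical hyperparameter."""
    prediction_term = masked_latent_prediction_loss(pred, target, mask)
    reconstruction_term = waveform_reconstruction_loss(recon, wav)
    return prediction_term + recon_weight * reconstruction_term


# Toy usage: 4 frames of 2-dimensional latents and an 8-sample waveform.
pred = np.zeros((4, 2))
target = np.ones((4, 2))
mask = np.array([True, False, True, False])
recon = np.zeros(8)
wav = np.full(8, 0.5)
total = joint_loss(pred, target, mask, recon, wav, recon_weight=2.0)
```

In a view-augmented setup, `pred` would plausibly come from an augmented view of the utterance and `target` from a different view of the same utterance, but that pairing is likewise an assumption here rather than a detail confirmed by the abstract.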
Lineage graph
Paper → model → repo connections mined from source citations (Tier-1 exact match).
