paper · arXiv · Trust 82 · Primary · Published 4d ago · Live · 3d ago

FacePlex: Full-Duplex Joint Speech-Facial Motion Generation for Conversational Avatars

Natural face-to-face conversation requires real-time speech generation together with synchronized facial motion. Existing systems only partially address this problem: speech-only full-duplex models can generate speech in real time but do not produce facial motion, while audio-driven facial motion models animate a face from already available audio rather than jointly generating speech and motion online. To bridge this gap, we first formalize full-duplex joint speech-facial motion generation, where speech tokens and facial motion tokens are produced together every step. Building on this formulat
