paper · arXiv · Trust 82 · Primary · Published 4d ago · Live · 3d ago

FacePlex: Full-Duplex Joint Speech-Facial Motion Generation for Conversational Avatars

Natural face-to-face conversation requires real-time speech generation together with synchronized facial motion. Existing systems only partially address this problem: speech-only full-duplex models can generate speech in real time but do not produce facial motion, while audio-driven facial motion models animate a face from already available audio rather than jointly generating speech and motion online. To bridge this gap, we first formalize full-duplex joint speech-facial motion generation, where speech tokens and facial motion tokens are produced together at every step. Building on this formulat

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Covers

news: Fluid, natural voice translation with Gemini 3.5 Live Translate

Has model

model: deepseek-ai/DeepSeek-V3-0324
model: microsoft/phi-2

Covers (incoming)

news: Hugging Face and Cerebras bring Gemma 4 to real-time voice AI

Related across the graph

model: microsoft/phi-2
news: Hugging Face and Cerebras bring Gemma 4 to real-time voice AI
news: Fluid, natural voice translation with Gemini 3.5 Live Translate
model: deepseek-ai/DeepSeek-V3-0324

Topics

cs.CV