Paper · arXiv · Trust 82 · Primary · Published yesterday · Live · 7h ago

Alignment Is All You Need For X-to-4D Generation

Generative diffusion models excel at synthesizing high-quality images, videos, and 3D content under multimodal control. However, arbitrary user-defined modality-to-4D (X-to-4D) generation remains challenging due to the high cost of constructing diverse datasets and the limited scalability of existing methods. This paper presents Align4D, a flexible framework that translates any-modal input into coherent video-3D pairs, using video to guide 4D motion and 3D data to shape 4D geometry. Align4D introduces three key techniques: (1) Object Distance Alignment, which searches Video-Aligned and Multivi

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

  • Linked via arXiv author Qiaowei Miao

    Alignment Is All You Need For X-to-4D Generation

  • Linked via arXiv author Kehan Li

    Alignment Is All You Need For X-to-4D Generation

  • Linked via arXiv author Yawei Luo

    Alignment Is All You Need For X-to-4D Generation

  • Linked via arXiv author Yi Yang

    Alignment Is All You Need For X-to-4D Generation

