Paper · arXiv · Trust 82 · Primary · Published yesterday · Live · 7h ago

Alignment Is All You Need For X-to-4D Generation

Generative diffusion models excel at synthesizing high-quality images, videos, and 3D content under multimodal control. However, arbitrary user-defined modality-to-4D (X-to-4D) generation remains challenging due to the high cost of constructing diverse datasets and the limited scalability of existing methods. This paper presents Align4D, a flexible framework that translates any-modal input into coherent video-3D pairs, using video to guide 4D motion and 3D data to shape 4D geometry. Align4D introduces three key techniques: (1) Object Distance Alignment, which searches Video-Aligned and Multivi

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

Linked via arXiv authors Qiaowei Miao, Kehan Li, Yawei Luo, and Yi Yang → Alignment Is All You Need For X-to-4D Generation

authored (incoming)

Qiaowei Miao · Kehan Li · Yawei Luo · Yi Yang

Topics

cs.CV