paper · arXiv · Trust 82 · Primary · Published yesterday · Live · 7h ago

PointDiT: Pixel-Space Diffusion for Monocular Geometry Estimation

State-of-the-art single-image 3D reconstruction methods often rely on complex hybrid architectures and loss functions, or compress geometry into latent spaces in order to leverage pre-trained latent diffusion models. In this work, we show that such architectural overhead and intricate loss formulations are unnecessary. We introduce a minimalist pixel-space Diffusion Transformer, built on a plain ViT, that operates directly on raw 3D point map patches and is conditioned on image tokens from a pre-trained DINOv3. Unlike existing latent diffusion approaches, we train our diffusion backbone entire

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

Linked via arxiv author: Haofei Xu → PointDiT: Pixel-Space Diffusion for Monocular Geometry Estimation
Linked via arxiv author: Rundi Wu → PointDiT: Pixel-Space Diffusion for Monocular Geometry Estimation
Linked via arxiv author: Philipp Henzler → PointDiT: Pixel-Space Diffusion for Monocular Geometry Estimation
Linked via arxiv author: Nikolai Kalischek → PointDiT: Pixel-Space Diffusion for Monocular Geometry Estimation
Linked via arxiv author: Michael Oechsle → PointDiT: Pixel-Space Diffusion for Monocular Geometry Estimation
Linked via arxiv author: Fabian Manhardt → PointDiT: Pixel-Space Diffusion for Monocular Geometry Estimation
Linked via arxiv author: Marc Pollefeys → PointDiT: Pixel-Space Diffusion for Monocular Geometry Estimation
Linked via arxiv author: Andreas Geiger → PointDiT: Pixel-Space Diffusion for Monocular Geometry Estimation
Linked via arxiv author: Federico Tombari → PointDiT: Pixel-Space Diffusion for Monocular Geometry Estimation
Linked via arxiv author: Michael Niemeyer → PointDiT: Pixel-Space Diffusion for Monocular Geometry Estimation

Has model

model: Diffuse-XL

Authored (incoming)

person: Haofei Xu
person: Rundi Wu
person: Philipp Henzler
person: Nikolai Kalischek
person: Michael Oechsle
person: Fabian Manhardt
person: Marc Pollefeys
person: Andreas Geiger
person: Federico Tombari
person: Michael Niemeyer

Implements (incoming)

repo: isl-org/Open3D

Related across the graph

person: Rundi Wu
repo: isl-org/Open3D
person: Michael Niemeyer
person: Andreas Geiger
person: Fabian Manhardt
person: Michael Oechsle
model: Diffuse-XL
person: Philipp Henzler
person: Marc Pollefeys
person: Federico Tombari
person: Haofei Xu
person: Nikolai Kalischek

Topics

cs.CV