paper · arXiv · Trust 82 · Primary · Published 4d ago · Live · 3d ago

Towards in-the-wild Egocentric 3D Hand-Object Pose Estimation

Estimating accurate 3D hand-object pose from in-the-wild egocentric RGB remains challenging due to severe occlusions and ambiguous contact. Existing learning-based methods often struggle to generalise to in-the-wild scenes and are limited by the scarcity of supervision. We address these issues with two contributions. First, we introduce EPIC-Contact, an in-the-wild egocentric dataset of 2.3K clips (62.3K frames) with dense, bijective 3D hand-object contact correspondences and posed meshes. Second, we propose HOPformer, an end-to-end transformer that jointly predicts bi-manual hand and object p
