paper · arXiv · Trust 82 · Primary · Published 4d ago · Live · 3d ago

Goku: A Million-Scale Universal Dataset and Benchmark for Instruction-Based Video Editing

Existing instruction-based video editing datasets commonly focus on single-task appearance editing and fail to meet the complex creative demands of real-world scenarios. To bridge this gap, we present Goku, a large-scale dataset of 2 million high-quality, instruction-aligned video editing pairs. It is the first to extend task boundaries from basic appearance editing to multi-task and structural manipulations (e.g., precise control of subject movement). To tackle the data synthesis challenges inherent in these complex tasks, we design an efficient data synthesis pipeline that decompose
