paper · arXiv · Trust 82 · Primary · Published yesterday · Live · 19h ago

AnyGroundBench: A Specialized-Domain Benchmark for Video Grounding in Vision-Language Models

Vision-Language Models (VLMs) have demonstrated immense promise in Spatio-Temporal Video Grounding (STVG). However, current evaluation protocols are largely confined to zero-shot assessments on general, daily-life benchmarks. This creates a critical disconnect from real-world applications in specialized fields, where models inevitably encounter rare visual concepts and complex spatio-temporal dynamics. Since exhaustive pre-training across infinite data distributions is infeasible, the ability to adapt to novel domains is essential. To bridge this gap, we introduce AnyGroundBench, a domain-adap

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

  • Linked via arXiv author Rintaro Otsubo
  • Linked via arXiv author Ryo Fujii
  • Linked via arXiv author Reina Ishikawa
  • Linked via arXiv author Taiki Kanaya
  • Linked via arXiv author Kanta Sawafuji
  • Linked via arXiv author Hiroki Kajita
  • Linked via arXiv author Shigeki Sakai
  • Linked via arXiv author Hideo Saito
  • Linked via arXiv author Ryo Hachiuma
