AnyGroundBench: A Specialized-Domain Benchmark for Video Grounding in Vision-Language Models
Vision-Language Models (VLMs) have demonstrated immense promise in Spatio-Temporal Video Grounding (STVG). However, current evaluation protocols are largely confined to zero-shot assessments on general, daily-life benchmarks. This creates a critical disconnect from real-world applications in specialized fields, where models inevitably encounter rare visual concepts and complex spatio-temporal dynamics. Since exhaustive pre-training across infinite data distributions is infeasible, the ability to adapt to novel domains is essential. To bridge this gap, we introduce AnyGroundBench, a domain-adap
Authors (via arXiv):
- Rintaro Otsubo
- Ryo Fujii
- Reina Ishikawa
- Taiki Kanaya
- Kanta Sawafuji
- Hiroki Kajita
- Shigeki Sakai
- Hideo Saito
- Ryo Hachiuma
