person profile

Haiwen Hong

Haiwen Hong — researcher or builder tracked in the Angestrom contributor network.

14 Connections

1 Paper

0 Models

0 Repos

0 News

Papers · 1

Perceive-to-Reason: Decoupling Perception and Reasoning for Fine-Grained Visual Reasoning

Fine-grained visual reasoning remains challenging for vision-language models, especially when small but critical visual cues are buried in high-resolution images. Existing approaches rely on repeated cropping or test-time visual search to introduce local evidence, but they typically do not explicitly distinguish perception from reasoning. In this paper, we propose Perceive-to-Reason (P2R), a unified framework that formulates fine-grained visual reasoning as a two-stage process: the model first localizes question-relevant evidence as a Perceiver, and then answers the question as a Reasoner base