paper · arXiv · Trust 82 · Primary · Published 7d ago · Live · 4d ago

RSICCLLM: A Multimodal Large Language Model for Remote Sensing Image Change Captioning

Remote Sensing Image Change Captioning (RSICC) aims to describe changes between bi-temporal remote sensing images and holds significant research and application value. However, most existing methods rely on conventional deep learning architectures, and the limited model capacity constrains performance. Although large-model post-training techniques have achieved great success in general domains, their direct transfer to RSICC remains challenging due to data scarcity and the need for fine-grained change understanding. To address this, we propose RSICCLLM, the first post-training framework for la

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Has model

model VioletVision-3B

Implements

repo vlm-starter

Related across the graph

model VioletVision-3B · repo vlm-starter

Topics

cs.CV