Abstract: Multimodal (MM) large language models (MLLMs) have achieved remarkable success in image- and region-level remote sensing (RS) image understanding tasks, such as image captioning (IC), visual ...
Semantic segmentation of remote sensing images is pivotal for comprehensive Earth observation, but the demand for interpreting new object categories, coupled with the high expense of manual annotation ...
Abstract: In the realm of computer vision (CV), balancing speed and accuracy remains a significant challenge. Recent efforts have focused on developing lightweight networks that optimize computational ...