Robust Change Captioning in Remote Sensing: SECOND-CC Dataset and MModalCC Framework
Authors/Creators
Description
📢 This dataset is published in IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (JSTARS), 2025.
🔗 IEEE Xplore Link
📄 DOI: 10.1109/JSTARS.2025.3600613
Abstract: Existing remote sensing change captioning (RSICC) methods often fail under challenges such as illumination differences, viewpoint changes, and blur, leading to inaccurate captions, especially in no-change regions. Moreover, differences in spatial resolution between acquisitions and image registration errors also tend to degrade the captions. To address these issues, we introduce SECOND-CC, a novel RSICC dataset featuring high-resolution RGB image pairs, semantic segmentation maps, and diverse real-world scenarios. SECOND-CC contains 6 041 pairs of bitemporal remote sensing images and 30 205 sentences describing the differences between the images. Additionally, we propose MModalCC, a multimodal framework that integrates semantic and visual data using advanced attention mechanisms, including Cross-Modal Cross Attention and Multimodal Gated Cross Attention. We also adapt MModalCC to handle noisy semantic inputs by integrating a Semantic Change Detector, improving its robustness for real-world applications. Comprehensive experiments show that MModalCC outperforms state-of-the-art RSICC methods, including RSICCformer, Chg2Cap, and PSNet, with a +4.6% improvement in BLEU-4 score and a +9.6% improvement in CIDEr score on the SECOND-CC dataset. MModalCC was further validated on the LEVIR-MCI benchmark, where it achieved an average S*_m score of 83.51, significantly outperforming previous state-of-the-art methods. Detailed ablation studies and attention visualizations further demonstrate its effectiveness and its ability to address the challenges of RSICC. We will make our dataset and codebase publicly available at https://github.com/ChangeCapsInRS/SecondCC to facilitate future research.
Files
(2.5 GB)
| Name | MD5 | Size |
|---|---|---|
| SECOND-CC-AUG.zip | ca930ddb819d68a797938b940d1711f1 | 2.5 GB |
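Since the record lists an MD5 checksum for SECOND-CC-AUG.zip, a downloaded copy can be checked against it before unpacking. The following is a minimal sketch using only the Python standard library. The local path and the helper names `md5_of_file` and `verify_archive` are assumptions made for illustration and are not part of the SecondCC codebase.

```python
import hashlib
from pathlib import Path

# MD5 value copied from the file listing of this record.
EXPECTED_MD5 = "ca930ddb819d68a797938b940d1711f1"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat for a multi-gigabyte archive.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the expected hex digest."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical location: assumes the archive sits in the working directory.
    archive = Path("SECOND-CC-AUG.zip")
    if not archive.exists():
        print(f"{archive} not found; download it first")
    elif verify_archive(archive):
        print("checksum OK")
    else:
        print("checksum MISMATCH; the download may be incomplete or corrupted")
```

MD5 is used here only because it is the digest published with the record. It guards against truncated or corrupted downloads, not against deliberate tampering.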
Additional details
Related works
- Is described by: Publication, DOI 10.1109/JSTARS.2025.3600613
Funding
- TUBITAK BILGEM: 122E666
Software
- Repository URL: https://github.com/ChangeCapsInRS/SecondCC
- Programming language: Python
- Development status: Active