Published March 9, 2026 | Version v1
Dataset Open

ARAS400k: A Large-Scale Remote Sensing Dataset Augmented with Synthetic Data for Segmentation and Captioning

  • 1. Middle East Technical University

Description

ARAS400k

We introduce ARAS400k, a fully open-source, comprehensive remote sensing dataset consisting of 100,240 real images and 300,000 synthetic images. Each image is 256×256 pixels and is paired with a semantic segmentation map and 5 descriptive captions. In total, ARAS400k contains 400,240 images and 2,001,200 descriptive captions.

Subset            True Color Images   Segmentation Maps   Captions
Train             80,192              80,192              400,960
Validation        10,024              10,024              50,120
Test              10,024              10,024              50,120
Real Total        100,240             100,240             501,200
Synthetic         300,000             300,000             1,500,000
ARAS400k Total    400,240             400,240             2,001,200
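
The totals in the table follow from the per-subset counts and the 5 captions per image. The sketch below simply re-derives them. The counts are copied from the table, while the names `subsets` and `check_counts` are illustrative and not part of the dataset tooling.

```python
# Counts copied from the table above: (images, masks, captions) per subset.
CAPTIONS_PER_IMAGE = 5
subsets = {
    "train": (80_192, 80_192, 400_960),
    "val": (10_024, 10_024, 50_120),
    "test": (10_024, 10_024, 50_120),
    "synth": (300_000, 300_000, 1_500_000),
}


def check_counts(subsets, captions_per_image=CAPTIONS_PER_IMAGE):
    """Check each row for consistency and return (total_images, total_captions)."""
    for name, (images, masks, captions) in subsets.items():
        # Every image has exactly one mask and a fixed number of captions.
        assert images == masks, name
        assert captions == images * captions_per_image, name
    total_images = sum(row[0] for row in subsets.values())
    total_captions = sum(row[2] for row in subsets.values())
    return total_images, total_captions


total_images, total_captions = check_counts(subsets)
print(total_images, total_captions)  # 400240 2001200
```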

 

ARAS400k folder contents

  • Subset (train, val, test, synth)
    • images
      • 0000.png
      • 0001.png
      • ...
    • masks
      • 0000.png
      • 0001.png
      • ...
    • captions.csv
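
Given this layout, images and masks can be paired by their shared filename. The following is a minimal sketch using only the standard library. It assumes that every file in `images` has a same-named file in `masks`, as the listing suggests. The function name `list_pairs` is hypothetical and does not come from the ARAS400k repository.

```python
from pathlib import Path


def list_pairs(subset_dir):
    """Return sorted (image_path, mask_path) pairs that share a filename.

    `subset_dir` is one of the subset folders (train, val, test, synth).
    """
    subset_dir = Path(subset_dir)
    images = {p.name: p for p in (subset_dir / "images").glob("*.png")}
    masks = {p.name: p for p in (subset_dir / "masks").glob("*.png")}
    # Files present in only one of the two folders indicate a broken download.
    unpaired = sorted(set(images) ^ set(masks))
    if unpaired:
        raise ValueError(f"unpaired files in {subset_dir}: {unpaired[:5]}")
    return [(images[name], masks[name]) for name in sorted(images)]
```

For example, `list_pairs("train")` would return one pair per training image, assuming the archive was extracted into the current directory. Decoding the PNG files themselves is left to an image library of your choice.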

Each captions.csv contains the filename (unique), the split (train, val, test, or synth), the class percentages (Tree, Shrub, Grass, Crop, Built-up, Barren, Water), and one caption per captioning method: hybrid_gemma3-4b, hybrid_qwen3-vl-8b, text_qwen3-4b, vision_gemma3-4b, vision_qwen3-vl-8b. An example from the synth folder:

filename: 0000.png
split: synth

  • Tree, Shrub, Grass, Crop, Built-up, Barren, Water: 0,0,18,82,0,0,0
  • hybrid_gemma3-4b: The image depicts a landscape dominated by cultivated crops (82%), interspersed with smaller areas of grass (18%). The scene exhibits a patterned arrangement of fields, suggesting agricultural land use with some topographic variation.
  • hybrid_qwen3-vl-8b: The scene is dominated by crop fields, covering 82% of the area in a patchwork of geometrically shaped agricultural plots, with grasslands making up the remaining 18% in interspersed, smaller areas.
  • text_qwen3-4b: The scene is primarily agricultural, with 82% of the area covered by crops, indicating a dominant land use of cultivated fields. Grass covers a small portion (18%), suggesting limited natural or pasture land use.
  • vision_gemma3-4b: The image depicts a landscape dominated by agricultural fields, likely cultivated with crops, arranged in a regular grid pattern. A prominent ridge or elevated area runs through the center, suggesting a hilly or undulating terrain.
  • vision_qwen3-vl-8b: This satellite image shows a patchwork of agricultural fields and rural land use, characterized by geometrically shaped plots in varying shades of brown and tan, likely indicating different crops or soil types. A prominent, lighter-toned linear feature—possibly a river, canal, or road—cuts through the landscape, serving as a key geographical element.

filename: 0001.png
split: synth

  • Tree, Shrub, Grass, Crop, Built-up, Barren, Water: 9,2,88,1,0,0,0
  • hybrid_gemma3-4b: The image depicts a landscape dominated by extensive grassland (88%), interspersed with scattered trees (9%) and a small area of crops (1%). A light-colored road and a network of smaller paths cut through the terrain, suggesting a pattern of human access and potentially agricultural activity within the predominantly grassy area.
  • hybrid_qwen3-vl-8b: The scene is dominated by grasslands (88%), interspersed with small patches of tree cover (9%) and shrubs (2%), with sparse, fragmented crop areas (1%) suggesting a rural or semi-arid landscape with limited agricultural activity.
  • text_qwen3-4b: The scene is predominantly grassland, with 88% coverage, indicating a large area of natural or managed grass cover, with minimal tree, shrub, or crop presence.
  • vision_gemma3-4b: The image depicts a hilly landscape with a network of agricultural fields, primarily used for pasture or cultivation, intersected by a road and smaller dirt tracks. A meandering stream and forested areas are also visible, suggesting a varied terrain with a mix of human and natural land use
  • vision_qwen3-vl-8b: This satellite image shows a patchwork of agricultural fields and forested areas, with a prominent river or stream winding through the landscape and a straight road cutting across the terrain, indicating human land use and infrastructure within a rural, possibly hilly, region.
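
Reading these records could look roughly like the sketch below. It relies on assumptions that the listing does not confirm. The CSV header is assumed to use the names `filename` and `split` plus the five method names as caption columns. The class percentages are assumed to be available as a single comma-separated string, as in the example rows. The function names are hypothetical. Check the actual header of captions.csv before relying on any of this.

```python
import csv

# Class order and method names as listed in the dataset description.
CLASS_NAMES = ["Tree", "Shrub", "Grass", "Crop", "Built-up", "Barren", "Water"]
CAPTION_METHODS = [
    "hybrid_gemma3-4b",
    "hybrid_qwen3-vl-8b",
    "text_qwen3-4b",
    "vision_gemma3-4b",
    "vision_qwen3-vl-8b",
]


def parse_class_percentages(field):
    """Turn a string such as '0,0,18,82,0,0,0' into a {class: percent} dict."""
    values = [float(v) for v in field.split(",")]
    if len(values) != len(CLASS_NAMES):
        raise ValueError(f"expected {len(CLASS_NAMES)} values, got {len(values)}")
    return dict(zip(CLASS_NAMES, values))


def read_captions(csv_path):
    """Yield one record per image; the column names are ASSUMED, see above."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            yield {
                "filename": row["filename"],
                "split": row["split"],
                "captions": {method: row[method] for method in CAPTION_METHODS},
            }
```

Using `csv.DictReader` rather than splitting lines on commas matters here because the captions themselves contain commas.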
 
The synthetic data (300k images) was generated using only the train subset (80k images). The validation and test sets were never seen during generation, so the synth subset introduces no leakage.

Code Repository

Python scripts for collecting and preparing the data, and for model training and inference, are available in the code repository: https://github.com/caglarmert/ARAS400k

Files

synth.zip

Files (42.9 GB)

Checksum                                Size
md5:0dc95bfdda44a816ade0d7ea747e4f9c    33.4 GB
md5:01679231ee8e38701f5d3ab7de0b5719    945.6 MB
md5:95cd5caea68c813fd86888f9cd95b627    7.6 GB
md5:76e61fd7557d65ba0596e44c0f92b43f    949.9 MB
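
A downloaded archive can be checked against its listed MD5 checksum before extraction. The following is a minimal sketch with Python's standard `hashlib`. The helper names `md5_of_file` and `matches_checksum` are illustrative. The file is read in chunks so that multi-gigabyte archives do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

A call such as `matches_checksum("path/to/archive.zip", "md5:<hex from the listing>")` returns `True` when the download is intact. Both arguments here are placeholders, and the checksum must be the one listed for that specific file.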

Additional details

Additional titles

Alternative title
Grounding Synthetic Data Generation With Vision and Language Models

Software

Repository URL
https://github.com/caglarmert/ARAS400k
Programming language
Python
Development Status
Active