ARAS400k: A Large-Scale Remote Sensing Dataset Augmented with Synthetic Data for Segmentation and Captioning
Description
ARAS400k
We introduce ARAS400k, a fully open-source, comprehensive remote sensing dataset consisting of 100,240 real images and 300,000 synthetic images. Each image is 256x256 pixels and is paired with a semantic segmentation map and 5 descriptive captions. In total, ARAS400k contains 400,240 images and 2,001,200 descriptive captions.
| Subset | True Color Images | Segmentation Maps | Captions |
| --- | --- | --- | --- |
| Train | 80,192 | 80,192 | 400,960 |
| Validation | 10,024 | 10,024 | 50,120 |
| Test | 10,024 | 10,024 | 50,120 |
| Real Total | 100,240 | 100,240 | 501,200 |
| Synthetic | 300,000 | 300,000 | 1,500,000 |
| ARAS400k Total | 400,240 | 400,240 | 2,001,200 |
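The totals in the table follow from the split sizes and the 5 captions per image. A minimal sketch of that arithmetic, using only the numbers listed above (the variable names are illustrative):

```python
# Split sizes as listed in the table above.
REAL_SPLITS = {"train": 80_192, "val": 10_024, "test": 10_024}
SYNTHETIC_IMAGES = 300_000
CAPTIONS_PER_IMAGE = 5

real_total = sum(REAL_SPLITS.values())
total_images = real_total + SYNTHETIC_IMAGES
total_captions = total_images * CAPTIONS_PER_IMAGE

# Fraction of the real images that falls in each split.
split_fractions = {name: count / real_total for name, count in REAL_SPLITS.items()}

print(real_total, total_images, total_captions)
print({name: round(frac, 3) for name, frac in split_fractions.items()})
```

The real images are therefore divided 80/10/10 between train, validation, and test.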
ARAS400k folder contents
- Subset (train, val, test, synth)
  - images
    - 0000.png
    - 0001.png
    - ...
  - masks
    - 0000.png
    - 0001.png
    - ...
  - captions.csv
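Given the folder layout above, each image can be matched to its segmentation mask. The sketch below assumes that an image and its mask share the same filename (as the listing suggests) and that a subset has been extracted to a local folder. The function name `list_image_mask_pairs` is hypothetical and is not taken from the repository.

```python
from pathlib import Path


def list_image_mask_pairs(subset_dir):
    """Return sorted (image_path, mask_path) pairs that share a filename."""
    subset_dir = Path(subset_dir)
    images_dir = subset_dir / "images"
    masks_dir = subset_dir / "masks"
    pairs = []
    for image_path in sorted(images_dir.glob("*.png")):
        # Assumption: the mask for images/0000.png is masks/0000.png.
        mask_path = masks_dir / image_path.name
        if mask_path.is_file():
            pairs.append((image_path, mask_path))
    return pairs
```

For example, `list_image_mask_pairs("synth")` would return the pairs for the synthetic subset, assuming synth.zip was extracted to a folder named `synth` in the working directory.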
An example from the synth folder is shown below. captions.csv contains the filename (unique), the split (train, val, test, or synth), the class percentages (Tree, Shrub, Grass, Crop, Built-up, Barren, Water), and one caption for each method: hybrid_gemma3-4b, hybrid_qwen3-vl-8b, text_qwen3-4b, vision_gemma3-4b, vision_qwen3-vl-8b.
| filename | split | Tree, Shrub, Grass, Crop, Built-up, Barren, Water | hybrid_gemma3-4b | hybrid_qwen3-vl-8b | text_qwen3-4b | vision_gemma3-4b | vision_qwen3-vl-8b |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0000.png | synth | 0,0,18,82,0,0,0 | The image depicts a landscape dominated by cultivated crops (82%), interspersed with smaller areas of grass (18%). The scene exhibits a patterned arrangement of fields, suggesting agricultural land use with some topographic variation. | The scene is dominated by crop fields, covering 82% of the area in a patchwork of geometrically shaped agricultural plots, with grasslands making up the remaining 18% in interspersed, smaller areas. | The scene is primarily agricultural, with 82% of the area covered by crops, indicating a dominant land use of cultivated fields. Grass covers a small portion (18%), suggesting limited natural or pasture land use. | The image depicts a landscape dominated by agricultural fields, likely cultivated with crops, arranged in a regular grid pattern. A prominent ridge or elevated area runs through the center, suggesting a hilly or undulating terrain. | This satellite image shows a patchwork of agricultural fields and rural land use, characterized by geometrically shaped plots in varying shades of brown and tan, likely indicating different crops or soil types. A prominent, lighter-toned linear feature—possibly a river, canal, or road—cuts through the landscape, serving as a key geographical element. |
| 0001.png | synth | 9,2,88,1,0,0,0 | The image depicts a landscape dominated by extensive grassland (88%), interspersed with scattered trees (9%) and a small area of crops (1%). A light-colored road and a network of smaller paths cut through the terrain, suggesting a pattern of human access and potentially agricultural activity within the predominantly grassy area. | The scene is dominated by grasslands (88%), interspersed with small patches of tree cover (9%) and shrubs (2%), with sparse, fragmented crop areas (1%) suggesting a rural or semi-arid landscape with limited agricultural activity. | The scene is predominantly grassland, with 88% coverage, indicating a large area of natural or managed grass cover, with minimal tree, shrub, or crop presence. | The image depicts a hilly landscape with a network of agricultural fields, primarily used for pasture or cultivation, intersected by a road and smaller dirt tracks. A meandering stream and forested areas are also visible, suggesting a varied terrain with a mix of human and natural land use | This satellite image shows a patchwork of agricultural fields and forested areas, with a prominent river or stream winding through the landscape and a straight road cutting across the terrain, indicating human land use and infrastructure within a rural, possibly hilly, region. |
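The class percentages in the example rows can be mapped back to class names. The sketch below assumes the percentages arrive as a comma-separated string of seven integers in the order given above, as displayed in the table. Whether captions.csv stores them in one column or in seven is not stated here, so reading the file itself is left out. The helper names are hypothetical.

```python
CLASS_NAMES = ["Tree", "Shrub", "Grass", "Crop", "Built-up", "Barren", "Water"]


def parse_class_percentages(cell):
    """Turn a string such as '0,0,18,82,0,0,0' into a {class: percent} dict."""
    # Assumption: integer percentages, one per class, in CLASS_NAMES order.
    values = [int(part) for part in cell.split(",")]
    if len(values) != len(CLASS_NAMES):
        raise ValueError(f"expected {len(CLASS_NAMES)} values, got {len(values)}")
    return dict(zip(CLASS_NAMES, values))


def dominant_class(percentages):
    """Return the class name with the largest share."""
    return max(percentages, key=percentages.get)


# Values taken from the 0000.png row above.
example = parse_class_percentages("0,0,18,82,0,0,0")
print(example["Grass"], example["Crop"], dominant_class(example))  # 18 82 Crop
```

Such a mapping makes it straightforward to filter images by land cover, for instance selecting only those where a given class exceeds a chosen threshold.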
Code Repository
Python scripts for data collection and preparation, model training, and inference are available in the code repository: https://github.com/caglarmert/ARAS400k
Files
synth.zip
Additional details
Additional titles
- Alternative title: Grounding Synthetic Data Generation With Vision and Language Models
Software
- Repository URL: https://github.com/caglarmert/ARAS400k
- Programming language: Python
- Development Status: Active