Published December 5, 2024 | Version v1
Dataset Open

AI4AGRI Sentinel-2 Brasov area 2020-2024 multi-spectral dataset for crop monitoring and identification

Description

The database contains all available Sentinel-2 MSI images from 2020 to 2024 over an area north of the city of Brasov, Romania, together with 32 x 32 pixel multi-spectral patches for a crop identification task using a learning model. We propose using the dataset for two concrete problems: (1) crop identification with temporal generalization of the learning model (i.e. training on 2020-2023 data and testing on 2024 data) and (2) early crop identification (i.e. crop identification during the vegetation season, with the data split into training and testing sets at 20th May, an arbitrarily chosen date in the middle of the season).
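As an illustration of the two proposed splits, the following minimal Python sketch partitions a list of acquisition dates by year (problem 1) and around 20th May (problem 2). The function names and the sample dates are made up for the example. Treating 20th May itself as part of the earlier group is an assumption, and the sketch deliberately leaves open which side of the cutoff serves as training data.

```python
from datetime import date

TRAIN_YEARS = range(2020, 2024)  # 2020-2023 inclusive
TEST_YEAR = 2024


def temporal_split(acquisition_dates):
    """Problem 1: dates from 2020-2023 go to training, dates from 2024 to testing."""
    train = [d for d in acquisition_dates if d.year in TRAIN_YEARS]
    test = [d for d in acquisition_dates if d.year == TEST_YEAR]
    return train, test


def early_season_split(acquisition_dates, month=5, day=20):
    """Problem 2: partition the dates of every year around the cutoff day.

    The cutoff day itself is placed in the first group (an assumption).
    """
    on_or_before = [d for d in acquisition_dates if (d.month, d.day) <= (month, day)]
    after = [d for d in acquisition_dates if (d.month, d.day) > (month, day)]
    return on_or_before, after


# Made-up acquisition dates, for illustration only.
dates = [date(2020, 4, 12), date(2023, 7, 1), date(2024, 5, 20), date(2024, 6, 3)]

train, test = temporal_split(dates)
early, late = early_season_split(dates)
print(len(train), len(test), len(early), len(late))
```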

The directory Images_Sentinel2_GeoTIFF contains all the Sentinel-2 images over the area of interest. All images follow a naming convention and are saved in GeoTIFF format. Each image has dimensions 800 x 450 x 12: a height of 800 pixels, a width of 450 pixels and 12 spectral bands, with a spatial resolution of 10 x 10 meters. Each image is saved in the folder for the year in which it was acquired by the Sentinel-2 MSI. These folders are named Sentinel2_yyyy, where yyyy is the four-digit year.
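The stated dimensions and resolution imply the ground footprint of each image. The short Python calculation below works this out. The constant names are chosen for the example, and the result assumes that every pixel covers exactly 10 x 10 meters.

```python
HEIGHT_PX, WIDTH_PX, BANDS = 800, 450, 12  # image dimensions stated above
PIXEL_SIZE_M = 10                          # spatial resolution in meters

height_km = HEIGHT_PX * PIXEL_SIZE_M / 1000      # extent along the image height
width_km = WIDTH_PX * PIXEL_SIZE_M / 1000        # extent along the image width
area_km2 = height_km * width_km                  # ground area covered by one image
values_per_image = HEIGHT_PX * WIDTH_PX * BANDS  # number of samples per image

print(height_km, width_km, area_km2, values_per_image)
# prints: 8.0 4.5 36.0 4320000
```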

In addition to the multispectral images, the database contains the ground truth of agricultural crops as RGB masks in PNG format, as well as masks with the labels corresponding to each agricultural crop in both PNG and MAT formats. These are located in the Masks_and_legend directory. This directory also contains the legend for the masks in PDF format and the five subdirectories where the masks for each year are stored. The subdirectories are named Sentinel2_yyyy, where yyyy is the four-digit year.
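The per-year folder layout described so far can be enumerated as in the sketch below. The name dataset_root is a hypothetical placeholder for wherever the archive is extracted, and the sketch assumes that the Sentinel2_yyyy folders sit directly inside Images_Sentinel2_GeoTIFF and Masks_and_legend.

```python
from pathlib import PurePosixPath

YEARS = range(2020, 2025)  # 2020-2024 inclusive, i.e. five yearly folders


def year_dirs(root, parent):
    """Build the expected Sentinel2_yyyy folder paths under a parent directory."""
    return [PurePosixPath(root) / parent / f"Sentinel2_{year}" for year in YEARS]


# "dataset_root" is a hypothetical extraction location, not a name from the dataset.
image_dirs = year_dirs("dataset_root", "Images_Sentinel2_GeoTIFF")
mask_dirs = year_dirs("dataset_root", "Masks_and_legend")

for path in image_dirs + mask_dirs:
    print(path)
```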

From the aforementioned Sentinel-2 images, we created multi-spectral patches with dimensions 32 x 32 x 12, which are stored under the directory 32x32_patches. This directory contains two subdirectories: 32x32_multispectral_patches and 32x32_RGB_patches (the latter purely for visualization). The first subdirectory contains the multispectral data used for solving the two problems, with the subdirectory problem1 corresponding to the classification of agricultural crops and problem2 corresponding to the early identification of agricultural crops. The data for each problem is divided into training and test sets. The patches in these directories are saved in both GeoTIFF and MAT formats, as indicated by the names of the subdirectories where they are stored: patches_tiff and patches_mat. The directory 32x32_RGB_patches contains the patches generated from the RGB masks, as a ground truth for the labels at pixel level.
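This description does not state how the patches were extracted from the 800 x 450 images. Purely as an illustration, the sketch below assumes a non-overlapping 32 x 32 grid that discards incomplete patches at the image border. The function name tile_origins is made up for the example, and the patch count in the actual dataset may differ.

```python
def tile_origins(height, width, patch):
    """Top-left (row, col) corners of non-overlapping patch x patch tiles.

    Tiles that would extend past the image border are skipped (an assumption).
    """
    return [
        (row, col)
        for row in range(0, height - patch + 1, patch)
        for col in range(0, width - patch + 1, patch)
    ]


origins = tile_origins(800, 450, 32)
print(len(origins))  # prints: 350 (25 rows x 14 columns of tiles)
print(origins[-1])   # prints: (768, 416)
```

Under this assumption the 800-pixel height divides evenly into 25 tiles, while the 450-pixel width fits 14 tiles and leaves 2 pixel columns unused.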

Files

AI4AGRI Sentinel-2 Brasov area 2020-2024 dataset.zip

Files (1.1 GB)

Additional details

Additional titles

Alternative title
muDACIA5

Funding

AI4AGRI – Romanian Excellence Center on Artificial Intelligence in Earth Observation Data for Agriculture 101079136
European Commission