Published February 3, 2022 | Version 1.2
Dataset Open

TimeSpec4LULC: A Smart-Global Dataset of Multi-Spectral Time Series of MODIS Terra-Aqua from 2000 to 2021 for Training Machine Learning models to perform LULC Mapping

  • 1. Dept. of Computer Science and Artificial Intelligence, Andalusian Research Institute in Data Science and Computational Intelligence, DaSCI, University of Granada, 18071, Granada, Spain
  • 2. Dept. of Botany, Faculty of Science, University of Granada, 18071 Granada, Spain. iEcolab, Inter-University Institute for Earth System Research, University of Granada, 18006 Granada, Spain
  • 3. Multidisciplinary Institute for Environment Studies "Ramón Margalef", University of Alicante, 03690, Spain

Contributors

  • 1. Dept. of Computer Science and Artificial Intelligence, Andalusian Research Institute in Data Science and Computational Intelligence, DaSCI, University of Granada, 18071, Granada, Spain
  • 2. ENSIAS, Mohammed V University, Rabat, 10170, Morocco

Description

TimeSpec4LULC is a smart open-source global dataset of multi-spectral time series for 29 Land Use and Land Cover (LULC) classes, ready to train machine learning models. It was built from the seven spectral bands of the MODIS sensors at 500 m resolution from 2000 to 2021 (262 monthly observations in each time series). It was then annotated using spatial-temporal agreement across the 15 global LULC products available in Google Earth Engine (GEE).

TimeSpec4LULC contains two datasets: the original dataset, distributed over 6,076,531 pixels, and a balanced subset of the original dataset, distributed over 29,000 pixels.

The original dataset contains 30 folders: one named "Metadata" and 29 folders corresponding to the 29 LULC classes. The "Metadata" folder holds 29 CSV files, one per LULC class, describing the metadata of that class. The remaining 29 folders contain the time series data for the 29 LULC classes. Each class folder holds 262 CSV files, one for each of the 262 months. Inside each CSV file, we provide the seven spectral band values and the coordinates of all pixels belonging to that LULC class.
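Because each class folder stores one CSV file per month, a per-pixel time series has to be assembled by joining the monthly files on the pixel coordinates. The following is a minimal sketch of that join, not the authors' loader. The column names ("lon", "lat", "b1" to "b7") and the use of empty cells for missing values are assumptions made for illustration; the real headers should be checked against the downloaded files.

```python
import csv
import io

# Hypothetical column names (assumption); check the real CSV headers before use.
COORD_COLS = ("lon", "lat")
BAND_COLS = tuple(f"b{i}" for i in range(1, 8))  # seven spectral bands


def parse_value(text):
    """Return a float, or None when the cell is empty (assumed missing value)."""
    text = text.strip()
    return float(text) if text else None


def stack_months(monthly_files):
    """Join monthly tables into per-pixel time series.

    monthly_files: open text files, one per month, in chronological order.
    Returns {(lon, lat): [[7 band values], ...]} with one inner list per
    month. A pixel absent from a month gets seven None values for it.
    """
    tables = []
    for handle in monthly_files:
        rows = {}
        for row in csv.DictReader(handle):
            # Coordinates are kept as strings so that the join key is exact.
            key = tuple(row[c] for c in COORD_COLS)
            rows[key] = [parse_value(row[b]) for b in BAND_COLS]
        tables.append(rows)

    all_pixels = set().union(*tables)
    return {
        pixel: [
            table[pixel] if pixel in table else [None] * len(BAND_COLS)
            for table in tables
        ]
        for pixel in all_pixels
    }


# Two tiny in-memory "months" standing in for real files.
header = "lon,lat,b1,b2,b3,b4,b5,b6,b7\n"
month_1 = io.StringIO(header + "1.0,2.0,1,2,3,4,5,6,7\n3.0,4.0,1,1,1,1,1,1,1\n")
month_2 = io.StringIO(header + "1.0,2.0,7,6,5,4,3,2,\n")
series = stack_months([month_1, month_2])
```

With the real data, the list passed to `stack_months` would hold the 262 monthly files of one class folder, opened in chronological order.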

The balanced subset of the original dataset contains the metadata and the time series data for 1,000 globally representative pixels per class. It holds 29 JSON files, each named after one of the 29 LULC classes.
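Since each JSON file of the balanced subset is named after its LULC class, integer training labels can be derived from the file names alone. The sketch below assumes the 29 files sit in a single local directory; the directory name in the usage comment is hypothetical, and the internal layout of each JSON file is not touched here.

```python
from pathlib import Path


def build_label_index(subset_dir):
    """Map each class name (JSON file stem) to an integer label.

    Names are sorted so that the mapping is stable across runs.
    """
    class_names = sorted(p.stem for p in Path(subset_dir).glob("*.json"))
    return {name: idx for idx, name in enumerate(class_names)}


# Usage (hypothetical local path):
#   labels = build_label_index("balanced_subset")
#   assert len(labels) == 29  # one entry per LULC class
```

Sorting by name is only one possible convention; any fixed ordering works as long as it is reused at training and inference time.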

The features of the dataset are:

- ".geo": the geometry and coordinates (longitude and latitude) of the pixel center.

- "ADM0_Code": the GAUL country code.

- "ADM1_Code": the GAUL first-level administrative unit code.

- "GHM_Index": the average of the global human modification index.

- "Products_Agreement_Percentage": the agreement percentage over the 15 global LULC products available in GEE.

- "Temporal_Availability_Percentage": the percentage of non-missing values in each band.

- "Pixel_TS": the time series values of the seven spectral bands.
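The two percentage features above lend themselves to simple quality filtering. As a hedged illustration, the sketch below recomputes a temporal availability percentage for a single band and filters pixels by product agreement. It assumes that missing observations appear as None after loading, that "Products_Agreement_Percentage" is a number between 0 and 100, and that each pixel is available as a dictionary keyed by the feature names; the threshold of 90 is arbitrary.

```python
def temporal_availability(series):
    """Percentage of non-missing (non-None) values in one band's time series."""
    if not series:
        return 0.0
    present = sum(value is not None for value in series)
    return 100.0 * present / len(series)


def keep_pixel(record, min_agreement=90.0):
    """True when the LULC products agree on this pixel at or above the threshold."""
    return record["Products_Agreement_Percentage"] >= min_agreement


# Toy values for illustration only; they are not taken from the dataset.
band = [0.12, None, 0.15, 0.11]
availability = temporal_availability(band)  # 3 of 4 values present

pixels = [
    {"Products_Agreement_Percentage": 100.0},
    {"Products_Agreement_Percentage": 80.0},
]
reliable = [p for p in pixels if keep_pixel(p)]
```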

Notes

This research has been supported by DETECTOR (A-RNM-256-UGR18 Universidad de Granada/FEDER), LifeWatch SmartEcomountains (LifeWatch-2019-10-UGR-01 Ministerio de Ciencia e Innovación/Universidad de Granada/FEDER), BBVA DeepSCOP (Ayudas Fundación BBVA a Equipos de Investigación Científica 2018), Ramón y Cajal Programme (RYC-2015-18136), DeepL-ISCO (A-TIC-458-UGR18 Ministerio de Ciencia e Innovación/FEDER), SMART-DASCI (TIN2017-89517-P Ministerio de Ciencia e Innovación/Universidad de Granada/FEDER), BigDDL-CET (P18-FR-4961 Ministerio de Ciencia e Innovación/Universidad de Granada/FEDER), RESISTE (P18-RT-1927 Consejería de Economía, Conocimiento, y Universidad from the Junta de Andalucía/FEDER), and Ecopotential (641762 European Commission).

Files

Data_structure_description.zip

Files (59.7 GB)

Checksum Size
md5:d80543bf418e9197342d4dee16bfaf34 717.9 kB
md5:db90e19645c25b17be3a57ce09cbd167 134.7 MB
md5:6cebc00412f260a05b826f539d54c77e 59.5 GB
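The MD5 values listed above can be used to check a download for corruption. A minimal sketch using Python's standard `hashlib` module follows; the file is read in chunks because the largest archive is 59.5 GB and should not be loaded into memory at once. The local file path in the usage comment is hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:<hex>' or plain hex."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Usage (hypothetical local file name):
#   matches_listing("download.zip", "md5:d80543bf418e9197342d4dee16bfaf34")
```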

Additional details

Funding

European Commission
ECOPOTENTIAL: Improving Future Ecosystem Benefits Through Earth Observations (grant 641762)

References

  • Khaldi, R., Alcaraz-Segura, D., Guirado, E., Benhammou, Y., El Afia, A., Herrera, F., and Tabik, S. (2021). TimeSpec4LULC: A Global Multispectral Time Series Database for Training LULC Mapping Models with Machine Learning. Earth Syst. Sci. Data Discuss. [preprint].