A dataset of Earth Observation Data for Lithological Mapping using Machine Learning
Authors/Creators
- 1. IIT/NCSR Demokritos, Greece
- 2. School of Rural, Surveying and Geo-Informatics Engineering/NTUA, Greece
Description
Dataset Information
Machine Learning (ML) algorithms have contributed successfully to the creation of automated methods for recognizing patterns in high-dimensional data. Remote sensing data cover wide geographical areas and can be used to address the demand for various in-situ data. Lithological mapping using remotely sensed data is one of the most challenging applications of ML algorithms. In the framework of the “AI for Geoapplications” project, ML and especially Deep Learning (DL) methodologies are investigated for identifying and characterizing lithology from remote sensing data in various pilot areas in Greece. To train and test the various ML algorithms, a dataset was created consisting of 30 ROIs, selected mainly from low-vegetation areas, that cover 2% of the total area of Greece.
Dataset Preprocessing
Dataset preprocessing was executed using a combination of SNAP, QGIS and ENVI tools.
Preprocessing steps:
Defining areas with the following properties:
- Zero cloud and snow coverage
- No water bodies
- Minimum vegetation
For the ASTER images:
- Subset on defined areas
- Mosaic images when needed
- Digitising clouds
For the labels:
- The soil map was obtained from YPEN (https://ypen.gov.gr/)
- Subset on defined areas
- All categories are represented in good proportions
- Clip label files with digitised clouds
- Rasterize
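As an illustration of how digitised clouds end up in the label data, the sketch below applies a boolean cloud mask to an already rasterised label grid. The authors perform the clip on the label files in their GIS tools before rasterising, so this array-based version is an assumption made for the example, as are the function name `apply_cloud_mask` and the toy arrays. The value -999 is the cloud coverage code used in this dataset's label coding.

```python
import numpy as np

CLOUD_CODE = -999  # label code for cloud coverage in this dataset


def apply_cloud_mask(labels, cloud_mask):
    """Return a copy of `labels` with cloudy pixels set to CLOUD_CODE."""
    labels = np.asarray(labels)
    cloud_mask = np.asarray(cloud_mask, dtype=bool)
    if labels.shape != cloud_mask.shape:
        raise ValueError("labels and cloud_mask must have the same shape")
    return np.where(cloud_mask, CLOUD_CODE, labels)


# Toy 2x3 label raster and a mask marking one cloudy pixel (hypothetical data).
toy_labels = np.array([[0, 2, 2], [3, 3, 9]])
toy_clouds = np.array([[False, True, False], [False, False, False]])
masked_labels = apply_cloud_mask(toy_labels, toy_clouds)
```

The input array is left unchanged, so the unmasked labels remain available for comparison.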
The labels comprise eighteen categories across the twenty-eight areas for which data were collected. The labels of the dataset are coded as follows:
| Category | Code |
|---|---|
| Alluvial deposits | 0 |
| Limestone colluvial deposits | 1 |
| Limestones | 2 |
| Schists | 3 |
| Quaternary sediments | 4 |
| Gneiss | 5 |
| Slope fan debris | 6 |
| Mixed flysch | 7 |
| Flysch shale and cherts | 8 |
| Dolomites | 9 |
| Granite | 10 |
| Sandstone flysch | 11 |
| Flysch colluvial deposits | 12 |
| Peridotite and Gabbro | 13 |
| River bed deposits | 14 |
| Gneiss colluvial deposits | 15 |
| Not available | -100 |
| Cloud coverage | -999 |
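For use in code, the label coding listed above can be written as a lookup table. The sketch below also computes per-category pixel fractions while ignoring the two non-lithology codes (-100 and -999). The names `LABEL_NAMES`, `INVALID_CODES` and `class_distribution`, and the toy array, are hypothetical and not taken from the authors' repository.

```python
import numpy as np

# Label codes as listed in the dataset description.
LABEL_NAMES = {
    0: "Alluvial deposits",
    1: "Limestone colluvial deposits",
    2: "Limestones",
    3: "Schists",
    4: "Quaternary sediments",
    5: "Gneiss",
    6: "Slope fan debris",
    7: "Mixed flysch",
    8: "Flysch shale and cherts",
    9: "Dolomites",
    10: "Granite",
    11: "Sandstone flysch",
    12: "Flysch colluvial deposits",
    13: "Peridotite and Gabbro",
    14: "River bed deposits",
    15: "Gneiss colluvial deposits",
    -100: "Not available",
    -999: "Cloud coverage",
}
INVALID_CODES = (-100, -999)  # pixels without a usable lithology label


def class_distribution(labels):
    """Return {category name: pixel fraction}, computed over valid pixels only."""
    labels = np.asarray(labels)
    valid = labels[~np.isin(labels, INVALID_CODES)]
    codes, counts = np.unique(valid, return_counts=True)
    total = counts.sum()
    return {LABEL_NAMES[int(c)]: float(n) / float(total)
            for c, n in zip(codes, counts)}


# Toy label raster with one "not available" and one cloudy pixel.
toy = np.array([[0, 0, 2], [2, -100, -999]])
fractions = class_distribution(toy)
```

A distribution like this is one way to check how evenly the categories are represented in a given area before training.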
The following table lists the available areas and the categories that each contains: Lithology_Dataset
The Sentinel-2 images were processed as follows:
- Resampling to 10 m
- Subset on defined areas
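Sentinel-2 bands are delivered at native resolutions of 10 m, 20 m and 60 m, so resampling to 10 m places every band on one grid. The sketch below shows the simplest form of that idea: nearest-neighbour upsampling of a 20 m band by an integer factor of 2, using NumPy. It is only an illustration; the authors used SNAP, whose resampling settings are not stated here, and the function name `upsample_nearest` is hypothetical.

```python
import numpy as np


def upsample_nearest(band, factor):
    """Nearest-neighbour upsampling: repeat each pixel `factor` times per axis."""
    band = np.asarray(band)
    if band.ndim != 2 or factor < 1:
        raise ValueError("expected a 2-D band and a factor >= 1")
    return np.repeat(np.repeat(band, factor, axis=0), factor, axis=1)


# A toy 2x2 "20 m" band becomes a 4x4 "10 m" band.
band_20m = np.array([[1, 2], [3, 4]])
band_10m = upsample_nearest(band_20m, 2)
```

Nearest-neighbour keeps the original pixel values unchanged; interpolating methods would produce smoother but modified values.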
The Sentinel-2 map contains: Sentinel 2 false colour composite 11/8/4 with OSM background
The final step is the collocation of the above into a datacube, i.e. a multidimensional array with 25 bands (the datacube dimensions differ for each area), using the ASTER image as the base (15 m spatial resolution).
- Bands 1-14: ASTER
- Bands 15-24: S2
- Band 25: Label
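The band layout above can be turned into per-pixel training samples roughly as follows. This is a minimal sketch that assumes a band-first array of shape (25, height, width); the on-disk format and axis order are not stated in this description, so check them against the linked repository. The function names `split_datacube` and `to_samples` and the random toy cube are hypothetical.

```python
import numpy as np

INVALID_CODES = (-100, -999)  # "Not available" and cloud coverage


def split_datacube(cube):
    """Split a (25, H, W) cube into ASTER, Sentinel-2 and label arrays."""
    cube = np.asarray(cube)
    if cube.ndim != 3 or cube.shape[0] != 25:
        raise ValueError("expected an array of shape (25, height, width)")
    aster = cube[0:14]    # bands 1-14
    s2 = cube[14:24]      # bands 15-24
    labels = cube[24]     # band 25
    return aster, s2, labels


def to_samples(cube):
    """Flatten valid pixels into X (n_pixels, 24) and y (n_pixels,)."""
    aster, s2, labels = split_datacube(cube)
    features = np.concatenate([aster, s2], axis=0)  # (24, H, W)
    valid = ~np.isin(labels, INVALID_CODES)         # (H, W) boolean mask
    X = features[:, valid].T                        # (n_valid, 24)
    y = labels[valid].astype(int)
    return X, y


# Toy cube: random features, label 2 everywhere except two invalid pixels.
rng = np.random.default_rng(0)
toy_cube = rng.random((25, 4, 5))
toy_cube[24] = 2
toy_cube[24, 0, 0] = -100
toy_cube[24, 1, 1] = -999
X, y = to_samples(toy_cube)
```

Pixels coded -100 or -999 are dropped so that a classifier is trained only on pixels with a lithology label.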
The code for preprocessing the dataset for use with machine learning algorithms is available at the following link:
https://github.com/georgegiannop/Lithology
Citation
If you use this dataset in your work, please cite our paper:
Vernikos, I., Giannopoulos, G., Christopoulou, A., Begaj, A., Stefouli, M., Bratsolis, E., and Charou, E.: A dataset of Earth Observation Data for Lithological Mapping using Machine Learning, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-17570, https://doi.org/10.5194/egusphere-egu23-17570, 2023.
Files
(908.0 MB)
| File (MD5 checksum) | Size |
|---|---|
| md5:f87d862cc41187ca08906b1ff5f5c21d | 11.1 MB |
| md5:1329499e81af4fa174c42cac1bb76f64 | 326.0 kB |
| md5:6ee29276d3ce60b9acf98a4f11da807f | 294.7 kB |
| md5:7b46f8e82c955d3ba6bd6f547f8da8a0 | 896.3 MB |
| md5:584f591f040a6d1b50504837dceaa830 | 9.2 kB |
Additional details
Related works
- Is described by
- Conference paper: 10.5194/egusphere-egu23-17570 (DOI)