Published May 16, 2023 | Version 1.0
Dataset Open

Sentinel2 RGB chips over Colombia (NE) with JRC GHSL Population Density 2015 for Learning with Label Proportions

  • 1. Universidad de Antioquia
  • 2. Universidad Nacional de Colombia

Description

The region of interest (ROI) comprises the east-northeast region of Colombia, covering
parts of Santander, Norte de Santander, Boyacá, Bolívar, Antioquia and Cundinamarca.

Communes follow the administrative division defined by DANE (Departamento Administrativo
Nacional de Estadística), listed under "municipios" in the MGN2021 at
https://geoportal.dane.gov.co/geovisores/territorio/mgn-marco-geoestadistico-nacional/

images: Sentinel2 RGB from 2020-01-01 to 2020-12-31
        pixels flagged as cloudy in the QA60 band during the observation period were filtered out,
        following the example given in the GEE dataset info page, and the median of the remaining
        pixels was taken

        see https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR_HARMONIZED

        see also https://github.com/rramosp/geetiles/blob/main/geetiles/defs/sentinel2rgbmedian2020.py
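
The cloud filtering and median step described above can be sketched for a single pixel in
plain Python, without Earth Engine. This assumes the QA60 convention documented on the GEE
page (bit 10 for opaque clouds, bit 11 for cirrus). The function names and sample values are
hypothetical, and this is an illustration, not the code used to build the dataset.

```python
from statistics import median

CLOUD_BIT = 1 << 10   # QA60 bit 10: opaque clouds
CIRRUS_BIT = 1 << 11  # QA60 bit 11: cirrus clouds


def is_clear(qa60):
    """True when neither the cloud bit nor the cirrus bit is set."""
    return (qa60 & CLOUD_BIT) == 0 and (qa60 & CIRRUS_BIT) == 0


def cloud_free_median(values, qa60_flags):
    """Median of one pixel's values over time, skipping cloudy observations.

    Returns None when every observation is flagged as cloudy.
    """
    clear = [v for v, qa in zip(values, qa60_flags) if is_clear(qa)]
    return median(clear) if clear else None


# One pixel, one band, four acquisitions (made-up numbers).
red = [1200, 9000, 1100, 1300]
qa = [0, 1 << 10, 0, 1 << 11]  # second is cloudy, fourth is cirrus
result = cloud_free_median(red, qa)
```

Only the first and third acquisitions survive the mask here, so the median is taken over
those two values.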

labels: Global Human Settlement Layers, Population Grid 2015

        labels range from 0 to 31, with the following meaning:
           label value     original value in GEE dataset
           0               0
           1               1-10
           2               11-20
           3               21-30
           ...
           31              >=291 

        see https://developers.google.com/earth-engine/datasets/catalog/JRC_GHSL_P2016_POP_GPW_GLOBE_V1

        see also https://github.com/rramosp/geetiles/blob/main/geetiles/defs/humanpop2015.py

_aschips.geojson    the image chips geometries along with label proportions
                    for easy visualization with QGIS, GeoPandas, etc.

_communes.geojson   the communes geometries with their label proportions
                    for easy visualization with QGIS, GeoPandas, etc.

splits.csv          contains two train/test/val splits of the image chips:
                    - one using geographical bands at 45° angles in the nw-se direction
                    - the same split reorganized so that all chips within the same
                      commune fall within the same split.
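
The property of the second split (no commune shared between train, test and val) can be
checked with a sketch like the following. The column layout of splits.csv is not described
here, so the (chip id, commune id, split) tuples and the function name are hypothetical.

```python
from collections import defaultdict


def communes_spanning_splits(rows):
    """Return the communes whose chips appear in more than one split.

    rows: iterable of (chip_id, commune_id, split) tuples (hypothetical layout).
    An empty result means the split is commune-consistent.
    """
    seen = defaultdict(set)
    for _chip_id, commune_id, split in rows:
        seen[commune_id].add(split)
    return {c: sorted(s) for c, s in seen.items() if len(s) > 1}


# Made-up example: commune "B" leaks across test and val.
rows = [
    ("chip1", "A", "train"),
    ("chip2", "A", "train"),
    ("chip3", "B", "test"),
    ("chip4", "B", "val"),
]
leaky = communes_spanning_splits(rows)
```

Running the same check on the band-based split would be expected to report communes that
straddle a band boundary.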

data/               a pickle file for each image chip containing a dict with
                    - the 100x100 RGB Sentinel2 chip image
                    - the 100x100 chip level labels
                    - the label proportions of the chip
                    - the aggregated label proportions of the commune the chip belongs to
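
As a sketch of how chip-level label proportions relate to the chip-level labels, the snippet
below counts the share of pixels in each of the 32 classes (0 to 31). The helper names are
hypothetical and the dict keys inside each pickle are not listed here, so inspect them after
loading. Only unpickle files from a source you trust; the stored arrays may also require
NumPy to be installed, which is an assumption.

```python
import pickle
from collections import Counter

N_CLASSES = 32  # labels range from 0 to 31


def load_chip(path):
    """Load one chip dict from a pickle file (trusted files only)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def label_proportions(label_grid, n_classes=N_CLASSES):
    """Fraction of pixels per class in a 2D grid of integer labels."""
    counts = Counter(v for row in label_grid for v in row)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("label grid is empty")
    return [counts.get(k, 0) / total for k in range(n_classes)]


# Tiny made-up 2x3 grid instead of a real 100x100 chip.
grid = [[0, 0, 1],
        [0, 31, 1]]
props = label_proportions(grid)
```

The result is a list of 32 fractions that sum to 1, one per label value.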

 

Files

colombia-ne_sentinel2-rgb-median-2020_humanpop2015.zip (1.9 GB)