There is a newer version of the record available.

Published December 19, 2023 | Version 0.1
Software Open

spatial.IO - An integrated cloud-ready geospatial data management system

Description

A Spatial Data Infrastructure for netCDF files and more

A Spatial Data Infrastructure (SDI) is a combination of policies, standards and software to manage and deliver geospatial data (Simmons, 2018). A good SDI follows policies and standards that are widely accepted in its communities (e.g. FAIR, OGC). Although an SDI often provides new functionality, its main advantage is that it connects different tools and software products into (mostly) automated workflows. This reduces manual processing (and therefore errors) and yields standardized data products, because the workflows are fixed. For this to work reliably, extensive documentation and user instructions are key. An SDI can contain (but is not limited to) data storage, a metadata catalogue, tools for data processing, a WebGIS and a form of data access (e.g. download, web service).

Climate modelling and research increasingly produce and share data as netCDF files, leading to a growing demand for automated management of netCDF data. This application aims to provide automated workflows to manage standardized netCDF data and display them in an interactive WebGIS. The standard specifications follow the Binding Regulations for Storing Data as netCDF Files. Other vector and raster data formats ((Cloud Optimized) GeoTIFF, sensor data) can be included in the workflows with manual work steps; these steps will be automated in upcoming versions. The application will be expanded continuously into a self-service platform for creating custom WebGIS, automated workflows and various (meta-)data provision interfaces for a wide range of spatial data formats.

Requirements

  • Simply FAIR
  • Science- and management-friendly: Provide interoperable and reliable netCDF data enriched with metadata and provenance information.
  • User-friendly: An easy-to-use interface for people who manage netCDF data or create WebGIS for netCDF data, without requiring knowledge of underlying technologies such as databases.
  • Admin-friendly: A scalable and transferable container-based solution that integrates smoothly into typical scientific IT landscapes.
  • Developer-friendly: Common open source solutions structured as a microservice architecture, keeping the system open and simple to extend for developers.

Features

  • S3 cloud-storage with MinIO
  • FROST®-Server to store and access sensor data (in combination with timeIO and SaQC)
  • Creation of custom interactive WebGIS components for netCDF, STA and GeoTIFF data
  • Extendable processes to get spatially aggregated values for netCDF and GeoTIFF data
  • Use of django framework to make configuration of data and WebGIS user-friendly
  • Workflow for automated creation of OGC web services for new netCDF data with GeoServer
  • Workflow for automated creation of metadata entries in GeoNetwork
  • THREDDS Data Server (TDS) to provide netCDF data with OPeNDAP
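
As an illustration of how a client could consume the OGC web services and time-enabled netCDF layers listed above, the following minimal sketch builds a standard WMS 1.1.1 GetMap request for a single time step. The base URL and layer name are placeholders and not taken from spatial.IO; only the WMS parameter names are standard.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, time, width=512, height=512):
    """Return a WMS 1.1.1 GetMap URL for one time step of a layer.

    bbox is (min_lon, min_lat, max_lon, max_lat) in EPSG:4326.
    """
    minx, miny, maxx, maxy = bbox
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the layer's default style
        "SRS": "EPSG:4326",
        "BBOX": f"{minx},{miny},{maxx},{maxy}",
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": "image/png",
        "TRANSPARENT": "true",
        "TIME": time,  # ISO 8601 time stamp selecting the time step
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint and layer name, for illustration only.
    print(build_getmap_url(
        "https://example.org/geoserver/wms",
        "demo:precipitation",
        (5.8, 47.2, 15.1, 55.1),
        "2020-01-01T00:00:00Z",
    ))
```

A time slider in a viewer could call such a function once per selected time step and only vary the `time` argument.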

Components (description and supported data formats)

Component: MinIO
Description: S3 storage
Supported data formats:
  • any format

Component: FROST®-Server
Description: Server to store and provide sensor data with the OGC SensorThings API
Supported data formats:
  • csv

Component: WebGIS
Description: Online viewer to show data with additional functionality:
  • Swiper function to compare two datasets
  • Time slider
  • Selectable federal countries and districts
  • Upload of Shapefile
  • Function to aggregate values over a polygon (country, district, shapefile)
  • Diagram to show values of multiple datasets as time series
  • Option to filter datasets by variable
Supported data formats:
  • NetCDF
  • GeoTIFF/Cloud Optimized GeoTIFF (COG) as ImageMosaic
  • sensor data as csv

Component: AggregationAPI
Description: pygeoapi instance for OGC API - Processes to compute aggregated values for the WebGIS
Supported data formats:
  • NetCDF
  • GeoTIFF/COG

Component: Admin-Frontend
Description:
  • Manage data from MinIO in projects
    • Connect data to GeoServer, GeoNetwork, TDS
  • Create WebGIS instances and manage data/design of viewer
Supported data formats: -

Component: GeoServer
Description: OGC Web Services for data
Supported data formats:
  • NetCDF
  • GeoTIFF/COG (manual ImageMosaic creation needed)

Component: GeoNetwork
Description: Metadata catalogue and OGC CSW access with direct data download link
Supported data formats:
  • NetCDF
  • Metadata provided as external JSON file

Component: THREDDS Data Server (TDS)
Description: OPeNDAP access
Supported data formats:
  • NetCDF

Component: django backend
Description:
  • Manages requests from user components
  • Holds configuration from Admin-Frontend
Supported data formats: -

Component: PostgreSQL
Description: Database to store values and information so that all SDI components can communicate seamlessly
Supported data formats: -

Component: Worker
Description:
  • Runs in the background of the application and watches for new data/changes in MinIO
  • Creates datastores in GeoServer for new data to provide OGC Web Services
  • Links new MinIO data to TDS
  • Creates/updates metadata entries for MinIO data in GeoNetwork
Supported data formats: -
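
The Worker's role of watching MinIO for new or changed data and triggering follow-up steps can be sketched as a comparison of two bucket listings. This is not the project's implementation: the snapshot format (object key mapped to ETag) and the task names are assumptions made for illustration, and the handling of removed objects is only reported because the description above does not specify it.

```python
def diff_listings(previous, current):
    """Compare two {object_key: etag} snapshots of a bucket.

    Returns three sorted lists: added, changed and removed keys.
    """
    added = sorted(key for key in current if key not in previous)
    removed = sorted(key for key in previous if key not in current)
    changed = sorted(
        key for key in current
        if key in previous and current[key] != previous[key]
    )
    return added, changed, removed


def plan_tasks(previous, current):
    """Turn a listing diff into an ordered list of (task_name, key) pairs.

    Task names are hypothetical labels for the Worker steps described above.
    """
    added, changed, _removed = diff_listings(previous, current)
    tasks = []
    for key in added:
        # New data: publish via GeoServer, link to TDS, create metadata.
        tasks.append(("create_geoserver_datastore", key))
        tasks.append(("link_to_tds", key))
        tasks.append(("upsert_geonetwork_metadata", key))
    for key in changed:
        # Changed data: refresh the metadata entry only.
        tasks.append(("upsert_geonetwork_metadata", key))
    return tasks


if __name__ == "__main__":
    before = {"a.nc": "etag-1", "b.nc": "etag-1"}
    after = {"a.nc": "etag-2", "c.nc": "etag-1"}
    for task in plan_tasks(before, after):
        print(task)
```

Keeping the diff separate from the task planning makes the change detection testable without a running MinIO instance; a real worker would fill the snapshots from the storage API and could equally be driven by bucket notifications instead of polling.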

Quickstart

  1. Install Docker Engine (Community Edition - CE is enough) and Docker Compose
  2. Install a git client and check out the spatialIO repository, or unzip the attached archive
  3. Follow the step-by-step instructions in README.md and README_FRONTEND.md to start up the application
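
Before working through the READMEs, it can help to confirm that the prerequisites from steps 1 and 2 are on the PATH. The following is a minimal sketch using only the Python standard library; the helper name `missing_tools` is made up for this example. Docker Compose is not checked, since it may be installed either as a `docker compose` plugin or as a standalone `docker-compose` binary.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(["docker", "git"])
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("docker and git were found on PATH")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is sufficient.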

Techstack, dependencies and third party open source products

Acknowledgements

We thank the Helmholtz Association and the Federal Ministry of Education and Research (BMBF) for supporting the DataHub Initiative of the Research Field Earth and Environment. The DataHub enables overarching and comprehensive research data management, following FAIR principles, for all Topics in the Program Changing Earth – Sustaining our Future.

Files

spatialio-v0.1-figure.png

Files (20.8 MB)

md5:68e297da7e3bf41bc48495a51fdf380e (92.1 kB)
md5:c3bf9c9bf320d0f888137c2d3255cbf4 (20.7 MB)

Additional details

Related works

References
Software: 10.5281/zenodo.8354840 (DOI)
Software: 10.5281/zenodo.8320044 (DOI)

Funding

Federal Ministry of Education and Research
Helmholtz Association of German Research Centres