Published April 30, 2024 | Version v1
Dataset Open

ExioML: Global Eco-economic Scope 3 Emission Machine Learning Dataset

  • 1. University of Sydney

Description

🙋‍♂️ Introduction

ExioML is the first ML-ready benchmark dataset in eco-economic research, designed for global sectoral sustainability analysis. It addresses significant research gaps by building on ExioBase 3.8.2, a high-quality, open-source environmentally extended multi-regional input-output (EE-MRIO) dataset. ExioML covers 163 sectors across 49 regions from 1995 to 2022, which addresses the problem of data inaccessibility. The dataset includes both factor accounting in tabular format and footprint networks in graph structure.

We demonstrate a GHG emission regression task on the Factor Accounting table, comparing the performance of shallow and deep models. The models quantify sectoral GHG emissions in terms of value added, employment, and energy consumption, and they achieve a low Mean Squared Error (MSE), which validates the dataset's usability. The Footprint Network in ExioML, inherent in the multi-dimensional MRIO framework, enables tracking of resource flows between international sectors.
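The regression setup described above can be sketched as follows. This is a minimal illustration on synthetic data, not the models or results from the paper: it fits an ordinary least-squares baseline with NumPy and reports a held-out MSE. The generated feature values, the log transform, the coefficients used to build the target, and the 150/50 split are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the three numerical features (NOT ExioML data).
n = 200
value_added = rng.lognormal(mean=5.0, sigma=1.0, size=n)  # M.EUR
employment = rng.lognormal(mean=3.0, sigma=1.0, size=n)   # 1000 p.
energy = rng.lognormal(mean=6.0, sigma=1.0, size=n)       # TJ

# Assumed relationship, used only to generate a target for the demo.
log_ghg = (
    0.3 * np.log(value_added)
    + 0.2 * np.log(employment)
    + 0.8 * np.log(energy)
    + rng.normal(0.0, 0.1, size=n)
)

# Design matrix: log-transformed features plus an intercept column.
X = np.column_stack(
    [np.log(value_added), np.log(employment), np.log(energy), np.ones(n)]
)

# Simple hold-out split: first 150 rows for training, last 50 for testing.
X_train, X_test = X[:150], X[150:]
y_train, y_test = log_ghg[:150], log_ghg[150:]

# Ordinary least squares as a shallow baseline.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Mean squared error on the held-out rows (in log space).
mse = float(np.mean((X_test @ coef - y_test) ** 2))
print(f"held-out MSE (log space): {mse:.4f}")
```

With the real Factor Accounting table, the same design matrix would be built from the corresponding columns, and the baseline could be swapped for any shallow or deep regressor.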

ExioML offers promising research opportunities, such as predicting embodied emissions through international trade, estimating regional sustainability transitions, and analyzing the topological changes in global trading networks over time. It lowers the entry barrier and reduces the intensive data pre-processing required of ML researchers, facilitates the integration of ML and eco-economic research, and provides new perspectives for sound climate policy and global sustainable development.

📊 Dataset

ExioML supports graph and tabular structure learning algorithms through the Footprint Network and the Factor Accounting table, respectively. The dataset includes the following factors in both the product-by-product (PxP) and industry-by-industry (IxI) classifications:

- Region (Categorical feature)
- Sector (Categorical feature)
- Value Added [M.EUR] (Numerical feature)
- Employment [1000 p.] (Numerical feature)
- GHG emissions [kg CO2 eq.] (Numerical feature)
- Energy Carrier Net Total [TJ] (Numerical feature)
- Year (Numerical feature)
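As a rough sketch of how a table with these features might be loaded for ML work, the example below reads a small in-memory CSV with pandas, marks the two categorical features, and derives an emission intensity. The column names mirror the list above but are an assumption about the actual CSV header, and the three rows are made-up sample values, not ExioML data.

```python
import io

import pandas as pd

# Made-up sample rows. Column names mirror the feature list above and are
# an assumption about the real header of the Factor Accounting CSV files.
sample_csv = io.StringIO(
    "Region,Sector,Value Added [M.EUR],Employment [1000 p.],"
    "GHG emissions [kg CO2 eq.],Energy Carrier Net Total [TJ],Year\n"
    "AU,Example sector A,120.5,3.2,4.1e8,950.0,1995\n"
    "AU,Example sector B,80.0,1.1,2.0e7,40.0,1995\n"
    "DE,Example sector A,300.0,5.0,9.5e8,2100.0,1996\n"
)

df = pd.read_csv(sample_csv)

categorical = ["Region", "Sector"]
numerical = [col for col in df.columns if col not in categorical]

# Treat Region and Sector as categorical features.
df = df.astype({col: "category" for col in categorical})

# Derived feature: emission intensity in kg CO2 eq. per M.EUR of value added.
df["ghg_per_value_added"] = (
    df["GHG emissions [kg CO2 eq.]"] / df["Value Added [M.EUR]"]
)

print(df.dtypes)
```

For the real data, `sample_csv` would be replaced by the path to a downloaded file such as `ExioML_factor_accounting_IxI.csv`, after checking its header against the names assumed here.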

☁️ Factor Accounting

The Factor Accounting table shares its features with the Footprint Network and summarizes the heterogeneous characteristics of each sector as totals.

🚞 Footprint Network

The Footprint Network models the high-dimensional global trading network, capturing its economic, social, and environmental impacts. The network is structured as a directed graph in which directionality represents sectoral input-output relationships, distinguishing sectors by their roles as sources (exporting) and targets (importing). The basic element of the ExioML Footprint Network is international trade between sectors, with features such as value added, emission amount, and energy input. The Footprint Network helps identify critical sectors and paths for sustainability management and optimization. It is hosted on Zenodo.
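The directed-graph structure described above can be illustrated with a small standard-library sketch. The regions, sector names, feature keys, and numbers below are hypothetical and do not come from ExioML; the point is only to show source-to-target edges carrying trade features, and how emission flows can be summed per node to flag a critical sector.

```python
from collections import defaultdict

# Hypothetical edges: (source node, target node, edge features).
# A node is a (region, sector) pair; all values are made up.
edges = [
    (("CN", "Steel"), ("DE", "Vehicles"),
     {"value_added": 50.0, "emission": 9.0e6, "energy": 120.0}),
    (("CN", "Steel"), ("US", "Construction"),
     {"value_added": 30.0, "emission": 6.0e6, "energy": 80.0}),
    (("DE", "Vehicles"), ("US", "Retail"),
     {"value_added": 70.0, "emission": 1.0e6, "energy": 15.0}),
]

# Adjacency list keyed by source node; direction runs source -> target.
graph = defaultdict(list)
for source, target, features in edges:
    graph[source].append((target, features))

# Emissions leaving each source node and arriving at each target node.
outflow = defaultdict(float)
inflow = defaultdict(float)
for source, target, features in edges:
    outflow[source] += features["emission"]
    inflow[target] += features["emission"]

# The source node with the largest outgoing emission flow.
top_source = max(outflow, key=outflow.get)
print(top_source, outflow[top_source])
```

A graph library or a graph neural network framework would take the same edge list as input; the plain dictionaries here are just the smallest form that keeps the direction and the edge features explicit.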

🔗 Code and Data Availability

The ExioML development toolkit in Python and the regression model used for validation are available in the GitHub repository (https://github.com/YVNMINC/ExioML). The complete ExioML dataset is hosted on Zenodo (https://zenodo.org/records/10604610).

💡 Additional Information

More details about the dataset are available in our paper: *ExioML: Eco-economic dataset for Machine Learning in Global Sectoral Sustainability*, accepted by the ICLR 2024 Climate Change AI workshop: (https://arxiv.org/abs/2406.09046).

📄 Citation

@inproceedings{guo2024exioml,
  title={ExioML: Eco-economic dataset for Machine Learning in Global Sectoral Sustainability},
  author={Guo, Yanming and Ma, Jin},
  booktitle={ICLR 2024 Workshop on Tackling Climate Change with Machine Learning},
  year={2024}
}

🌟 Reference

Stadler, Konstantin, et al. "EXIOBASE 3." Zenodo (2021). Retrieved March 22, 2023.

Files

ExioML_factor_accounting_IxI.csv

Files (2.8 GB)

Checksum Size
md5:5d58fc527f63c3d6452e87d7f9417234 18.0 MB
md5:e82255c38635d533c263ba980c3ec1bf 21.6 MB
md5:3d3b1990ff95c62fd52314f4ad056c03 1.3 GB
md5:06d86e7e216461d8fddac12e5f400e74 1.4 GB
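The md5 checksums listed above can be used to confirm that a download is intact. The helper below is a minimal sketch using Python's `hashlib`; the function names `md5_of_file` and `matches_checksum` are hypothetical, and the demo runs on a throwaway file rather than on the dataset, since the listing here does not show which checksum belongs to which file name.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Demo on a temporary file containing b"hello". For the real data, pass the
# path of a downloaded file and the "md5:..." string listed for it.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_path = tmp.name
demo_ok = matches_checksum(demo_path, "md5:5d41402abc4b2a76b9719d911017c592")
os.remove(demo_path)
print(demo_ok)
```

Reading in chunks keeps memory use flat, which matters for the two files above 1 GB.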

Additional details

Related works

Is derived from
Dataset: 10.5281/zenodo.3583070 (DOI)

References

  • EXIOBASE 3 (Stadler et al. 2018 DOI: 10.5281/zenodo.3583070)