There is a newer version of the record available.

Published February 24, 2025 | Version v1
Dataset Open

AI-Powered Food Sustainability: Exploiting Knowledge Graphs for Reducing Carbon Footprints and Land Use

  • 1. University of Twente
  • 2. Eindhoven University of Technology

Description

Overview


This README documents the datasets and RDF graph used in the research article "AI-Powered Food Sustainability: Exploiting Knowledge Graphs for Reducing Carbon Footprints and Land Use" by Anand K. Gavai, Suniti Vadalkar, and Mahak Sharma. The study employs AI-driven knowledge graphs to analyze the environmental impacts of food items, focusing on protein sources, and to propose sustainability interventions aligned with the United Nations Sustainable Development Goals (SDGs). The datasets and RDF graph provided here support the construction and querying of the knowledge graph for sustainability analysis.


All data and associated source code are publicly available in a Zenodo repository:
DOI: 10.5281/zenodo.10143973


Data Description


The datasets consist of structured environmental data for various food items, integrated into a knowledge graph to assess sustainability metrics such as carbon footprint, land use, water use, scarcity-weighted water use, and eutrophication. The data focus primarily on global averages from 2010, sourced from Poore & Nemecek (2018), with an emphasis on protein-rich foods (e.g., beef, cheese, legumes) and other dietary staples.


Sources


The data were sourced from:


  1. Poore & Nemecek (2018): A comprehensive study on the environmental impacts of food production, providing global metrics for greenhouse gas (GHG) emissions, land use, and freshwater withdrawals.
    • Citation: Poore, J., Nemecek, T., 2018. Reducing food’s environmental impacts through producers and consumers. Science 360, 987–992. DOI: 10.1126/science.aaq0216

  2. OurWorldInData.org: Supplies additional sustainability metrics, including scarcity-weighted water use and eutrophication, complementing the Poore & Nemecek dataset.

File Formats and Contents


The repository includes CSV files and an RDF graph in Turtle format:


CSV Files


Five CSV files provide environmental metrics for 38 food items, focusing on 2010 global averages:


  1. GHG Emissions (Two Files)
    • Filename: ghg_emissions_per_kg.csv (two identical versions provided)

    • Columns: Entity (food item), Year (2010), GHG emissions per kilogram (Poore & Nemecek, 2018) (kg CO₂-equivalent per kg)

    • Example: Beef (dairy herd), 2010, 33.30

  2. Freshwater Withdrawals
    • Filename: freshwater_withdrawals_per_kg.csv

    • Columns: Entity (food item), Year (2010), Freshwater withdrawals per kilogram (Poore & Nemecek, 2018) (liters per kg)

    • Example: Cheese, 2010, 5605.2

  3. Land Use
    • Filename: land_use_per_kg.csv

    • Columns: Entity (food item), Year (2010), Land use per kilogram (Poore & Nemecek, 2018) (m² per kg)

    • Example: Nuts, 2010, 12.96

  4. Comprehensive Metrics (Protein Sources)
    • Filename: protein_source_metrics.csv

    • Columns: Food (food item), CarbonFootprint (kg CO₂-eq per kg), LandUse (m² per kg), WaterUse (liters per kg), Scarcity_weighted water use (liters per kg), Eutrophication (g PO₄-eq per kg)

    • Example: Eggs, 4.67, 6.27, 578, 17983, 21.76
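
As a minimal sketch of how one of the per-kilogram CSV files could be read, the snippet below parses an in-memory sample that follows the column layout described above. The sample text, the `load_metric` helper, and the `GHG_COLUMN` constant are illustrative assumptions, not part of the repository; the single data row reuses the GHG example value quoted in this README. The real files may contain extra columns or slightly different headers.

```python
import csv
import io

# Hypothetical sample mirroring the documented columns; the row reuses the
# example value given for ghg_emissions_per_kg.csv in this README.
SAMPLE_CSV = '''Entity,Year,"GHG emissions per kilogram (Poore & Nemecek, 2018)"
Beef (dairy herd),2010,33.30
'''

# Assumed header text for the value column (contains a comma, so it is quoted).
GHG_COLUMN = "GHG emissions per kilogram (Poore & Nemecek, 2018)"


def load_metric(handle, value_column):
    """Return {food item: metric value} from an open per-kg CSV file object."""
    reader = csv.DictReader(handle)
    return {row["Entity"]: float(row[value_column]) for row in reader}


ghg = load_metric(io.StringIO(SAMPLE_CSV), GHG_COLUMN)
print(ghg)
```

To read a downloaded file instead, the same helper can be given `open("ghg_emissions_per_kg.csv", newline="", encoding="utf-8")`, assuming the file sits in the working directory.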

RDF Graph


  • Filename: food_emissions_graph.ttl

  • Format: Turtle (TTL)

  • Description: A knowledge graph representing a subset of food items (e.g., Beef, Cheese, Eggs) with their environmental metrics as properties.

  • Prefixes:
    • : <http://example.org/food-emissions#> (custom ontology)

    • rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

    • rdfs: <http://www.w3.org/2000/01/rdf-schema#>

  • Structure:
    • Nodes: Food items (e.g., :Beef) typed as :FoodItem.

    • Properties:
      • :hasCarbonFootprint (kg CO₂-eq per kg)

      • :hasLandUse (m² per kg)

      • :hasWaterUse (liters per kg)

      • :hasScarcityWeightedWaterUse (liters per kg)

      • :hasEutrophication (g PO₄-eq per kg)

    • Example:
      :Beef rdf:type :FoodItem ;
          :hasCarbonFootprint 33.30 ;
          :hasLandUse 43.24 ;
          :hasWaterUse 2714 ;
          :hasScarcityWeightedWaterUse 119805 ;
          :hasEutrophication 365.29 .
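
The structure above can be illustrated with a small sketch that turns one row of protein_source_metrics.csv into a Turtle block. This is not the authors' construction script: the `row_to_turtle` helper, the column-to-property mapping, and the rule for deriving the local name from the food label are assumptions made for illustration. The row values are the Eggs example quoted earlier in this README.

```python
# Prefix declarations as listed in this README.
PREFIXES = (
    "@prefix : <http://example.org/food-emissions#> .\n"
    "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
)

# Assumed mapping from CSV column names to the RDF properties listed above.
COLUMN_TO_PROPERTY = {
    "CarbonFootprint": ":hasCarbonFootprint",
    "LandUse": ":hasLandUse",
    "WaterUse": ":hasWaterUse",
    "Scarcity_weighted water use": ":hasScarcityWeightedWaterUse",
    "Eutrophication": ":hasEutrophication",
}


def row_to_turtle(row):
    """Serialize one metrics row (dict of strings) as a Turtle subject block."""
    # Assumption: the local name is the food label with non-alphanumerics removed.
    local_name = "".join(ch for ch in row["Food"] if ch.isalnum())
    lines = [f":{local_name} rdf:type :FoodItem"]
    for column, prop in COLUMN_TO_PROPERTY.items():
        value = row.get(column)
        if value:  # skip metrics that are missing or empty for this item
            lines.append(f"    {prop} {value}")
    return " ;\n".join(lines) + " ."


eggs_row = {
    "Food": "Eggs",
    "CarbonFootprint": "4.67",
    "LandUse": "6.27",
    "WaterUse": "578",
    "Scarcity_weighted water use": "17983",
    "Eutrophication": "21.76",
}
eggs_turtle = row_to_turtle(eggs_row)
print(PREFIXES + "\n" + eggs_turtle)
```

Skipping empty values keeps the output valid for items that lack scarcity-weighted water use or eutrophication figures.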

Key Metrics


The datasets cover the following environmental impact metrics:


  • Carbon Footprint: GHG emissions in kg CO₂-equivalent per kg of food.

  • Land Use: Area in square meters (m²) required per kg of food.

  • Water Use: Freshwater withdrawals in liters per kg of food.

  • Scarcity-Weighted Water Use: Water use adjusted for regional scarcity, in liters per kg (available for select protein sources).

  • Eutrophication: Nutrient pollution in grams of phosphate-equivalent (g PO₄-eq) per kg (available for select protein sources).

Usage


These datasets and the RDF graph were used to:


  1. Build an AI-driven knowledge graph for real-time sustainability analysis of food items.

  2. Enable SPARQL queries to rank food items by environmental impact (e.g., identifying low-carbon protein sources like nuts or peas).

  3. Compare traditional protein sources (e.g., beef, cheese) with alternatives (e.g., tofu, soy milk).

  4. Support policy recommendations, such as taxing high-impact foods or promoting sustainable alternatives.

Example applications:


  • Querying :hasCarbonFootprint to identify that beef from beef herds (99.48 kg CO₂-eq/kg) far exceeds nuts (0.43 kg CO₂-eq/kg); beef from dairy herds is listed at 33.30 kg CO₂-eq/kg.

  • Assessing trade-offs, e.g., cheese’s high water use (5605 liters/kg) vs. soy milk’s low water use (27.8 liters/kg).
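
The comparisons above can be reproduced in a few lines of plain Python. The sketch below is a stand-in for the ordering a SPARQL query over the graph would produce, not the study's own query code. The `rank` and `ratio` helpers are hypothetical names, and the dictionaries contain only values quoted in this README.

```python
# Values quoted in this README (2010 global averages, per kg of food).
carbon_footprint = {"Beef": 99.48, "Eggs": 4.67, "Nuts": 0.43}   # kg CO2-eq per kg
water_use = {"Cheese": 5605.2, "Eggs": 578.0, "Soy milk": 27.8}  # liters per kg


def rank(metric, ascending=True):
    """Return (food, value) pairs sorted by metric value."""
    return sorted(metric.items(), key=lambda item: item[1], reverse=not ascending)


def ratio(metric, high, low):
    """Return how many times larger the metric is for `high` than for `low`."""
    return metric[high] / metric[low]


# Lowest-carbon items first.
for food, value in rank(carbon_footprint):
    print(f"{food}: {value} kg CO2-eq/kg")

print(f"Beef vs. nuts (carbon): {ratio(carbon_footprint, 'Beef', 'Nuts'):.1f}x")
print(f"Cheese vs. soy milk (water): {ratio(water_use, 'Cheese', 'Soy milk'):.1f}x")
```

Expressing the trade-offs as ratios makes the scale of the differences explicit: both comparisons differ by roughly two orders of magnitude.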

Access and Availability


All datasets and the RDF graph are available in the Zenodo repository:
DOI: 10.5281/zenodo.10143973

The repository includes:


  • Raw CSV files (ghg_emissions_per_kg.csv, freshwater_withdrawals_per_kg.csv, land_use_per_kg.csv, protein_source_metrics.csv).

  • RDF graph file (food_emissions_graph.ttl).

  • Scripts for data integration and knowledge graph construction (see repository for details).

Limitations


  • Data Scope: Focuses on 2010 global averages, lacking regional or temporal variations.

  • Completeness: Scarcity-weighted water use and eutrophication metrics are available only for a subset of protein sources.

  • Static Nature: Reflects a snapshot from Poore & Nemecek (2018), not real-time data.

  • RDF Coverage: The provided RDF graph includes only 8 food items; the full graph in the study may cover more.

Funding


This work was supported by the ‘High Tech for a Sustainable Future’ capacity building programme of the 4TU Federation in the Netherlands.


Contact


For questions or further information, please contact the corresponding author:


  • Name: Anand K. Gavai


  • Affiliation: Industrial Engineering & Business Information Systems, University of Twente, Enschede, The Netherlands

Citation


If you use this dataset or RDF graph, please cite the original manuscript:
Gavai, A.K., Vadalkar, S., Sharma, M. (2025). AI-Powered Food Sustainability: Exploiting Knowledge Graphs for Reducing Carbon Footprints and Land Use.

Files

food-emissions-supply-chain.csv

Files (5.5 kB)

MD5 checksum                              Size
md5:abb7b7219df3fe67a42687900b344041      524 Bytes
md5:8d9a421ce29b17f5696148fe1db14498      1.6 kB
md5:d425205b5d4177bbb11242a3cdc87632      881 Bytes
md5:6f061aba31c279649dd527ebb334c2b2      843 Bytes
md5:6f061aba31c279649dd527ebb334c2b2      843 Bytes
md5:a765a918cecc123950fda349b9914f72      842 Bytes
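
A downloaded file can be checked against the MD5 values listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and the file path in the usage note is a placeholder, since the listing here does not pair checksums with file names.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:abb7...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("downloaded_file.csv", "md5:abb7b7219df3fe67a42687900b344041")` returns True only if the local copy is byte-for-byte identical to the file published with that checksum.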

Additional details

Related works

Is supplement to
Dataset: 10.2139/ssrn.5175322 (DOI)

Software

Programming language
Python