There is a newer version of the record available.

Published October 30, 2023 | Version 1.0.0
Dataset Open

Open-source traffic and CO2 emission dataset for commercial aviation

  • 1. ISAE-SUPAERO
  • 2. Delft University of Technology
  • 3. École Supérieure de Commerce de Toulouse

Description

[Deprecated version, used in the support article. Please download the latest version.]

 

This record is a global open-source passenger air traffic dataset intended primarily for the research community.
It gives the seating capacity available on each origin-destination route for the year 2019, together with the associated aircraft type and airline when this information is available.

Context on the original work is given in the related article (https://journals.open.tudelft.nl/joas/article/download/7201/5683) and on the associated GitHub page (https://github.com/AeroMAPS/AeroSCOPE/).
A simple data exploration interface will be available at www.aeromaps.eu/aeroscope.
The dataset was created by aggregating various available open-source databases with limited geographical coverage. It was then supplemented with a route database created by parsing Wikipedia and Wikidata, on which the traffic volume was estimated using a machine learning algorithm (XGBoost) trained on traffic and socio-economic data.
 


1- DISCLAIMER


The dataset was gathered to allow highly aggregated analyses of air traffic, at the continental or country level. At the route level, accuracy is limited, as discussed in the associated article, and improper use could lead to erroneous analyses.
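As an illustration of the aggregated use described above, the following minimal sketch sums seats and CO2 per departure country with pandas. The choice of pandas and the rows in `routes` are assumptions made for illustration only (the values are not taken from the dataset); the column names are those listed in the DESCRIPTION section.

```python
import pandas as pd

# Hypothetical rows for illustration only; these are NOT values from the dataset.
routes = pd.DataFrame(
    {
        "departure_country": ["FR", "FR", "US", "NA"],
        "arrival_country": ["FR", "US", "US", "ZA"],
        "seats": [1000.0, 500.0, 3000.0, 200.0],
        "co2": [50000.0, 400000.0, 300000.0, 30000.0],  # kg
    }
)

# Aggregate at the country level instead of reading individual routes.
by_country = (
    routes.groupby("departure_country")[["seats", "co2"]]
    .sum()
    .sort_values("co2", ascending=False)
)
print(by_country)
```

The same pattern applies to `departure_continent` for continental totals.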


2- DESCRIPTION

Each data entry represents an (Origin-Destination-Operator-Aircraft type) tuple.

Please refer to the support article for more details (see above).

The dataset contains the following columns:

  • "First column" : index
  • airline_iata : IATA code of the operator in nominal cases. An ICAO -> IATA code conversion was performed for some sources, and the ICAO code was kept if no match was found.
  • acft_icao : ICAO code of the aircraft type
  • acft_class : Aircraft class identifier, own classification.
    • WB: Wide Body
    • NB: Narrow Body
    • RJ: Regional Jet
    • PJ: Private Jet
    • TP: Turbo Propeller
    • PP: Piston Propeller
    • HE: Helicopter
    • OTHER
  • seymour_proxy: Aircraft code used with the Seymour surrogate model (https://doi.org/10.1016/j.trd.2020.102528). Own classification, used to derive a proxy aircraft when the nominal aircraft type is unavailable in the aircraft performance model.
  • source: Original data source for the record, before compilation and enrichment.
    • ANAC: Brazilian Civil Aviation Authorities
    • AUS Stats: Australian Civil Aviation Authorities
    • BTS: US Bureau of Transportation Statistics T100
    • Estimation: Own model, estimation on Wikipedia-parsed route database
    • Eurocontrol: Aggregation and enrichment of R&D database
    • OpenSky
    • World Bank
  • seats: Number of seats available for the data entry, AFTER airport residual scaling
  • n_flights: Number of flights of the data entry, when available
  • iata_departure, iata_arrival : IATA code of the origin and destination airports. A few BTS in-house identifiers may remain, but they are marginal.
  • departure_lon, departure_lat, arrival_lon, arrival_lat : Origin and destination coordinates; may be NaN if the IATA identifier is erroneous
  • departure_country, arrival_country: Origin and destination country ISO2 code. WARNING: at import, prevent "NA" (Namibia) from being parsed as a default NaN value
  • departure_continent, arrival_continent: Origin and destination continent code. WARNING: at import, prevent "NA" (North America) from being parsed as a default NaN value
  • seats_no_est_scaling: Number of seats available for the data entry, BEFORE airport residual scaling
  • distance_km: Flight distance (km)
  • ask: Available Seat Kilometres
  • rpk: Revenue Passenger Kilometres (simple calculation from ASK using IATA average load factor)
  • fuel_burn_seymour: Fuel burn per flight (kg), when a Seymour proxy is available
  • fuel_burn: Total fuel burn of the data entry (kg)
  • co2: Total CO2 emissions of the data entry (kg)
  • domestic: Domestic/international boolean (Domestic=1, International=0)
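
The two warnings about "NA" in the column list above can be handled at import. The sketch below assumes pandas is used and that missing values are stored as empty cells in the CSV (an assumption; adjust `na_values` if the file uses another marker). The helper name `load_aeroscope` and the sample rows are hypothetical and only serve to show the behaviour.

```python
import io

import pandas as pd


def load_aeroscope(source):
    """Read the CSV while keeping the literal string "NA" as a valid code."""
    return pd.read_csv(
        source,
        index_col=0,            # the unnamed first column is the index
        keep_default_na=False,  # drop pandas' default NaN markers, which include "NA"
        na_values=[""],         # assumption: missing values are empty cells
    )


# Hypothetical sample with the same layout idea as the dataset (not real values).
sample = """,airline_iata,seats,departure_country,arrival_country,departure_continent,arrival_continent
0,XX,1000,NA,ZA,AF,AF
1,YY,2000,US,FR,NA,EU
2,ZZ,,DE,DE,EU,EU
"""

df = load_aeroscope(io.StringIO(sample))
print(df[["departure_country", "departure_continent", "seats"]])
```

For the real file, pass the path of the downloaded CSV to the helper instead of the in-memory sample.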

 

3- CITATION

Please cite the support paper instead of the dataset itself. 

Salgas, A., Sun, J., Delbecq, S., Planès, T., & Lafforgue, G. (2023). Compilation of an open-source traffic and CO2 emissions dataset for commercial aviation. Journal of Open Aviation Science. https://doi.org/10.59490/joas.2023.7201

Files

AeroSCOPE_global_aviation_traffic_dataset_26_09.csv

Files (66.6 MB)

Additional details

Related works

Is described by
Conference paper: 10.59490/joas.2023.7201 (DOI)

Dates

Created
2023-10-30