Dataset Open Access

KDD Cup Dataset (without Missing Values)

Godahewa, Rakshitha; Bergmeir, Christoph; Webb, Geoff; Hyndman, Rob; Montero-Manso, Pablo

This dataset was used in the KDD Cup 2018 forecasting competition. It contains long hourly time series representing the air quality levels at 59 stations in 2 cities, Beijing (35 stations) and London (24 stations), from 01/01/2017 to 31/03/2018. The air quality level is described by multiple measurements such as PM2.5, PM10, NO2, CO, O3 and SO2.

The dataset uploaded here contains 270 hourly time series, categorized by city, station name and air quality measurement.

The original dataset contains missing values. In each series, the leading missing values were replaced by zeros, and the remaining missing values were replaced by carrying forward the last observation (LOCF method).
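The imputation rule described above can be illustrated with a minimal Python sketch. This is not the authors' code: the function name fill_missing and the use of None or NaN to mark a missing value are assumptions made only for this example.

```python
import math


def fill_missing(values):
    """Replace leading missing values with zero, then carry the last
    observation forward (LOCF) for every later missing value.

    Hypothetical helper: missing values are assumed to be None or NaN.
    """
    filled = []
    last = None  # most recent observed value, None until one is seen
    for v in values:
        is_missing = v is None or (isinstance(v, float) and math.isnan(v))
        if is_missing:
            # No observation yet: leading gap, fill with zero.
            # Otherwise: repeat the last observed value.
            filled.append(0.0 if last is None else last)
        else:
            last = v
            filled.append(v)
    return filled


# Toy hourly series with a leading gap and two internal gaps.
series = [None, None, 12.0, None, 15.5, None, None]
print(fill_missing(series))
```

For the toy series, the two leading gaps become 0.0 and each later gap repeats the most recent observed value.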

Files (2.4 MB)
Name: kdd_cup_2018_dataset_without_missing_values.zip
Size: 2.4 MB
md5: b6fdac5404f3a3a338932aa78637214d
  • KDD Cup 2018. URL https://www.kdd.org/kdd2018/kdd-cup
