There is a newer version of the record available.

Published July 5, 2021 | Version v3
Dataset Open

Weather prediction dataset

  • 1. Netherlands eScience Center

Description

This dataset was created for training and teaching in machine learning and deep learning.
It can be used, for instance, for classification, regression, and forecasting tasks.
It is complex enough to demonstrate realistic issues such as overfitting and imbalanced data, while remaining intuitively accessible.
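
As a minimal sketch of how the class imbalance mentioned above might be inspected before training, the snippet below computes the share of each class in a list of labels. The function name `class_balance` and the toy labels are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def class_balance(labels):
    """Return the fraction of samples that fall into each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical toy labels for illustration only (not real dataset values).
toy_labels = [True, False, False, False, True, False, False, False]
balance = class_balance(toy_labels)
print(balance)
```

A strongly skewed result suggests that plain accuracy may be misleading and that metrics such as precision and recall are worth reporting as well.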

ORIGINAL DATA TAKEN FROM:

EUROPEAN CLIMATE ASSESSMENT & DATASET (ECA&D), file created on 22-04-2021
THESE DATA CAN BE USED FREELY PROVIDED THAT THE FOLLOWING SOURCE IS ACKNOWLEDGED:

Klein Tank, A.M.G. and Coauthors, 2002. Daily dataset of 20th-century surface
air temperature and precipitation series for the European Climate Assessment.
Int. J. of Climatol., 22, 1441-1453.
Data and metadata available at http://www.ecad.eu

 

For more information, see the metadata.txt file.

The Python code used to create the weather prediction dataset from the ECA&D data can be found on GitHub: https://github.com/florian-huber/weather_prediction_dataset
(This repository also contains Jupyter notebooks with teaching examples.)

Versions:

  • v3: added a "light" version of the dataset with fewer features (only 11 locations and fewer variables, a reduction from 163 to 89 features). It is meant for cases where training times during hands-on sessions become an issue.
  • v2: now also contains additional `BBQ_weather` labels. The dataset itself has not changed between versions v1 and v2.

Files

metadata.txt

Files (5.0 MB)

Checksum                                Size
md5:469f459dd7aa7b10a131d59ff1aefc01    4.1 kB
md5:2529898e68c46245ebbe0ef3d385b018    394.3 kB
md5:94cf8d8d1f6233ebde011d6062b1af5d    2.8 MB
md5:ae96c3912f24caa097867e9c4da034d8    1.5 MB
md5:40114391d126ec09993b41447d101038    337.8 kB
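
The MD5 checksums listed above can be used to check that a downloaded file arrived intact. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `matches_checksum` are hypothetical, and the path in the usage comment is a placeholder, since the listing above does not pair file names with checksums.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with an expected value.

    The expected value may carry an optional "md5:" prefix, as in the
    file listing, and is compared case-insensitively.
    """
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex


# Usage (placeholder path, assumed to be a file downloaded from this record):
# matches_checksum("downloaded_file", "md5:469f459dd7aa7b10a131d59ff1aefc01")
```

A mismatch usually indicates an incomplete or corrupted download, or that the file belongs to a different version of the record.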