
Published April 22, 2020 | Version 1
Dataset Open

Pre-Processed Power Grid Frequency Time Series

  • 1. Forschungszentrum Jülich GmbH, Institute for Energy and Climate Research - Systems Analysis and Technology Evaluation (IEK-STE), 52428 Jülich, Germany
  • 2. School of Mathematical Sciences, Queen Mary University of London, London E1 4NS, United Kingdom

Description

Overview
This repository contains ready-to-use frequency time series as well as the corresponding pre-processing scripts in Python. The data covers three synchronous areas of the European power grid:

  • Continental Europe
  • Great Britain
  • Nordic

This work is part of the paper "Predictability of Power Grid Frequency" [1]. Please cite this paper when using the data or the code. For detailed documentation of the pre-processing procedure, we refer to the supplementary material of the paper.

Data sources
We downloaded the frequency recordings from publicly available repositories of three different Transmission System Operators (TSOs).

  • Continental Europe [2]: We downloaded the data from the German TSO TransnetBW GmbH, which retains the copyright on the data but allows it to be re-published upon request [3].
  • Great Britain [4]: The download was supported by National Grid ESO Open Data, which belongs to the British TSO National Grid. They publish the frequency recordings under the NGESO Open License [5].
  • Nordic [6]: We obtained the data from the Finnish TSO Fingrid, which provides the data under the open license CC-BY 4.0 [7].

Content of the repository

A) Scripts

  1. In the "Download_scripts" folder you will find three scripts to automatically download frequency data from the TSOs' websites.
  2. In "convert_data_format.py" we save the data with corrected timestamp formats. Missing data is marked as NaN (processing step (1) in the supplementary material of [1]).
  3. In "clean_corrupted_data.py" we load the converted data and identify corrupted recordings. We mark them as NaN and clean some of the resulting data holes (processing step (2) in the supplementary material of [1]).

The scripts run under Python 3.7 with the packages listed in "requirements.txt".
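The two processing steps above can be sketched as follows. This is an illustrative sketch, not the repository's actual code: the regular one-second grid and the 49–51 Hz plausibility band are assumptions made for the example, and the function names are hypothetical.

```python
import numpy as np
import pandas as pd

def convert_data_format(raw_csv_path):
    """Step (1), sketched: parse timestamps and mark missing entries as NaN."""
    data = pd.read_csv(raw_csv_path, index_col=0, header=None,
                       parse_dates=[0]).squeeze("columns")
    # Re-index onto a complete, equally spaced time grid
    # (one-second resolution assumed); gaps become NaN.
    full_index = pd.date_range(data.index[0], data.index[-1], freq="1s")
    return data.reindex(full_index)

def clean_corrupted_data(converted):
    """Step (2), sketched: mark implausible recordings as NaN."""
    cleansed = converted.copy()
    # Example criterion: values far outside the nominal 50 Hz band are
    # treated as corrupted (the 49-51 Hz threshold is illustrative only).
    cleansed[(cleansed < 49.0) | (cleansed > 51.0)] = np.nan
    return cleansed
```

The actual criteria used in the repository are documented in the supplementary material of [1].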

B) Data_converted and Data_cleansed
The folder "Data_converted" contains the output of "convert_data_format.py" and "Data_cleansed" contains the output of "clean_corrupted_data.py".

  • File type: The files are zipped csv-files, where each file comprises one year.
  • Data format: The files contain two columns. The first column contains the time stamps in the format Year-Month-Day Hour:Minute:Second, given as naive local time. The second column contains the frequency values in Hz.
  • NaN representation: We mark corrupted and missing data as "NaN" in the csv-files.
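Given this layout, one yearly file can be loaded as a pandas Series; pandas reads zipped csv-files directly. A minimal sketch (the filename is hypothetical):

```python
import pandas as pd

def load_year(path):
    """Load one yearly file (zipped csv with timestamp and frequency columns)."""
    series = pd.read_csv(path, index_col=0, header=None,
                         parse_dates=[0]).squeeze("columns")
    series.index.name = "time"       # naive local timestamps
    series.name = "frequency_Hz"     # frequency values in Hz
    return series

# Usage (filename hypothetical):
# frequency = load_year("Data_cleansed/2019.zip")
# print(frequency.isnull().sum(), "samples are NaN")
```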

Use cases
We point out that this repository can be used in two different ways:

  • Use pre-processed data: You can directly use the converted or the cleansed data. Note, however, that both data sets include segments of NaN values due to missing and corrupted recordings. Only a very small fraction of the NaN values was eliminated during cleansing, so as not to alter the data too much. If your application cannot deal with NaNs, you can build upon the following commands to select the longest interval of valid data from the cleansed data:
from helper_functions import *
import numpy as np
import pandas as pd

cleansed_data = pd.read_csv('/Path_to_cleansed_data/data.zip',
                            index_col=0, header=None, squeeze=True,
                            parse_dates=[0])
valid_bounds, valid_sizes = true_intervals(~cleansed_data.isnull())
start, end = valid_bounds[np.argmax(valid_sizes)]
data_without_nan = cleansed_data.iloc[start:end]
  • Produce your own cleansed data: Depending on your application, you might want to cleanse the data in a custom way. You can easily add your custom cleansing procedure in "clean_corrupted_data.py" and then produce cleansed data from the raw data in "Data_converted".
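As a sketch of such a custom cleansing step, one could additionally mark long constant stretches, a typical symptom of a frozen recorder, as NaN. This is an illustration only, not part of the repository; the minimum run length is an assumed parameter.

```python
import numpy as np
import pandas as pd

def remove_constant_windows(data, min_length=60):
    """Mark runs of >= min_length identical consecutive values as NaN (sketch)."""
    cleansed = data.copy()
    # Assign a run id that increments whenever the value changes...
    run_id = (data.diff() != 0).cumsum()
    # ...then compute the length of the run each sample belongs to.
    run_size = data.groupby(run_id).transform("size")
    cleansed[run_size >= min_length] = np.nan
    return cleansed
```

A function of this shape could be called from "clean_corrupted_data.py" after the existing processing steps.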

License
We release the code in the folder "Scripts" under the MIT license [8]. In the case of National Grid and Fingrid, we further release the pre-processed data in the folders "Data_converted" and "Data_cleansed" under the CC-BY 4.0 license [7]. TransnetBW originally did not publish their data under an open license. We have explicitly received permission from TransnetBW to publish the pre-processed version. However, since the original TransnetBW data carries no open license, we cannot publish our pre-processed version under an open license either.

Notes

We thank Mark Thiele for fruitful discussions. Furthermore, we gratefully acknowledge support from the German Federal Ministry of Education and Research (BMBF grant no. 03EK3055B) and the Helmholtz Association (via the "Helmholtz School for Data Science in Life, Earth and Energy" (HDS-LEE), the joint initiative "Energy System 2050 - A Contribution of the Research Field Energy", and grant no. VH-NG-1025). This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 840825.

Files

Data_cleansed.zip

Files (3.8 GB total):

  • md5:cfb08092728bdd0b24b1b71e590c4f3b (1.9 GB)
  • md5:fc1c8bb5df52a3ae386f97f7dd63d121 (1.9 GB)
  • md5:017618f73e0d5605028f853234dad8ff (5.9 kB)
  • md5:e25cf364f806a3f03d201f2ccfa1bb45 (8.6 kB)

Additional details

References