There is a newer version of the record available.

Published September 27, 2019 | Version v1
Dataset Open

Hierarchical Demand Forecasting Benchmark for the Distribution Grid

  • 1. Nespoli
  • 2. Medici
  • 3. Lopatichki
  • 4. Sossan


This dataset contains power measurements and meteorological forecasts relative to a set of 24 power meters located in Rolle (Switzerland). These datasets are published to provide a standard benchmark for evaluating forecasting algorithms for demand side management applications.

In L. Nespoli, V. Medici, K. Lopatichki, F. Sossan, Hierarchical Demand Forecasting Benchmark for the Distribution Grid, arXiv, 2019, this dataset is used to test several regressors at predicting the electrical load 24 hours ahead.

This dataset consists of measurements coming from 62 IEC 61000-4-30 Class A power quality meters manufactured by DEPsys (Switzerland) installed in secondary substations and LV cabinets of the distribution grid of the city of Rolle (Switzerland). The dataset has been enriched with numerical weather predictions from commercial provider Meteoblue (Switzerland), updated every 12 hours. 

The power measurements are provided as a pickle dataset, which includes:

For each phase:

  • mean active and reactive power
  • voltage magnitude
  • maximum total harmonic distortion (THD)
  • voltage frequency \(\omega\)
  • the average power over the three phases.

The last of these, the average power over the three phases, was used as the target variable in the aforementioned paper.
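For multi-step forecasting, a target series like this one is usually arranged into a matrix with one column per forecast step. The sketch below shows one way to do that, assuming a regularly sampled series. The helper `make_targets` and its `horizon` parameter are hypothetical names, not part of the dataset or the paper's code. The sampling period is not stated here, so the number of steps that covers 24 hours has to be set by the user.

```python
import numpy as np
import pandas as pd


def make_targets(series: pd.Series, horizon: int) -> pd.DataFrame:
    """Row t holds the `horizon` values that follow time t (t+1 ... t+horizon)."""
    cols = {f"t+{k}": series.shift(-k) for k in range(1, horizon + 1)}
    # Rows near the end lack a full horizon and are dropped.
    return pd.DataFrame(cols).dropna()


# Toy series standing in for one column of the average-power data.
toy = pd.Series(np.arange(10.0))
targets = make_targets(toy, horizon=3)
print(targets.head())
```

With the real data, `horizon` would be the number of samples in 24 hours at the dataset's sampling rate.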

The meteorological forecasts are provided as a Hierarchical Data Format 5 file, which includes:

  • temperature
  • global horizontal and normal irradiance (GHI and GNI, respectively)
  • relative humidity (RH)
  • pressure
  • wind speed and direction.

How to read the files with Python

The following code opens the files in Python:

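import pandas as pd
import pickle as pk

# Weather forecasts: HDF5 file, stored under the key 'df'
nwp_data = pd.read_hdf('nwp_data.h5', 'df')

# Power measurements: pickled dict of DataFrames
with open('power_data.p', 'rb') as f:
    power_data = pk.load(f)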

The nwp_data object is a pandas DataFrame of arrays. Each column, whose name is self-explanatory, holds the 24-hour forecast of one meteorological variable. These are the most recent forecasts available from the NWP service at the respective time index of the dataset.
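Because each cell holds an array rather than a scalar, it can be convenient to expand one column into a wide table with one column per forecast step. The following is a minimal sketch on toy data. The helper `expand_forecast`, the column name 'temperature', the timestamps and the array length of 24 are assumptions made for illustration, so check the real column names and array lengths in nwp_data first.

```python
import numpy as np
import pandas as pd


def expand_forecast(nwp: pd.DataFrame, column: str) -> pd.DataFrame:
    """Expand a column of equal-length arrays into one column per forecast step."""
    values = np.vstack(nwp[column].tolist())
    steps = [f"{column}_step{k}" for k in range(values.shape[1])]
    return pd.DataFrame(values, index=nwp.index, columns=steps)


# Toy stand-in for nwp_data: three time stamps, each with a 24-value array.
index = pd.to_datetime(["2018-01-01 00:00", "2018-01-01 12:00", "2018-01-02 00:00"])
cells = [np.arange(24.0) + i for i in range(3)]
toy_nwp = pd.DataFrame({"temperature": pd.Series(cells, index=index, dtype=object)})

wide = expand_forecast(toy_nwp, "temperature")
print(wide.shape)
```

The same call should work on the loaded nwp_data, provided all arrays in the chosen column have the same length.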

The power_data object is a dict of pandas DataFrames. Each value of the dict, whose key is self-explanatory, is a DataFrame whose columns are named after the meters they refer to. The DataFrame 'P_mean' additionally contains 6 fictitious aggregations of the phase-mean power of the meters ('S1', 'S2', 'S11', 'S12', 'S21', 'S22') and 'all', which represents the sum of all the meters. The hierarchical structure of the aggregations is the following:

        all
     ____|____
    |         |
    S1        S2
   _|_       _|_
  |   |     |   |
 S11 S12   S21 S22

S11 contains the first quarter of the time series presented in the dataset, while S12, S21 and S22 contain the second, third and fourth quarters, respectively.
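A quick sanity check on such a hierarchy is to verify that each parent series equals the sum of its children. The sketch below runs on toy data. It assumes that 'all' is the root above 'S1' and 'S2' and that every parent is the plain sum of its children. Both are readings of the description above, not something verified against the files. The names `HIERARCHY` and `check_hierarchy` are hypothetical.

```python
import numpy as np
import pandas as pd

# Parent -> children (assumed: 'all' is the root, parents are sums of children).
HIERARCHY = {"all": ["S1", "S2"], "S1": ["S11", "S12"], "S2": ["S21", "S22"]}


def check_hierarchy(p_mean: pd.DataFrame, hierarchy=HIERARCHY, atol=1e-6) -> dict:
    """Report, for each parent, whether it matches the sum of its children."""
    return {
        parent: bool(np.allclose(p_mean[parent], p_mean[children].sum(axis=1), atol=atol))
        for parent, children in hierarchy.items()
    }


# Toy stand-in for power_data['P_mean'], built bottom-up so the sums hold.
rng = np.random.default_rng(0)
toy = pd.DataFrame(rng.random((5, 4)), columns=["S11", "S12", "S21", "S22"])
toy["S1"] = toy["S11"] + toy["S12"]
toy["S2"] = toy["S21"] + toy["S22"]
toy["all"] = toy["S1"] + toy["S2"]

result = check_hierarchy(toy)
print(result)
```

On the real data the check would be `check_hierarchy(power_data['P_mean'])`. Missing values make `np.allclose` return False, so gaps may need handling first.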

In addition, the reference paper also considered the following vacation days:



This project is carried out within the frame of the Swiss Centre for Competence in Energy Research on the Future Swiss Electrical Infrastructure (SCCER-FURIES) with the financial support of the Swiss Innovation Agency (Innosuisse - SCCER program) and of the Swiss Federal Office of Energy with the project SI/501523.



Files (252.0 MB)


Additional details


  • L. Nespoli, V. Medici, K. Lopatichki, F. Sossan, Hierarchical Demand Forecasting Benchmark for the Distribution Grid, arXiv, 2019