Dataset Open Access

A comprehensive dataset for the accelerated development and benchmarking of solar forecasting methods

Carreira Pedro, Hugo; Larson, David; Coimbra, Carlos

Description
This repository contains a comprehensive solar irradiance, imaging, and forecasting dataset. 
The goal with this release is to provide standardized solar and meteorological datasets to the research community for the accelerated development and benchmarking of forecasting methods. 
The data consist of three years (2014–2016) of quality-controlled, 1-min resolution global horizontal irradiance and direct normal irradiance ground measurements in California. 
In addition, we provide overlapping data from commonly used exogenous variables, including sky images, satellite imagery, Numerical Weather Prediction forecasts, and weather data. 
We also include sample code for baseline models, against which more elaborate models can be benchmarked.

Data usage
The datasets and sample code presented here are intended for research and development purposes only, and any use requires explicit reference to the paper:
Pedro, H.T.C., Larson, D.P., Coimbra, C.F.M., 2019. A comprehensive dataset for the accelerated development and benchmarking of solar forecasting methods. Journal of Renewable and Sustainable Energy 11, 036102. https://doi.org/10.1063/1.5094494

Although every effort was made to ensure the quality of the data, no guarantees or liabilities are implied by the authors or publishers of the data.

Sample code
As part of the data release, we also include sample code written in Python 3. 
The preprocessed data used in the scripts are also provided. 
The code can be used to reproduce the results presented in this work and as a starting point for future studies. 
Besides the standard scientific Python packages (numpy, scipy, and matplotlib), the code depends on pandas for time-series operations, pvlib for common solar-related tasks, and scikit-learn for machine learning models. 
All required Python packages are readily available on Mac, Linux, and Windows and can be installed via, e.g., pip. 
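
One quick way to confirm that the packages named above are installed is to check whether each can be found on the import path. The sketch below is illustrative only and is not part of the released scripts; the helper name missing_packages is hypothetical. Note that scikit-learn is imported under the name sklearn.

```python
import importlib.util

# Import names of the packages listed above (scikit-learn imports as "sklearn").
REQUIRED = ["numpy", "scipy", "matplotlib", "pandas", "pvlib", "sklearn"]


def missing_packages(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```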

Units
All time stamps are in UTC (YYYY-MM-DD HH:MM:SS).
All irradiance and weather data are in SI units.
Sky image features are derived from 8-bit RGB (256 color levels) data.
Satellite images are derived from 8-bit gray-scale (256 color levels) data.

Missing data
The string "NAN" indicates missing data.

File formats
All time series data files are in CSV (comma-separated values) format.
Images are provided as tar.bz2 archives.
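
The conventions stated above (UTC time stamps in YYYY-MM-DD HH:MM:SS form, "NAN" for missing values, comma-separated files) can be applied when reading a file as in the minimal sketch below. It uses only the Python standard library. It assumes that the first column holds the time stamp and that all remaining columns are numeric; the column names in the sample are placeholders, not the actual headers of the released files.

```python
import csv
import io
import math
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # stated time stamp format (UTC)
MISSING = "NAN"  # stated missing-data marker


def parse_value(text):
    """Convert one CSV field to float, mapping the "NAN" marker to NaN."""
    text = text.strip()
    return math.nan if text == MISSING else float(text)


def read_timeseries(handle):
    """Yield (UTC-aware datetime, {column: float}) for each data row.

    Assumption: the first column is the time stamp, the rest are numeric.
    """
    reader = csv.reader(handle)
    header = next(reader)
    for row in reader:
        stamp = datetime.strptime(row[0].strip(), TIMESTAMP_FORMAT)
        stamp = stamp.replace(tzinfo=timezone.utc)
        values = {name: parse_value(field) for name, field in zip(header[1:], row[1:])}
        yield stamp, values


# Made-up sample rows; the column names are placeholders.
sample = io.StringIO(
    "timestamp,ghi,dni\n"
    "2014-01-02 16:00:00,10.5,NAN\n"
    "2014-01-02 16:01:00,11.0,3.2\n"
)
rows = list(read_timeseries(sample))
```

When using pandas instead, passing na_values=["NAN"] to pandas.read_csv has the same effect on the missing-data marker.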

Files 

  • Folsom_irradiance.csv              Primary    One-minute GHI, DNI, and DHI data.
  • Folsom_weather.csv                 Primary    One-minute weather data.
  • Folsom_sky_images_{YEAR}.tar.bz2   Primary    Tar archives with daytime sky images captured at 1-min intervals for the years 2014, 2015, and 2016, compressed with bz2.
  • Folsom_NAM_lat{LAT}_lon{LON}.csv   Primary    NAM forecasts for the four nodes nearest the target location. {LAT} and {LON} are replaced by the node's coordinates listed in Table I in the paper.
  • Folsom_sky_image_features.csv      Secondary  Features derived from the sky images.
  • Folsom_satellite.csv               Secondary  10 pixel by 10 pixel GOES-15 images centered on the target location.
  • Irradiance_features_{horizon}.csv  Secondary  Irradiance features for the different forecasting horizons ({horizon} = {intra-hour, intra-day, day-ahead}).
  • Sky_image_features_intra-hour.csv  Secondary  Sky image features for the intra-hour forecasting issuing times.
  • Sat_image_features_intra-day.csv   Secondary  Satellite image features for the intra-day forecasting issuing times.
  • NAM_nearest_node_day-ahead.csv     Secondary  NAM forecasts (GHI, DNI computed with the DISC algorithm, and total cloud cover) for the nearest node to the target location prepared for day-ahead forecasting.
  • Target_{horizon}.csv               Secondary  Target data for the different forecasting horizons.
  • Forecast_{horizon}.py              Code       Python script used to create the forecasts for the different horizons.
  • Postprocess.py                     Code       Python script used to compute the error metric for all the forecasts.
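
Because each sky image archive is over 10 GB, it can be convenient to inspect its contents without extracting everything. The sketch below reads a tar.bz2 archive as a stream with the standard-library tarfile module and stops after the first few regular files. The function name first_members is hypothetical, and nothing is assumed here about the directory layout or image file names inside the archives.

```python
import itertools
import tarfile


def first_members(archive_path, limit=5):
    """Return (name, size in bytes) for the first `limit` regular files.

    The archive is opened in streaming mode ("r|bz2"), so it is read
    sequentially and reading stops once `limit` files have been seen.
    """
    with tarfile.open(archive_path, mode="r|bz2") as archive:
        files = (member for member in archive if member.isfile())
        return [(m.name, m.size) for m in itertools.islice(files, limit)]
```

For example, first_members("Folsom_sky_images_2014.tar.bz2") would list the first five files in the 2014 archive, assuming it has been downloaded to the working directory.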

 

Files (49.8 GB)
Name                                        Size      MD5
Folsom_irradiance.csv                       76.5 MB   f7deba7ccd089dbd3f52a46405a7dfc2
Folsom_NAM_lat38.579454_lon-121.260320.csv  1.6 MB    3d917eeecdf967d1f90f803fad5e5467
Folsom_NAM_lat38.599891_lon-121.126680.csv  1.6 MB    30024faae0123990cf29c81c281eaccc
Folsom_NAM_lat38.683880_lon-121.286556.csv  1.6 MB    c0d6db7093b957603cb05c90fff23167
Folsom_NAM_lat38.704328_lon-121.152788.csv  1.6 MB    792f830c261e2c041d35ebeb6eadbeac
Folsom_satellite.csv                        15.7 MB   f68086048ee5d764d1d992404147c421
Folsom_sky_image_features.csv               104.7 MB  86d58b6b84393399735a93ce1657cfab
Folsom_sky_images_2014.tar.bz2              13.8 GB   fb2dee79429725ac91df539b310a9f98
Folsom_sky_images_2015.tar.bz2              16.9 GB   bce043f846a4dd01668a32943578b652
Folsom_sky_images_2016.tar.bz2              18.6 GB   af72cd28b398fb531ae1ab877c19eba0
Folsom_weather.csv                          138.8 MB  b04e0dc7edf3513a769ea2c8c59beb27
Forecast_day-ahead.py                       5.1 kB    763f1666ff1485d631b7417cc8c4a5e8
Forecast_intra-day.py                       5.1 kB    6030752b33ce675859d131833a5e127d
Forecast_intra-hour.py                      5.1 kB    7dd387b298e4c75f84a5fe7093bde2dd
Irradiance_features_day-ahead.csv           725.6 kB  889efab48e0c0c690c45b11e641ba388
Irradiance_features_intra-day.csv           8.3 MB    971eee5f86677536b6238e73d923cedc
Irradiance_features_intra-hour.csv          49.6 MB   9e25e78b816e51b95d4349f304155f56
NAM_nearest_node_day-ahead.csv              519.3 kB  978905d0c0d1b1488325b33456446d23
Postprocess.py                              4.8 kB    73601ae78e2e49942673688650abfa3d
Sat_image_features_intra-day.csv            20.8 MB   8af401d02a090108b1863cb953ef64cf
Sky_image_features_intra-hour.csv           23.6 MB   a81c753c308213e2b506b94e0412403a
Target_day-ahead.csv                        1.2 MB    ed4959b21d282177cedcefe2e8e27f83
Target_intra-day.csv                        10.7 MB   9d530ea7cbe0f122bc26041e9da74afd
Target_intra-hour.csv                       64.5 MB   ac6ebc385b6f6112c68ea967fc437c69
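
The MD5 checksums listed above can be used to confirm that a download completed without corruption, which is worth doing for the multi-gigabyte image archives in particular. The following is a minimal sketch using the standard-library hashlib module; the function names md5_of_file and matches_checksum are hypothetical. Files are read in chunks so that large archives do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with an expected value, ignoring case."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, matches_checksum("Folsom_irradiance.csv", "f7deba7ccd089dbd3f52a46405a7dfc2") should return True for an intact copy of that file.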
