Dataset Open Access

A comprehensive dataset for the accelerated development and benchmarking of solar forecasting methods

Carreira Pedro, Hugo; Larson, David; Coimbra, Carlos


DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="DOI">10.5281/zenodo.2826939</identifier>
  <creators>
    <creator>
      <creatorName>Carreira Pedro, Hugo</creatorName>
      <givenName>Hugo</givenName>
      <familyName>Carreira Pedro</familyName>
      <affiliation>University of California San Diego</affiliation>
    </creator>
    <creator>
      <creatorName>Larson, David</creatorName>
      <givenName>David</givenName>
      <familyName>Larson</familyName>
      <affiliation>University of California San Diego</affiliation>
    </creator>
    <creator>
      <creatorName>Coimbra, Carlos</creatorName>
      <givenName>Carlos</givenName>
      <familyName>Coimbra</familyName>
      <affiliation>University of California San Diego</affiliation>
    </creator>
  </creators>
  <titles>
    <title>A comprehensive dataset for the accelerated development and benchmarking of solar forecasting methods</title>
  </titles>
  <publisher>Zenodo</publisher>
  <publicationYear>2019</publicationYear>
  <subjects>
    <subject>solar irradiance forecasting</subject>
    <subject>sky images</subject>
    <subject>satellite images</subject>
    <subject>numerical weather prediction</subject>
    <subject>forecast benchmarking</subject>
  </subjects>
  <dates>
    <date dateType="Issued">2019-06-24</date>
  </dates>
  <resourceType resourceTypeGeneral="Dataset"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://zenodo.org/record/2826939</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsCompiledBy">10.1063/1.5094494</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.2826938</relatedIdentifier>
  </relatedIdentifiers>
  <version>V1</version>
  <rightsList>
    <rights rightsURI="https://creativecommons.org/licenses/by/4.0/legalcode">Creative Commons Attribution 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">&lt;p&gt;&lt;strong&gt;Description&lt;/strong&gt;&lt;br&gt;
This repository contains a comprehensive solar irradiance, imaging, and forecasting dataset.&amp;nbsp;&lt;br&gt;
The goal with this release is to provide standardized solar and meteorological datasets to the research community for the accelerated development and benchmarking of forecasting methods.&amp;nbsp;&lt;br&gt;
The data consist of three years (2014&amp;ndash;2016) of quality-controlled, 1-min resolution global horizontal irradiance and direct normal irradiance ground measurements in California.&amp;nbsp;&lt;br&gt;
In addition, we provide overlapping data from commonly used exogenous variables, including sky images, satellite imagery, Numerical Weather Prediction forecasts, and weather data.&amp;nbsp;&lt;br&gt;
We also include sample code for baseline models against which more elaborate models can be benchmarked.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data usage&lt;/strong&gt;&lt;br&gt;
The usage of the datasets and sample codes presented here is intended for research and development purposes only and implies explicit reference to the paper:&lt;br&gt;
&lt;em&gt;Pedro, H.T.C., Larson, D.P., Coimbra, C.F.M., 2019. A comprehensive dataset for the accelerated development and benchmarking of solar forecasting methods.&amp;nbsp;Journal of Renewable and Sustainable Energy 11, 036102. https://doi.org/10.1063/1.5094494&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Although every effort was made to ensure the quality of the data, no guarantees or liabilities are implied by the authors or publishers of the data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sample code&lt;/strong&gt;&lt;br&gt;
As part of the data release, we also include sample code written in Python 3.&amp;nbsp;&lt;br&gt;
The preprocessed data used in the scripts are also provided.&amp;nbsp;&lt;br&gt;
The code can be used to reproduce the results presented in this work and as a starting point for future studies.&amp;nbsp;&lt;br&gt;
Besides the standard scientific Python packages (numpy, scipy, and matplotlib), the code depends on pandas for time-series operations, pvlib for common solar-related tasks, and scikit-learn for Machine Learning models.&amp;nbsp;&lt;br&gt;
All required Python packages are readily available on Mac, Linux, and Windows and can be installed via, e.g., pip.&amp;nbsp;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Units&lt;/strong&gt;&lt;br&gt;
All time stamps are in UTC (YYYY-MM-DD HH:MM:SS).&lt;br&gt;
All irradiance and weather data are in SI units.&lt;br&gt;
Sky image features are derived from 8-bit RGB (256 color levels) data.&lt;br&gt;
Satellite images are derived from 8-bit gray-scale (256 color levels) data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Missing data&lt;/strong&gt;&lt;br&gt;
The string &amp;quot;NAN&amp;quot; indicates missing data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;File formats&lt;/strong&gt;&lt;br&gt;
All time-series data files are in CSV (comma-separated values) format.&lt;br&gt;
Images are provided as tar.bz2 archives.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Files&amp;nbsp;&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;&lt;em&gt;Folsom_irradiance.csv&lt;/em&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Primary&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;One-minute GHI, DNI, and DHI data.&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;Folsom_weather.csv&amp;nbsp;&lt;/em&gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Primary&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;One-minute weather data.&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;Folsom_sky_images_{YEAR}.tar.bz2&lt;/em&gt; &amp;nbsp; &amp;nbsp;Primary&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Tar archives with daytime sky images captured at 1-min intervals for the years 2014, 2015, and 2016, compressed with bz2.&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;Folsom_NAM_lat{LAT}_lon{LON}.csv &lt;/em&gt;&amp;nbsp; &amp;nbsp;Primary&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;NAM forecasts for the four nodes nearest the target location. {LAT} and {LON} are replaced by the node&amp;rsquo;s coordinates listed in Table I in the paper.&amp;nbsp;&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;Folsom_sky_image_features.csv &lt;/em&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Secondary&amp;nbsp; &amp;nbsp; Features derived from the sky images.&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;Folsom_satellite.csv &lt;/em&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Secondary &amp;nbsp; 10 pixel by 10 pixel GOES-15 images centered on the target location.&amp;nbsp;&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;Irradiance_features_{horizon}.csv&lt;/em&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Secondary &amp;nbsp; Irradiance features for the different forecasting horizons ({horizon} &amp;isin; {intra-hour, intra-day, day-ahead}).&amp;nbsp;&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;Sky_image_features_intra-hour.csv&lt;/em&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Secondary &amp;nbsp; Sky image features for the intra-hour forecasting issuing times.&amp;nbsp;&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;Sat_image_features_intra-day.csv&lt;/em&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Secondary &amp;nbsp; Satellite image features for the intra-day forecasting issuing times.&amp;nbsp;&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;NAM_nearest_node_day-ahead.csv &lt;/em&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;Secondary &amp;nbsp; NAM forecasts (GHI, DNI computed with the DISC algorithm, and total cloud cover) for the nearest node to the target location prepared for day-ahead forecasting.&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;Target_{horizon}.csv&lt;/em&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Secondary &amp;nbsp; Target data for the different forecasting horizons.&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;Forecast_{horizon}.py &lt;/em&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Code&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Python script used to create the forecasts for the different horizons.&amp;nbsp;&lt;/li&gt;
	&lt;li&gt;&lt;em&gt;Postprocess.py&lt;/em&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Code&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Python script used to compute the error metric for all the forecasts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description>
    <description descriptionType="Other">{"references": ["Pedro, H.T.C., Larson, D.P., Coimbra, C.F.M., 2019. A comprehensive dataset for the accelerated development and benchmarking of solar forecasting methods.  Journal of Renewable and Sustainable Energy 11, 036102. https://doi.org/10.1063/1.5094494"]}</description>
  </descriptions>
</resource>
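
The conventions stated in the abstract above (CSV files, UTC time stamps in YYYY-MM-DD HH:MM:SS form, and the string "NAN" for missing data) can be illustrated with a minimal reading sketch. It uses only the Python standard library. The column names `timestamp`, `ghi`, and `dni`, the function name `read_time_series`, and the sample rows are hypothetical and chosen for illustration only; the actual headers in the released files may differ.

```python
import csv
import io
from datetime import datetime, timezone

MISSING = "NAN"                    # missing-data marker stated in the record
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # time-stamp layout stated in the record (UTC)


def read_time_series(handle, time_column="timestamp"):
    """Read CSV rows into a list of (UTC datetime, {column: float or None}).

    `time_column` is an assumed header name, not one confirmed by the record.
    """
    rows = []
    for record in csv.DictReader(handle):
        # Parse the time stamp and mark it explicitly as UTC.
        stamp = datetime.strptime(record.pop(time_column), TIME_FORMAT)
        stamp = stamp.replace(tzinfo=timezone.utc)
        # Map the "NAN" marker to None; convert everything else to float.
        values = {
            name: None if text.strip() == MISSING else float(text)
            for name, text in record.items()
        }
        rows.append((stamp, values))
    return rows


# Hypothetical two-row sample standing in for one of the CSV files.
sample = io.StringIO(
    "timestamp,ghi,dni\n"
    "2014-01-02 16:00:00,412.5,NAN\n"
    "2014-01-02 16:01:00,415.0,801.2\n"
)
rows = read_time_series(sample)
print(rows[0])
```

With pandas, which the record lists as a dependency of the sample code, a comparable call would be `pandas.read_csv(path, na_values=["NAN"], index_col=0, parse_dates=True)`, assuming the first column holds the time stamp.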