Short-term solar data stream of 23-24 cycle
Authors/Creators
- 1. School of Technology, University of Campinas, Brazil
Description
These datasets contain daily records of solar activity, together with the magnetic classes observed in sunspots.
The datasets were assembled with data from ftp://ftp.swpc.noaa.gov/pub/warehouse/.
The data were assembled on 2017-01-15 (yyyy-mm-dd).
The original data source is the Space Weather Prediction Center (SWPC), which is part of the National Oceanic and Atmospheric Administration (NOAA) of the US Department of Commerce.
The data cover the period from January 1, 1997 to January 15, 2017.
Features included:
- radio_flux_10.7cm: the solar radio flux at 10.7 cm (2800 MHz) is an indicator of solar activity. It is also called the F10.7 index and is one of the longest running records of solar activity. Radio emissions originate high in the chromosphere and low in the corona of the solar atmosphere.
- sesc_sunspot_number: it refers to the sunspot number computed for a given day. Also called the Wolf number, it is given by R = k(10g + s), where k is a scaling factor that accounts for the combined effects of observation conditions, g is the number of sunspot groups (active regions), and s is the total number of individual sunspots in all these groups.
- sunspot_area: it refers to the sum of the corrected area of all observed sunspots. It is measured in units of millionths of the solar hemisphere.
- goes15_xray_bkgd_flux: it corresponds to the daily average background X-ray flux measured by the SWPC primary GOES satellite. To calculate this value, sensors register 24 X-ray measurements for a given day, one for each hour. The SWPC splits the day into 3 periods of 8 hours and takes the lowest flux value in each period, giving 3 minimum values. The minimum values of the first and third periods are averaged, and this average is compared with the minimum value of the second period. The smaller of the two is the X-ray background flux.
- mwl_alpha: binary attribute indicating the existence of alpha magnetic class in any observed spot.
- mwl_beta: binary attribute indicating the existence of beta magnetic class in any observed spot.
- mwl_gamma: binary attribute indicating the existence of gamma magnetic class in any observed spot.
- mwl_beta_gamma: binary attribute indicating the existence of beta-gamma magnetic class in any observed spot.
- mwl_beta_delta: binary attribute indicating the existence of beta-delta magnetic class in any observed spot.
- mwl_beta_gamma_delta: binary attribute indicating the existence of beta-gamma-delta magnetic class in any observed spot.
- mwl_gamma_delta: binary attribute indicating the existence of gamma-delta magnetic class in any observed spot.
- mwl_delta: binary attribute indicating the existence of delta magnetic class in any observed spot.
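The two computed features above, the Wolf number and the X-ray background flux, can be illustrated with a minimal sketch. It only follows the steps as described in this record and is not SWPC code; the function names, the default k = 1.0, and the assumption that the 24 hourly values are ordered from the first to the last hour of the day are illustrative choices.

```python
from statistics import mean
from typing import Sequence


def wolf_number(groups: int, spots: int, k: float = 1.0) -> float:
    """Relative sunspot number R = k(10g + s)."""
    return k * (10 * groups + spots)


def xray_background(hourly_flux: Sequence[float]) -> float:
    """Daily background flux from 24 hourly values (assumed ordered by hour)."""
    if len(hourly_flux) != 24:
        raise ValueError("expected 24 hourly values")
    # Lowest flux in each of the three 8-hour periods.
    m1, m2, m3 = (min(hourly_flux[i:i + 8]) for i in (0, 8, 16))
    # Average the first and third minima, then keep the smaller of
    # that average and the minimum of the middle period.
    return min(mean([m1, m3]), m2)
```

For example, 3 groups containing 12 spots in total give R = 42 when k = 1.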
We imputed missing data using k-NN over all features, with Gower's distance as the distance measure. We also applied z-score standardization to all features.
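As a rough illustration of this preprocessing, the sketch below imputes missing values from the k nearest rows under Gower's distance and then standardizes each column. It is not the authors' implementation: the value of k, the majority vote for binary columns, the order of the two steps, and all function names are assumptions. The pairwise distance array needs memory proportional to rows x rows x features, so this form only suits small samples.

```python
import numpy as np


def gower_distances(X: np.ndarray) -> np.ndarray:
    """Pairwise Gower distance, averaged over features observed in both rows."""
    ranges = np.nanmax(X, axis=0) - np.nanmin(X, axis=0)
    ranges[ranges == 0] = 1.0  # constant columns contribute zero distance
    # Range-normalized absolute differences; NaN where either value is missing.
    # For 0/1 columns this equals the simple mismatch used by Gower.
    diff = np.abs(X[:, None, :] - X[None, :, :]) / ranges
    counts = (~np.isnan(diff)).sum(axis=2)
    total = np.nansum(diff, axis=2)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(counts > 0, total / counts, np.inf)


def knn_impute(X, k: int = 3, binary_cols=()) -> np.ndarray:
    """Fill each missing cell from the k nearest rows that observe that column."""
    X = np.asarray(X, dtype=float)
    D = gower_distances(X)
    np.fill_diagonal(D, np.inf)  # a row is never its own neighbour
    filled = X.copy()
    for i, j in zip(*np.where(np.isnan(X))):
        donors = np.where(~np.isnan(X[:, j]))[0]
        if donors.size == 0:
            continue  # nothing to borrow from; leave the cell missing
        nearest = donors[np.argsort(D[i, donors])[:k]]
        value = X[nearest, j].mean()
        # Majority vote for binary columns, plain mean otherwise.
        filled[i, j] = float(value >= 0.5) if j in binary_cols else value
    return filled


def zscore(X: np.ndarray) -> np.ndarray:
    """Standardize each column to zero mean and unit standard deviation."""
    std = X.std(axis=0)
    std[std == 0] = 1.0
    return (X - X.mean(axis=0)) / std
```

A typical call would be `zscore(knn_impute(raw, k=3, binary_cols={4, 5}))`, where `raw` is a hypothetical array of daily records with NaN marking missing values.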
We arranged the data as a sliding-time-window stream: each record describes the t1 instant together with the four preceding days (t2, t3, t4, and t5, with t5 the earliest). New features were therefore created to capture the evolution of the data over five days:
- radio_flux_10.7cm_[t5, t4, t3, t2, t1];
- sesc_sunspot_number_[t5, t4, t3, t2, t1];
- sunspot_area_[t5, t4, t3, t2, t1];
- goes15_xray_bkgd_flux_[t5, t4, t3, t2, t1];
- mwl_alpha_[t5, t4, t3, t2, t1];
- mwl_beta_[t5, t4, t3, t2, t1];
- mwl_gamma_[t5, t4, t3, t2, t1];
- mwl_beta_gamma_[t5, t4, t3, t2, t1];
- mwl_beta_delta_[t5, t4, t3, t2, t1];
- mwl_beta_gamma_delta_[t5, t4, t3, t2, t1];
- mwl_gamma_delta_[t5, t4, t3, t2, t1];
- mwl_delta_[t5, t4, t3, t2, t1].
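The windowed features listed above can be built from a table of daily records by lagging each column. The following is a minimal sketch with pandas, assuming one row per day in chronological order, that t1 is the current day and t5 is four days earlier, and that the new columns are named with a `_t5` ... `_t1` suffix. The function name and the exact column naming are illustrative and may differ from the published files.

```python
import pandas as pd


def add_window_features(daily: pd.DataFrame, n_days: int = 5) -> pd.DataFrame:
    """Create <col>_t5 ... <col>_t1 for every column of a daily table."""
    lagged = {}
    for col in daily.columns:
        # lag 4 -> t5 (earliest day), lag 0 -> t1 (current day)
        for lag in range(n_days - 1, -1, -1):
            lagged[f"{col}_t{lag + 1}"] = daily[col].shift(lag)
    # The first n_days - 1 rows have no complete history, so drop them.
    return pd.DataFrame(lagged, index=daily.index).dropna()
```

With six days of input and the default five-day window, two complete records remain.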
The target variables indicate the occurrence of at least one M- or X-class flare within 24 hours, 24-48 hours, and 48-72 hours after the t1 instant:
- flare_t1d: occurrence of at least one flare of class M or X in the 24 hours following the t1 instant;
- flare_t2d: occurrence of at least one flare of class M or X 24-48 hours ahead of the t1 instant;
- flare_t3d: occurrence of at least one flare of class M or X 48-72 hours ahead of the t1 instant.
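One way to derive these three targets is to shift a daily flare indicator backwards in time. The sketch below assumes a hypothetical input series `flare_day`, with one value per day in chronological order, equal to 1 when at least one M- or X-class flare occurred on that day. It also assumes that each 24-hour horizon coincides with one row of that series, which this record does not state.

```python
import pandas as pd


def add_flare_targets(flare_day: pd.Series) -> pd.DataFrame:
    """Build flare_t1d, flare_t2d, flare_t3d from a daily 0/1 flare indicator."""
    targets = pd.DataFrame(index=flare_day.index)
    for horizon in (1, 2, 3):
        # The value from `horizon` days later becomes the label of the current day.
        targets[f"flare_t{horizon}d"] = flare_day.shift(-horizon)
    # The last three days have no complete look-ahead, so drop them.
    return targets.dropna().astype(int)
```

Each row of the result lines up with the windowed features of the same day, so the two tables can be joined on their shared index.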