Published March 21, 2022 | Version v1
Dataset Open

IRIS Multiple Instance Learning Dataset

Authors/Creators

  • 1. Institute for Data Science, FHNW Windisch

Description

This dataset contains the data for the paper 'Using Multiple Instance Learning for Explainable Solar Flare Prediction' (arXiv pre-print). It is distributed as a compressed NumPy file (.npz) and contains the following variables:

Name         Shape                 Description
data         (10'000, 1100, 240)   10'000 bags of zero-padded spectrograms
data_scaled  (10'000, 1100, 240)   Like data, but standard-scaled
masks        (10'000, 1100)        Masks that indicate where spectrograms have been zero-padded
groups       (10'000,)             Observation group the bag is assigned to
obs_ids      (10'000,)             Observation ID the bag is assigned to
obs_classes  (10'000,)             Observation class (AR/PF) the bag is assigned to
raster_pos   (10'000,)             Raster position number the bag was taken from (always 0 for sit-and-stare)
folds        (10'000,)             Validation fold for the particular observation group
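
As a minimal sketch of how these variables fit together, the following uses small synthetic arrays with the same names and layout (not the real data) to strip the zero-padding from one bag and to split bags by validation fold. It assumes that a mask value of 1 marks a padded position and that folds holds integer fold numbers; neither is stated above, so check both against the actual file. The helper names unpad_bag and split_by_fold are hypothetical.

```python
import numpy as np

def unpad_bag(bag, mask, padded_value=1):
    """Return only the rows of `bag` that are not marked as padding.

    ASSUMPTION: mask == padded_value marks zero-padded rows. Pass
    padded_value=0 if the file uses the opposite convention.
    """
    keep = np.asarray(mask) != padded_value
    return bag[keep]

def split_by_fold(folds, val_fold):
    """Return (train_idx, val_idx) bag indices for one validation fold."""
    folds = np.asarray(folds)
    val_idx = np.flatnonzero(folds == val_fold)
    train_idx = np.flatnonzero(folds != val_fold)
    return train_idx, val_idx

# Synthetic stand-in: 4 bags, up to 5 spectra per bag, 3 values per spectrum
# (the real arrays are (10'000, 1100, 240) and (10'000, 1100)).
rng = np.random.default_rng(0)
data = rng.random((4, 5, 3))
masks = np.zeros((4, 5), dtype=int)
masks[0, 3:] = 1          # bag 0 has 3 real spectra and 2 padded rows
data[0, 3:] = 0.0         # padded rows are all zeros
folds = np.array([0, 1, 0, 2])  # ASSUMPTION: integer fold labels

bag0 = unpad_bag(data[0], masks[0])
train_idx, val_idx = split_by_fold(folds, val_fold=0)
print(bag0.shape, train_idx, val_idx)
```

Because folds is defined per observation group, selecting bags by fold in this way keeps all bags of one group on the same side of the split.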

To load a variable, e.g. 'data', use Python and NumPy:

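import numpy as np

# allow_pickle=True is only required if an array is stored with object dtype
f = np.load("IRISMIL_dataset_10000_bags.npz", allow_pickle=True)
data = f['data']  # the array is read from disk on first access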
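
Since the archive is large, it can help to look at what it contains and to read one variable at a time. The sketch below illustrates this on a tiny stand-in archive written to a temporary directory, so it runs without the download; the stand-in file name, shapes, and values are hypothetical, and only the variable names 'data' and 'folds' are taken from the table above.

```python
import os
import tempfile
import numpy as np

# Build a tiny stand-in archive (hypothetical file, not the real dataset).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "tiny_stand_in.npz")
np.savez_compressed(
    path,
    data=np.zeros((4, 5, 3), dtype=np.float32),
    folds=np.array([0, 1, 0, 2]),
)

# np.load on an .npz returns a lazy archive: each array is read from disk
# only when it is indexed, so variables can be pulled in one at a time.
with np.load(path, allow_pickle=True) as f:
    names = sorted(f.files)       # variable names stored in the archive
    folds = f["folds"]            # reads only this array
    data_shape = f["data"].shape  # reads 'data' to report its shape

print(names, folds.shape, data_shape)
```

With the real file, replacing path by "IRISMIL_dataset_10000_bags.npz" should list the eight variables from the table, assuming the archive matches the description.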

Files

Files (9.4 GB)

Size     Checksum
9.4 GB   md5:118f1d16760952436b3cd77228b972c8