Machine learning for Gravity Spy: Glitch classification and dataset
Creators
- 1. Electrical Engineering and Computer Science, Northwestern University, Evanston, IL, USA
- 2. Department of Computer Science, University of Illinois at Chicago, IL, USA
- 3. Center for Interdisciplinary Exploration and Research in Astrophysics (CIERA), Northwestern University, Evanston, IL, USA
- 4. Department of Physics, California State University Fullerton, Fullerton, CA, USA
Description
We present the first version of the training set used in the Gravity Spy citizen science project. This training set, discussed in detail here, was used to train the convolutional neural network employed in the project. We anticipate releasing more labelled Gravity Spy data sets in the future, including a refined version of this training set (available at 10.5281/zenodo.1476551) and data sets containing the annotations provided by our citizen science volunteers.
Data Set Information
There are three files provided in this data set:
- trainingset_v1d0_metadata.csv
- This file has three columns: gravityspy_id, label, and sample_type. gravityspy_id is the unique 10-character hash given to every Gravity Spy sample. label is the string label of the sample. sample_type indicates whether the sample was used in the paper for testing, training, or validating the models. This is provided for those who would like to make direct comparisons to the network described in the paper.
- trainingsetv1d0.h5
- This file contains the exact arrays used in the paper for every Gravity Spy sample. Each Gravity Spy sample is defined by four images of different temporal duration: 0.5, 1.0, 2.0, and 4.0 seconds. These durations also determine the naming convention of the PNGs: interferometer_gravityspyid_spectrogram_duration.png (e.g. H1_Fv3p6eROvA_spectrogram_0.5.png, H1_Fv3p6eROvA_spectrogram_1.0.png, H1_Fv3p6eROvA_spectrogram_2.0.png, H1_Fv3p6eROvA_spectrogram_4.0.png).
- This file contains all the information needed for each sample in the Gravity Spy dataset (i.e. the label, the sample type, the unique ID, and the image data for that sample).
- /1080Lines/validation/xUEyaWr34c Group
  - /1080Lines/validation/xUEyaWr34c/0.5.png Dataset {1, 140, 170}
  - /1080Lines/validation/xUEyaWr34c/1.0.png Dataset {1, 140, 170}
  - /1080Lines/validation/xUEyaWr34c/2.0.png Dataset {1, 140, 170}
  - /1080Lines/validation/xUEyaWr34c/4.0.png Dataset {1, 140, 170}
- trainingsetv1d0.tar.gz
- Contains the raw PNGs of the Gravity Spy training set.
- The structure of the folder is /"label"/"sample_type"/"pngs"
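The group listing shown for trainingsetv1d0.h5 suggests a /label/sample_type/gravityspy_id/duration.png hierarchy. The following is a minimal sketch of walking that hierarchy with h5py, assuming every sample follows the layout of the single example listed. The helper names (iter_samples, read_images, count_by_split) are hypothetical and are not part of any Gravity Spy package.

```python
from collections import Counter

# The four spectrogram durations (in seconds) described for each sample.
DURATIONS = ("0.5", "1.0", "2.0", "4.0")


def iter_samples(root):
    """Yield (label, sample_type, gravityspy_id, sample_group) tuples.

    `root` can be an open h5py.File or any nested mapping that follows the
    assumed /label/sample_type/gravityspy_id layout.
    """
    for label, by_type in root.items():
        for sample_type, by_id in by_type.items():
            for gravityspy_id, sample in by_id.items():
                yield label, sample_type, gravityspy_id, sample


def read_images(sample):
    """Return {duration: array} for the four spectrograms of one sample."""
    # Indexing an h5py dataset with () reads it fully into a NumPy array.
    return {d: sample[f"{d}.png"][()] for d in DURATIONS}


def count_by_split(path):
    """Count samples per (label, sample_type) in the HDF5 file at `path`."""
    import h5py  # imported lazily so the helpers above work on plain mappings

    with h5py.File(path, "r") as f:
        return Counter((label, st) for label, st, _, _ in iter_samples(f))
```

For example, count_by_split("trainingsetv1d0.h5") would give the number of samples in each label and sample_type combination, which can be compared against the sample_type column of the metadata CSV.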
Data Set Parsing Information
To read the provided PNGs and crop out the plot axes and labels, the following short Python snippet using scikit-image should work.
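    from skimage import io

    image_data = io.imread("filename_of_image")
    # Pixel bounds of the plot area: rows (vertical), then columns (horizontal).
    rows = [66, 532]; cols = [105, 671]
    # Crop to the plot area and keep only the first three (RGB) channels.
    image_data = image_data[rows[0]:rows[1], cols[0]:cols[1], :3]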
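The same crop can be applied to all four durations of one sample. The sketch below combines the crop bounds from the snippet above with the PNG naming convention described earlier. It assumes the four PNGs of a sample sit in one directory and are colour images. The function names (spectrogram_filenames, crop_plot_area, load_sample) are hypothetical.

```python
import os

# The four spectrogram durations (in seconds) described for each sample.
DURATIONS = ("0.5", "1.0", "2.0", "4.0")
ROWS = (66, 532)   # vertical pixel range of the plot area
COLS = (105, 671)  # horizontal pixel range of the plot area


def spectrogram_filenames(interferometer, gravityspy_id):
    """Build the four PNG file names for one sample, one per duration."""
    return [
        f"{interferometer}_{gravityspy_id}_spectrogram_{d}.png"
        for d in DURATIONS
    ]


def crop_plot_area(image):
    """Remove axes and labels, keeping the first three colour channels."""
    return image[ROWS[0]:ROWS[1], COLS[0]:COLS[1], :3]


def load_sample(directory, interferometer, gravityspy_id):
    """Read and crop the four PNGs of one sample found in `directory`."""
    from skimage import io  # imported lazily; the helpers above do not need it

    names = spectrogram_filenames(interferometer, gravityspy_id)
    return {
        duration: crop_plot_area(io.imread(os.path.join(directory, name)))
        for duration, name in zip(DURATIONS, names)
    }
```

With these bounds each cropped image is 466 pixels high and 566 pixels wide, provided the source PNG is at least 532 by 671 pixels.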
Files
Total size: 12.9 GB
Checksum | Size |
---|---|
md5:6b99210289d508696fb3feccfa7c79fc | 248.9 kB |
md5:7055ef3e0b700c117f8fad7489d6da6b | 3.3 GB |
md5:efb0d0d3f3750375d00d0d2e99c9e352 | 9.6 GB |
Additional details
Funding
- INSPIRE: Teaming Citizen Science with Machine Learning to Deepen LIGO's View of the Cosmos 1547880
- National Science Foundation