There is a newer version of the record available.

Published March 30, 2023 | Version 1.0.0
Dataset Open

Radio Galaxy Zoo: #Tagging Radio Subjects using Text

  • 1. Australian National University
  • 2. University of Western Australia
  • 3. Google Australia
  • 4. CSIRO Space & Astronomy
  • 5. Data61, CSIRO

Description

RadioTalk is a platform that enabled citizen scientists of the Radio Galaxy Zoo (RGZ) project to provide additional descriptions of the radio subjects they were observing. This dataset contains a wealth of auxiliary information in the form of tags and comments which are especially valuable for extended radio sources. Our work is the first to explore this dataset, and for the first time, we combine text and image features to automatically classify radio galaxies using a machine learning approach. Text annotations are rare but valuable sources of information for the classification of astronomical sources.

Notes

data_train.zip: compressed CSV file for training data. data_val.zip: compressed CSV file for validation data. data_test.zip: compressed CSV file for test data. Each row in a CSV file corresponds to a radio subject. Columns correspond to features and tags: radio image features (columns radio001 to radio768), infrared image features (columns ir001 to ir768), RadioTalk discussion text features (columns text001 to text768), and boolean tag indicators (the last 11 columns).
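
The column layout described in these notes can be illustrated with a short Python sketch that groups the header of one of the CSV files into its four parts. This is only a sketch under stated assumptions: the helper names split_columns and read_header are hypothetical, the sketch assumes each archive holds a single .csv member encoded as UTF-8, and it relies only on the column prefixes and the "last 11 columns" rule given above (the actual tag column names are not listed in this record).

```python
import csv
import io
import zipfile

N_TAGS = 11  # per the notes: the last 11 columns are boolean tag indicators


def split_columns(header):
    """Group column names into radio, infrared, text, and tag columns."""
    feature_cols = header[:-N_TAGS]  # everything before the tag columns
    return {
        "radio": [c for c in feature_cols if c.startswith("radio")],
        "ir": [c for c in feature_cols if c.startswith("ir")],
        "text": [c for c in feature_cols if c.startswith("text")],
        "tags": header[-N_TAGS:],
    }


def read_header(zip_path):
    """Return the header row of the first .csv member in a zip archive.

    Assumption: the archive contains one CSV file with a header row.
    """
    with zipfile.ZipFile(zip_path) as zf:
        csv_name = next(n for n in zf.namelist() if n.endswith(".csv"))
        with zf.open(csv_name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return next(csv.reader(text))
```

With a downloaded archive, split_columns(read_header("data_train.zip")) would be expected to return 768 names in each of the "radio", "ir", and "text" groups and 11 names under "tags", provided the files follow the layout described above.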

Files

Files (114.0 MB)

Checksum                                 Size
md5:8e1a9dad7e8cd5429d4f63d4683f51b8     17.2 MB
md5:87fb17a616906c769e7376127ae3b6e5     82.3 MB
md5:36d4068ae41273d56bac721fe6f5b1d6     14.6 MB
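
The md5 checksums listed for the files can be used to confirm that a download is intact. The following Python sketch shows one way to do this with the standard library; the helper names md5_of_file and matches_checksum are hypothetical, and since this listing does not say which checksum belongs to which archive, the caller has to supply the right pairing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, matches_checksum(path, "md5:8e1a9dad7e8cd5429d4f63d4683f51b8") returns True only if the file at path (a placeholder for whichever archive that checksum belongs to) hashes to the listed value.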