Published December 29, 2020 | Version 0.0.2
Dataset Open

Galaxy Zoo DECaLS: Detailed Visual Morphology Measurements from Volunteers and Deep Learning for 314,000 Galaxies

  • 1. University of Oxford
  • 2. European Space Agency
  • 3. University of Portsmouth
  • 4. University of Minnesota
  • 5. University of Nottingham
  • 6. Princeton University
  • 7. University of Alabama
  • 8. Haverford College
  • 9. Lancaster University
  • 10. Zooniverse

Description

This repository contains the data released in the paper "Galaxy Zoo DECaLS: Detailed Visual Morphology Measurements from Volunteers and Deep Learning for 314,000 Galaxies" (DOI to follow on publication).

We release detailed morphology catalogues, both volunteer and automated, for Galaxy Zoo DECaLS.

- gz_decals_volunteers_1_and_2 contains volunteer classifications for galaxies classified during the GZD-1 and GZD-2 campaigns.

- gz_decals_volunteers_5 similarly contains classifications from the GZD-5 campaign. Note that GZD-5 used a modified schema designed to better detect mergers and weak bars, and includes many galaxies with only approximately five volunteer responses.

- gz_decals_auto_posteriors contains the predicted posteriors for volunteer responses to all galaxies used in any campaign. The full posteriors are recorded as Dirichlet distribution concentrations. gz_decals_auto_posteriors also summarises these posteriors as the automated equivalent of previous Galaxy Zoo data releases: the expected vote fractions (mean posteriors). Note that not all posteriors/vote fractions are relevant for every galaxy; we suggest assessing relevance using the estimated fraction of volunteers that would have been asked each question.
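As a minimal sketch of how the Dirichlet concentrations relate to the expected vote fractions: the mean of a Dirichlet distribution is each concentration divided by the sum of the concentrations for that question. The concentration values, the `fraction_asked` input, and the 0.5 relevance threshold below are illustrative assumptions, not values or column names from the catalogue; see schema.md for the real column names.

```python
import numpy as np


def expected_vote_fractions(concentrations):
    """Mean of a Dirichlet distribution: alpha_i / sum(alpha) per row."""
    alpha = np.asarray(concentrations, dtype=float)
    return alpha / alpha.sum(axis=-1, keepdims=True)


def relevant_mask(fraction_asked, threshold=0.5):
    """True where enough volunteers would have been asked the question.

    The threshold is an assumed value for illustration; choose one that
    suits your own analysis.
    """
    return np.asarray(fraction_asked, dtype=float) >= threshold


# Hypothetical concentrations for two galaxies, one question, three answers.
concentrations = [[6.0, 3.0, 1.0], [2.0, 2.0, 4.0]]
# Hypothetical estimated fraction of volunteers asked this question.
fraction_asked = [0.9, 0.2]

fractions = expected_vote_fractions(concentrations)
mask = relevant_mask(fraction_asked)
print(fractions[mask])  # vote fractions only for galaxies where the question is relevant
```

The catalogue already provides the expected vote fractions, so this calculation is only needed if you want to work from the concentrations directly.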

We include a schema document, schema.md, to define the column names in each catalogue.

We also release the galaxy images shown to volunteers on www.galaxyzoo.org during GZD-5. The images on which the automated classifier was trained may be derived from these volunteer-facing images. These images are split into four zip files, each of which contains images named by iauname inside a subfolder named by the first four characters of their iauname. Not all images were labelled during GZD-5; refer to the catalogue for training labels. We are working with the Zenodo team to add these large files to this repository. Meanwhile, you can download them from The University of Manchester here.
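The folder layout described above can be sketched as a small path helper. The root directory, the example iauname, and the `.png` extension are assumptions for illustration; check the extracted zip files for the actual extension.

```python
from pathlib import Path


def image_path(root, iauname, extension=".png"):
    """Build the expected path: <root>/<first four characters>/<iauname><extension>.

    The extension is an assumed default, not confirmed by the data release.
    """
    return Path(root) / iauname[:4] / f"{iauname}{extension}"


# Hypothetical iauname and extraction directory.
print(image_path("gz_decals_images", "J123456.78+012345.6"))
```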

The .csv and .parquet files contain identical data. Parquet is a fast column-oriented binary format which can be read with pd.read_parquet(loc, columns=[some columns]).
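Since both formats hold the same data, a small loader can read only the columns you need from either one. This is a sketch: the file name and column names in the usage comment are placeholders, and `pd.read_parquet` needs a Parquet engine such as pyarrow to be installed.

```python
from pathlib import Path

import pandas as pd


def load_columns(path, columns):
    """Read only the requested columns from a .parquet or .csv catalogue."""
    path = Path(path)
    if path.suffix == ".parquet":
        # Column-oriented storage lets pandas skip the unrequested columns.
        return pd.read_parquet(path, columns=columns)
    # CSV is parsed row by row, but usecols still limits what is kept in memory.
    return pd.read_csv(path, usecols=columns)


# Usage (placeholder column names; see schema.md for the real ones):
# df = load_columns("gz_decals_volunteers_5.parquet", ["iauname", "some_column"])
```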

You may also be interested in the github repository which contains code to reproduce the model and to fine-tune it for new tasks (including pretrained weights).

We will release updates if needed via Zenodo versioning. We recommend using the latest version of this repository. You can check the version you are currently viewing on the right-hand sidebar.

Please cite the paper (DOI to follow on publication) when using the data in this repository.

---

History

v0.0.1 (submission) provides the catalogue files.

v0.0.2 (first revision) renames the catalogue files, adds flags for poorly sized galaxies, and includes the galaxy images via the University of Manchester.

Files

Files (108.2 GB)

Checksum                               Size
md5:5cc06cc0e2d44b5c0eb5c60231530b67   2.6 GB
md5:2e9f4b4fe9f3473a44f60aed0526911f   1.6 GB
md5:f80e2a847cc7b1b3b6015e5214e368da   27.8 GB
md5:9c21464506c69acc35d3d5331786d361   22.3 GB
md5:24b9f3bafe23f5aa8e6c0ae0f60c0b50   32.5 GB
md5:30d2662d34590e2786721ecb738fad8a   21.2 GB
md5:f1be080ac22269fb5ea4e12ddefb8b11   80.8 MB
md5:e9de511cbd5977e02c81cbdb5d21b7c2   18.7 MB
md5:b12e3767b3968f9767d4f4115cc69d4d   145.8 MB
md5:364d0b598b7d2958553350c5ce0ffc14   40.5 MB
md5:d23668017178201416730e6e7c0e768d   4.8 kB
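
Because several of these files are tens of gigabytes, it can be worth checking a download against its listed md5 checksum. The sketch below hashes a file in chunks so it does not need to fit in memory. The function names and the file path in the usage comment are illustrative, and this listing does not say which checksum belongs to which file name.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a checksum written as 'md5:<hex>' or plain hex."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Usage (hypothetical local path; pair it with the checksum shown for that file):
# matches_listing("schema.md", "md5:<checksum from the listing>")
```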