Galaxy Zoo DESI: Detailed Morphology Classifications for 8.7M Galaxies in the DESI Legacy Imaging Surveys
Creators
- 1. University of Manchester
- 2. University of Oxford
- 3. European Space Astronomy Centre
- 4. University of Oxford
- 5. Haverford College
- 6. South African Radio Astronomy Observatory (SARAO)
- 7. The Open University
- 8. University of Minnesota
- 9. Lancaster University
- 10. Zooniverse
Description
This repository contains the data released in the paper "Galaxy Zoo DESI: Detailed Morphology Classifications for 8.7M Galaxies in the DESI Legacy Imaging Surveys" (DOI to follow on publication).
We release detailed morphology measurements for bright (r < 19) galaxies in the DESI Legacy Imaging Surveys footprint. These measurements estimate the presence of bars, spiral arms, ongoing mergers, and more.
---
GZ DESI Detailed Morphology Catalogs
These catalogs are created by training deep learning models on Galaxy Zoo volunteer responses to predict what volunteers might say for new galaxies. The models are available at [www.github.com/mwalmsley/zoobot](www.github.com/mwalmsley/zoobot). Our measurements are predicted vote fractions, i.e. the fraction of volunteers expected to select a given answer for a given question.
We share two catalog versions containing the same morphology measurements but presented in different ways.
gz_desi_deep_learning_catalog_friendly.parquet contains the morphology measurements
gz_desi_deep_learning_catalog_advanced.parquet contains the same measurements, and additional information:
- _friendly includes only relevant vote fractions, defined as vote fractions for answers to questions that a majority of volunteers would have been asked. This removes predicted vote fractions such as the fraction of volunteers answering "2 spiral arms" for a galaxy with no spiral arms. _advanced includes all vote fractions and instead reports the estimated proportion of volunteers who would have been asked each question (column "proportion_asked"). The user must select which vote fractions they consider relevant (we suggest proportion_asked > 0.5, which recovers the _friendly fractions).
- _advanced includes columns with estimated credible intervals (error bars) around each vote fraction. These are calculated from the vote fraction posterior predicted by our models.
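The suggested proportion_asked > 0.5 selection might be sketched in pandas as below. This is only an illustration on a toy table: every column name and the id values are hypothetical stand-ins, so check the real _advanced catalog for the actual names before adapting it.

```python
import pandas as pd

# Toy stand-in for a few rows of the _advanced catalog.
# All column names here are hypothetical; inspect the real catalog's columns.
advanced = pd.DataFrame({
    "id_str": ["a", "b", "c"],
    "spiral_arm_count_2_fraction": [0.60, 0.30, 0.45],
    "spiral_arm_count_proportion_asked": [0.90, 0.20, 0.55],
})

fraction_col = "spiral_arm_count_2_fraction"
asked_col = "spiral_arm_count_proportion_asked"

# A vote fraction counts as relevant only where a majority of volunteers
# would be expected to reach the question.
relevant = advanced[asked_col] > 0.5

# Mask out irrelevant fractions (they become NaN), mimicking the _friendly view.
advanced["relevant_fraction"] = advanced[fraction_col].where(relevant)

# Example selection: relevant galaxies where most volunteers would answer "2 spiral arms".
two_armed = advanced[relevant & (advanced[fraction_col] > 0.5)]
```

Masking with NaN rather than dropping rows keeps the table aligned with the other columns, which may be convenient when several questions are filtered in turn.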
Finally, we separately present volunteer votes collected for 96k galaxies during the GZD-8 campaign, i.e. after the release of GZ DECaLS but before this (GZ DESI) release. These are split into the _core and _extended catalogs, where _extended includes galaxies which received five or more votes for "artifact". The models above were trained on these votes as well as votes from GZ DECaLS.
---
External Catalog
For convenience, we also include an additional catalog of non-morphology measurements created by other authors (external_catalog.parquet) cross-matched to our morphology catalogs. Please credit those authors if you use this catalog (references are in the GZ DESI paper).
A particularly important external measurement is redshift. Morphology is increasingly hard to resolve at higher redshift and so distant galaxies appear less featured. external_catalog.parquet includes the column "redshift", which is the SDSS spectroscopic redshift where available and a photometric redshift estimate otherwise (again, see the GZ DESI paper for references and credit). You may want to select only galaxies at lower redshifts.
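A low-redshift selection could look roughly like the following sketch. The "redshift" column is described above, but the join key ("id_str"), the morphology column, and the 0.1 threshold are assumptions made purely for illustration and are not taken from the paper.

```python
import pandas as pd

# Toy stand-ins for the morphology catalog and external_catalog.parquet.
# "id_str" and "smooth_fraction" are hypothetical names; "redshift" is from the description.
morphology = pd.DataFrame({
    "id_str": ["a", "b", "c"],
    "smooth_fraction": [0.2, 0.7, 0.9],
})
external = pd.DataFrame({
    "id_str": ["a", "b", "c"],
    "redshift": [0.03, 0.12, 0.25],
})

MAX_REDSHIFT = 0.1  # illustrative threshold only; choose one suited to your science case

# Attach redshifts to the morphology rows; validate guards against duplicated ids.
merged = morphology.merge(
    external[["id_str", "redshift"]],
    on="id_str",
    how="inner",
    validate="one_to_one",
)

# Keep only the nearer galaxies, where morphology is easier to resolve.
low_z = merged[merged["redshift"] < MAX_REDSHIFT]
```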
---
Data Notes
Parquet is a fast csv-like format which can be read with pd.read_parquet(loc, columns=[some columns]). Parquet files are read column-by-column (rather than row-by-row), so you can choose which columns to load. You can easily check which columns are available by passing columns=['foo'] and reading the error message. We suggest loading only the columns you need when working with the larger catalogs. This requires much less memory than loading every column.
We will release updates if needed via Zenodo versioning. We recommend using the latest version of this repository. You can check the version you are currently viewing on the right-hand sidebar.
Please cite the paper (DOI to follow on publication) when using the data in this repository.
---
History
v0.0.1 - closed pre-release for internal review
v1.0.0 - draft public release. Removed low-z pre-filtered catalogs.
v1.0.1 - first public release. Added .csv version of _friendly catalog. Tweaked catalog formatting for clarity and consistency.
Files (11.5 GB)
Checksum | Size |
---|---|
md5:1d8ed3b5660487b5bae58cfee3e4cf0e | 1.6 GB |
md5:dd6924154b16078fc10c799f9308cd63 | 7.6 GB |
md5:12976d7eb99e86d83289d9b0c037d954 | 1.6 GB |
md5:114785d00c4d4f2208185bee73dd08b8 | 658.8 MB |
md5:ba79d9d9b56bb81608fbaa4c48d4f465 | 6.4 MB |
md5:d632e4830af7eae2e9bb0a3f88b26992 | 5.5 MB |
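The md5 checksums listed for each file can be used to confirm that a download completed intact. The sketch below is one way to do this with the Python standard library; md5_of_file is a hypothetical helper name, and the demonstration uses a small temporary file rather than a real catalog.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex md5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a toy file; for a real download, pass the catalog's path
# and compare against the checksum listed for that file.
payload = b"toy catalog bytes"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    toy_path = tmp.name

listed = "md5:" + hashlib.md5(payload).hexdigest()  # same "md5:<hex>" form as the file listing
matches = md5_of_file(toy_path) == listed.split(":", 1)[1]
os.remove(toy_path)
```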