Data for QZO: A Catalog of 5 Million Quasars from the Zwicky Transient Facility
Creators
- 1. Division of Physics, Mathematics and Astronomy, California Institute of Technology, Pasadena, CA 91125, USA
- 2. Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109
- 3. DIRAC Institute, Department of Astronomy, University of Washington, Seattle, WA 98195, USA
- 4. IPAC, California Institute of Technology, Pasadena, CA 91125, USA
- 5. Caltech Optical Observatories, California Institute of Technology, Pasadena, CA 91125, USA
- 6. Center for Data Driven Discovery, California Institute of Technology, Pasadena, CA 91125, USA
- 7. Department of Physics, Drexel University, Philadelphia, PA 19104, USA
Description
QZO.csv
The QZO catalog includes 4,849,574 objects and the columns described below, excluding the duplicate objects flag. The classifications are based on XGB models trained on the ZTF g-band median magnitude and light-curve classifications from a transformer model, as well as WISE W[1-4] magnitudes and colors. The photo-zs are based on the ZTF g-band magnitude and WISE magnitudes and colors. Duplicated ZTF light curves are removed by discarding every object that has, within the full ZTF catalog, at least one neighbour within 1 arcsec with more ZTF observation epochs. The final quasar sample was obtained with cuts on magnitude, number of observation epochs, and minimum quasar classification probability: g < n_obs / 80 + 20.375, where n_obs is the number of ZTF observation epochs per light curve, and p_(QSO) > 0.9, where p_(QSO) is the XGB classification probability for the QSO class. Photo-zs are available for 35% of these objects, depending on the availability of WISE observations.
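The selection cuts quoted above can be written as a small predicate. This is only a sketch of the stated inequalities, not the authors' pipeline: the function name and argument names are hypothetical, and which probability column (with or without WISE features) feeds p_qso is left to the caller.

```python
def passes_qzo_cut(g_mag, n_obs, p_qso):
    """Return True if an object satisfies the cuts quoted in the description.

    g_mag : ZTF g-band median magnitude
    n_obs : number of ZTF observation epochs in the light curve
    p_qso : XGB classification probability for the QSO class
    """
    # Magnitude limit that loosens as the light curve gets longer.
    magnitude_limit = n_obs / 80 + 20.375
    # Both inequalities are strict, as written in the description.
    return g_mag < magnitude_limit and p_qso > 0.9
```

For example, with n_obs = 100 the magnitude limit is 100 / 80 + 20.375 = 21.625, so an object with g = 20.0 and p_(QSO) = 0.95 passes both cuts.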
ZTF_all_QSO.csv
This file provides all the columns for 78,078,450 objects classified as QSOs by at least one of the two XGB models, with and without the WISE features. No cuts are applied and no duplicates are removed. 26% of the objects are marked with the duplicates flag.
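Because this file is large and keeps the flagged duplicates, it may be convenient to stream it row by row and skip flagged rows rather than load it whole. The sketch below uses only the Python standard library. The helper name iter_unique is hypothetical, and the way the is_duplicate flag is encoded in the CSV (1/0, True/False) is an assumption that should be checked against the actual file.

```python
import csv
import io

# Assumed text encodings of a set flag; adjust after inspecting the real file.
TRUE_VALUES = {"1", "1.0", "true"}


def iter_unique(lines, flag_column="is_duplicate"):
    """Yield rows (as dicts) whose duplicate flag is not set."""
    for row in csv.DictReader(lines):
        if row[flag_column].strip().lower() not in TRUE_VALUES:
            yield row


# Tiny in-memory stand-in for the catalog, with made-up values.
sample = io.StringIO(
    "ID,n_obs,is_duplicate\n"
    "a,120,0\n"
    "b,45,1\n"
    "c,300,0\n"
)
kept_ids = [row["ID"] for row in iter_unique(sample)]
```

For the real catalog, the same generator can be fed an open file handle, for instance `open("ZTF_all_QSO.csv", newline="")`, so that only one row is held in memory at a time.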
train.csv
Predictions for the training data. This file contains 2,588,221 records; the ZTF ID and duplicates flag are not included. ZTF duplicates were removed by selecting the longest ZTF light curve for each non-duplicated SDSS object.
Catalog columns
ID ZTF identifier
ra right ascension
dec declination
n_obs number of ZTF observation epochs
is_duplicate flag indicating duplicated light curves
mag_median ZTF g-band median magnitude
p_[galaxy, QSO, star] classification probabilities
p_WISE_[galaxy, QSO, star] classifications with added WISE data
redshift redshift estimate
ANN_clf.[data-00000-of-00001, index]
ANN model for classifying ZTF g-band light curves. The model is trained on ZTF g-band data with at least 20 observation epochs per light curve. Input light curves do not need to be scaled beforehand; scaling is done separately for each light curve as part of the transformer model. An example of how to load and use the ANN can be found in the script “run_inference.py” in the GitHub repository.
XGB_clf__ZTF_[PS, WISE, GAIA, PS_WISE, PS_GAIA, WISE_GAIA, PS_WISE_GAIA].pickle
XGB_z__ZTF_WISE.pickle
XGB classification and redshift models for different combinations of input surveys. The XGB classification models are trained on all ZTF data with an available ANN classification, learning to classify objects with missing features. The XGB redshift model does not include the ANN classifications as features. An example of how to load and use the XGB models can be found in the script “run_inference_XGB.py” in the GitHub repository.
Features order
ZTF g_mag_median, p_ANN_galaxy, p_ANN_QSO, p_ANN_star
PS g, r, i, z, g - r, g - i, g - z, r - i, r - z, i - z
WISE W1, W2, W3, W4, W1 - W2, W1 - W3, W1 - W4, W2 - W3, W2 - W4, W3 - W4
GAIA g_mean_mag, parallax, pmra, pmdec, bp_mean_mag, rp_mean_mag, bp_rp_excess_factor
The exact column names can be found in the script “features.py” in the GitHub repository.
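As an illustration of the ordering listed above, the following sketch assembles the feature labels for a given combination of surveys. It rests on assumptions: that the per-survey blocks are concatenated in the order ZTF, PS, WISE, GAIA (as in the model file names), and that the redshift model keeps only the magnitude from the ZTF block. The function and constant names are hypothetical, and the labels are the ones used in this description, not the exact column names from “features.py”.

```python
from itertools import combinations

ZTF_FEATURES = ["g_mag_median", "p_ANN_galaxy", "p_ANN_QSO", "p_ANN_star"]
GAIA_FEATURES = [
    "g_mean_mag", "parallax", "pmra", "pmdec",
    "bp_mean_mag", "rp_mean_mag", "bp_rp_excess_factor",
]


def with_colors(bands):
    """Magnitudes followed by all pairwise colors, in the order listed above."""
    return list(bands) + [f"{a} - {b}" for a, b in combinations(bands, 2)]


SURVEY_FEATURES = {
    "PS": with_colors(["g", "r", "i", "z"]),
    "WISE": with_colors(["W1", "W2", "W3", "W4"]),
    "GAIA": GAIA_FEATURES,
}


def feature_names(surveys, include_ann=True):
    """Return feature labels for the ZTF block plus the chosen surveys.

    include_ann=False drops the ANN probabilities, mirroring the statement
    that the redshift model does not use them (assumed layout).
    """
    names = list(ZTF_FEATURES) if include_ann else ZTF_FEATURES[:1]
    for survey in ("PS", "WISE", "GAIA"):  # assumed concatenation order
        if survey in surveys:
            names += SURVEY_FEATURES[survey]
    return names
```

Under these assumptions, `feature_names(["WISE"])` yields 14 labels (4 ZTF and 10 WISE), and `feature_names(["WISE"], include_ann=False)` yields 11.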
Files
(11.9 GB)
Checksum | Size |
---|---|
md5:0ebd913517a990e5603f4ba4edf88772 | 19.0 MB |
md5:01b35b8d41c89461c354ac8cdae34bda | 9.6 kB |
md5:18c1faaedf0e17e6101967e7a66952bf | 19.0 MB |
md5:1a92f82ec1b6073a8121cc452a738fa6 | 9.6 kB |
md5:7f39e62d4beda90ecead2f1c008b118c | 672.0 MB |
md5:b87d9e64e9a2bab022ab3ddddc1ba4d8 | 301.1 MB |
md5:c2709406957f055ffdc540b456d6855e | 7.7 MB |
md5:987df660e7bddbed9fecade7f0ae0754 | 13.1 MB |
md5:82d98745b0e86fd066fc4c7fa8406b02 | 29.2 MB |
md5:f5ae56bd03e1e8e17a07e8e653de5d5b | 19.9 MB |
md5:397875271874e78736377d634b8b6d0d | 22.7 MB |
md5:27adaf52b13210d57309f72185a46197 | 17.9 MB |
md5:0a01540d84f7df67dd05b66e0a8c943d | 12.5 MB |
md5:c1a58b2abf365ef4d91e25def0af0d88 | 12.0 MB |
md5:1ec4394f62b4c648708bb41c5dc57175 | 16.5 MB |
md5:a397505b4b4d38cc6b4847fb052989f4 | 10.8 GB |
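The md5 checksums listed above can be used to verify a download. The following is a minimal sketch using Python's standard hashlib module; the helper name md5_of_file and the chunk size are illustrative choices, and the pairing of each checksum with a file name has to be taken from the record page, since it is not reproduced here.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use flat even for multi-gigabyte files.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A downloaded file can then be checked by comparing `md5_of_file(local_path)` with the hexadecimal string that follows the `md5:` prefix for that file.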
Additional details
Funding
- U.S. National Science Foundation
- AST-2108402
Software
- Repository URL
- https://github.com/snakoneczny/ztf-agn
- Programming language
- Python