Parameters of 220 million stars from Gaia BP/RP spectra
=======================================================
by Xiangyu Zhang (张翔宇), Gregory M. Green, Hans-Walter Rix
of the Max Planck Institute for Astronomy, Heidelberg
We present parameters of 220 million stars, based on Gaia XP spectra and near-infrared photometry from 2MASS and WISE.
Instead of using *ab initio* stellar models, we develop a data-driven model of Gaia XP spectra as a function of the stellar parameters, with a few straightforward built-in physical assumptions.
For full details of our method, see our corresponding [paper](https://ui.adsabs.harvard.edu/abs/2023arXiv230303420Z/abstract).
Here, we provide the resulting catalog of stellar parameters, our trained stellar model, and a few Python scripts that demonstrate how to interact with the catalog and model.
Catalog description:
--------------------
Our stellar parameter catalog is stored in `stellar_params_catalog_*.h5` and has the following columns:
* `gdr3_source_id` (integer): Gaia DR3 `source_id`.
* `ra` (float): Right Ascension (in deg), as measured by Gaia DR3.
* `dec` (float): Declination (in deg), as measured by Gaia DR3.
* `stellar_params_est` (5 floats): Estimates of stellar parameters (effective temperature in kilokelvin, [Fe/H] in dex, log(g) in dex, E in mag, parallax in mas).
* `stellar_params_err` (5 floats): Uncertainties in the stellar parameters.
* `chi2_opt` (float): chi^2 of the best-fit solution.
* `ln_prior` (float): Natural log of the GMM prior on stellar type, at the location of the optimal solution.
* `teff_confidence` (float): A neural-network-based estimate of the confidence in the effective temperature estimate, on a scale of 0 (no confidence) to 1 (high confidence).
* `feh_confidence` (float): As `teff_confidence`, but for [Fe/H].
* `logg_confidence` (float): As `teff_confidence`, but for log(g).
* `quality_flags` (8-bit uint): The three least significant bits represent whether the confidence in effective temperature, [Fe/H] and log(g) is less than 0.5, respectively. The 4th bit is set if `chi2_opt/61 > 2`. The 5th bit is set if `ln_prior < -7.43`. The 6th bit is set if our parallax estimate is more than 10 sigma from the GDR3 measurement (using reported parallax uncertainties from GDR3). The two most significant bits are always unset. We recommend a cut of `quality_flags < 8` (the "basic reliability cut"), although a stricter cut of `quality_flags == 0` ensures higher reliability at the cost of lower completeness.
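
The bit layout of `quality_flags` can be unpacked with a few bitwise operations. The following is a minimal sketch, not part of the released scripts. The names `FLAG_BITS`, `decode_quality_flags` and `passes_basic_cut` are hypothetical, and the sketch assumes that "respectively" means the least significant bit refers to effective temperature, the next to [Fe/H] and the third to log(g).

```python
# Bit positions within quality_flags (bit 0 is the least significant bit).
# Assumption: the three confidence bits are ordered Teff, [Fe/H], log(g).
FLAG_BITS = {
    "low_teff_confidence": 0,  # teff_confidence < 0.5
    "low_feh_confidence": 1,   # feh_confidence < 0.5
    "low_logg_confidence": 2,  # logg_confidence < 0.5
    "high_chi2": 3,            # chi2_opt/61 > 2
    "low_ln_prior": 4,         # ln_prior < -7.43
    "parallax_outlier": 5,     # more than 10 sigma from the GDR3 parallax
}


def decode_quality_flags(flags):
    """Map each flag name to True/False for an integer flag value.

    Only shifts, masks and comparisons are used, so the same function also
    works element-wise on a NumPy integer array.
    """
    return {name: ((flags >> bit) & 1) == 1 for name, bit in FLAG_BITS.items()}


def passes_basic_cut(flags):
    """The recommended basic reliability cut: bits 4 to 6 are all unset."""
    return flags < 8


# Hypothetical flag value: low [Fe/H] confidence and high chi^2.
example = 0b001010
decoded = decode_quality_flags(example)
print(decoded)
print(passes_basic_cut(example))
```

A star whose only set bits are among the three confidence bits still passes `quality_flags < 8`, whereas `quality_flags == 0` also rejects it.
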
Each column is stored in a separate dataset in the HDF5 file. All datasets have identical ordering, so that, for example, star 1003 is found in element 1003 of each dataset.
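
Because every dataset shares the same row order, a single boolean mask can select the same stars from any column. The sketch below illustrates this with the third-party `h5py` package. It is not one of the provided scripts: the function `load_reliable_stars` is hypothetical, `stellar_params_est` is assumed to have shape (N, 5) with effective temperature in the first column, and the file `demo_catalog.h5` is a tiny synthetic stand-in with made-up values so that the example runs without the real catalog.

```python
import os
import tempfile

import h5py
import numpy as np


def load_reliable_stars(path):
    """Read two columns and keep the rows that pass `quality_flags < 8`."""
    with h5py.File(path, "r") as f:
        flags = f["quality_flags"][:]
        keep = flags < 8  # the "basic reliability cut"
        # All datasets share the same row order, so one mask fits every column.
        source_id = f["gdr3_source_id"][:][keep]
        params = f["stellar_params_est"][:][keep]
    return source_id, params


# Build a synthetic stand-in file (made-up values, not real catalog data).
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo_catalog.h5")
with h5py.File(demo_path, "w") as f:
    f.create_dataset("gdr3_source_id", data=np.array([11, 22, 33], dtype="i8"))
    f.create_dataset("quality_flags", data=np.array([0, 9, 3], dtype="u1"))
    f.create_dataset(
        "stellar_params_est",
        data=np.array(
            [
                [5.8, 0.0, 4.4, 0.1, 2.0],
                [4.1, -0.5, 2.5, 0.3, 0.5],
                [6.2, -1.0, 4.0, 0.0, 1.2],
            ]
        ),
    )

source_id, params = load_reliable_stars(demo_path)
teff_kelvin = 1000.0 * params[:, 0]  # stored in kilokelvin (assumed column 0)
print(source_id, teff_kelvin)
```

The sketch reads whole datasets into memory for brevity. For the real catalog files, reading slices such as `f["quality_flags"][start:stop]` keeps memory use bounded.
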
The files `stellar_covariances_catalog_*.h5` contain information about the full covariance matrices of our parameter estimates:
* `stellar_params_icov_triu` (15 floats): Upper triangle of the inverse covariance matrix of our stellar parameters.
* `stellar_params_cov_triu` (15 floats): Upper triangle of the covariance matrix of our stellar parameters, obtained from the inverse covariance matrix in a numerically stable manner that ensures positive semi-definiteness.
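
A 5×5 symmetric matrix has 15 independent entries, which is why each upper triangle is stored as 15 floats. The sketch below shows one way to expand such a vector into a full matrix with NumPy. It assumes that the values are stored in row-major order of the upper triangle (the order returned by `np.triu_indices`), which this description does not confirm; `reconstruct_cov_icov.py` is the authoritative reference for the actual layout. The function name `triu_to_symmetric` and the demo values are hypothetical.

```python
import numpy as np


def triu_to_symmetric(triu, n=5):
    """Expand n*(n+1)/2 upper-triangle values into a symmetric n x n matrix.

    Assumes row-major order: (0,0), (0,1), ..., (0,n-1), (1,1), (1,2), ...
    Leading axes are treated as batch dimensions, so an array of shape
    (n_stars, 15) yields an array of shape (n_stars, 5, 5).
    """
    triu = np.asarray(triu, dtype=float)
    rows, cols = np.triu_indices(n)
    mat = np.zeros(triu.shape[:-1] + (n, n))
    mat[..., rows, cols] = triu  # fill the upper triangle
    mat[..., cols, rows] = triu  # mirror it into the lower triangle
    return mat


# Placeholder numbers 1..15 for illustration (not catalog values).
demo_triu = np.arange(1.0, 16.0)
cov = triu_to_symmetric(demo_triu)
print(cov)
```
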
The file `extinction_curve.txt` contains the extinction curve (float) at corresponding wavelength (float, in nm). The extinction value (in mag) is given by multiplying the extinction curve and `E` in `stellar_params_catalog_*.h5`.
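
The multiplication described above can be written as a broadcast product, giving one extinction value per star and per wavelength. This is a minimal sketch with placeholder numbers. The function name `extinction_mag` is hypothetical, and the demo arrays are not taken from `extinction_curve.txt`, whose column layout is not assumed here.

```python
import numpy as np


def extinction_mag(curve, E):
    """Extinction in mag, as the product of the curve and E.

    curve : array of shape (n_wavelengths,), the extinction curve
    E     : scalar or array of shape (n_stars,), from the catalog
    Returns an array of shape (n_stars, n_wavelengths).
    """
    curve = np.asarray(curve, dtype=float)
    E = np.atleast_1d(np.asarray(E, dtype=float))
    return E[:, None] * curve[None, :]


# Placeholder values for illustration only.
demo_wavelength_nm = np.array([400.0, 600.0, 800.0])
demo_curve = np.array([4.0, 2.5, 1.5])
demo_E = np.array([0.0, 0.2])

A = extinction_mag(demo_curve, demo_E)
for wl, column in zip(demo_wavelength_nm, A.T):
    print(wl, column)
```
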
See our corresponding [paper](https://ui.adsabs.harvard.edu/abs/2023arXiv230303420Z/abstract) for more details on these parameter estimates, quality flags, etc.
Stellar model:
--------------
The stellar model is contained in `stellar_flux_model.tar.gz`, and is stored in the [TensorFlow SavedModel format](https://www.tensorflow.org/guide/saved_model). Before running the scripts below, you must untar the model. On a Unix-like system (such as Linux or macOS), run:

    tar -xzf stellar_flux_model.tar.gz
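
On systems without a `tar` command, Python's standard-library `tarfile` module should be able to perform the same extraction. The sketch below is an alternative and not part of the provided scripts. The helper `untar` is hypothetical, and the demonstration runs on a small throwaway archive so that it does not depend on the model file being present. As with `tar`, only extract archives from a source you trust.

```python
import os
import tarfile
import tempfile


def untar(archive_path, dest_dir="."):
    """Extract a gzip-compressed tar archive into dest_dir, like `tar -xzf`."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(path=dest_dir)


# Self-contained demonstration on a throwaway archive.
work_dir = tempfile.mkdtemp()
payload = os.path.join(work_dir, "payload.txt")
with open(payload, "w") as fh:
    fh.write("demo")

demo_archive = os.path.join(work_dir, "demo.tar.gz")
with tarfile.open(demo_archive, "w:gz") as archive:
    archive.add(payload, arcname="demo_dir/payload.txt")

out_dir = os.path.join(work_dir, "out")
untar(demo_archive, out_dir)
print(os.listdir(out_dir))
```

For the model itself, the equivalent call would be `untar("stellar_flux_model.tar.gz")`, run from the directory that contains the archive.
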
The script `example_model_plots.py` shows how to load the model (one line of Python code) and interact with it.
Example scripts:
----------------
We provide a few example Python scripts to help familiarize the user with our catalog and stellar model:
* `example_catalog_plots.py`: Loads a small subset of the catalog and generates a few plots.
* `example_model_plots.py`: Loads our model of stellar flux, and generates plots of the stellar spectra (as a function of stellar parameters) and of the extinction curve.
* `convert_to_fits.py`: Converts `stellar_params_catalog_*.h5` to FITS. Careful! This script may be very memory intensive!
* `reconstruct_cov_icov.py`: Shows how to reconstruct the covariance and inverse covariance matrices from their upper triangles.