Published May 16, 2022 | Version v1
Presentation Open

Assessing the quality of massive spectroscopic surveys with unsupervised machine learning

  • 1. Universidad de Los Andes

Description

Massive spectroscopic surveys targeting tens of millions of stars and galaxies are starting to dominate the
observational landscape of the 2020s. For instance, in a single night of observation the Dark Energy
Spectroscopic Instrument (DESI) can measure on the order of 100k spectra, each sampled at approximately 2k
wavelength points. Assessing the quality of such a massive data flow requires new
approaches to complement visual inspection by humans. In this work, we explore the Uniform Manifold
Approximation and Projection (UMAP) as a technique to assess the data quality of DESI. We use UMAP to
project DESI nightly data into a 2-dimensional space. In this space, we sometimes find a small
number of outliers. Visual inspection of those outliers shows that they correspond to instrument
fluctuations that can then be fully diagnosed by inspecting the raw data, allowing the development of an
appropriate solution through data re-processing. These results pave the way for the use of machine learning
techniques to automatically monitor the health of massive spectroscopic surveys.
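
The outlier search in the 2-dimensional space can be illustrated with a minimal sketch. It is not the procedure used in the presentation, which relies on visual inspection; it only shows one possible way to shortlist isolated points for that inspection. The assumptions are: the 2D coordinates would come from a UMAP projection of the nightly spectra (for example `umap.UMAP(n_components=2).fit_transform(spectra)` from the umap-learn package, which the abstract does not name), the synthetic points below stand in for such an embedding, and the function names, the choice of `k`, and the `n_mad` threshold are hypothetical.

```python
import math
import random
import statistics


def knn_distance(points, k):
    """Distance from each point to its k-th nearest neighbour (brute force, O(n^2))."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def flag_outliers(points, k=5, n_mad=10.0):
    """Return indices whose k-NN distance exceeds median + n_mad * MAD.

    k and n_mad are illustrative defaults, not values from the presentation.
    """
    scores = knn_distance(points, k)
    median = statistics.median(scores)
    mad = statistics.median([abs(s - median) for s in scores])
    threshold = median + n_mad * mad
    return [i for i, s in enumerate(scores) if s > threshold]


# Synthetic stand-in for a 2D embedding: one dense cluster of "normal" spectra
# plus two isolated points playing the role of instrument glitches.
rng = random.Random(0)
embedding = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(300)]
embedding += [(25.0, 25.0), (-30.0, 20.0)]  # indices 300 and 301

flagged = flag_outliers(embedding)
print("candidates for visual inspection:", flagged)
```

The flagged indices are only candidates: points at the sparse edge of the main cluster may also appear in the list, so each one would still need to be checked against the raw data, as described above. The brute-force neighbour search is adequate for a few thousand points; an embedding of order 100k spectra would call for a spatial index instead.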

Files

Assessing the quality of massive spectroscopic surveys with unsupervised machine learning.pdf