Introducing AXS: A framework for large-scale analysis of astronomical data
Astronomy eXtensions for Spark, or AXS, is a framework for large-scale astronomical data analysis based on Apache Spark, a cutting-edge open-source engine for processing large amounts of data. The ever-expanding scale of today’s astronomical surveys demands scalable and stable tools to help extract scientific information from the resulting data sets. However, astronomical software support is lacking in this regard. AXS aims to fill this void by providing easy-to-use Spark-based APIs with features such as on-line cross-matching and spatial selection. AXS has so far been used at the University of Washington’s DIRAC Institute for analysis of Zwicky Transient Facility (ZTF) and other datasets. AXS is capable of cross-matching Gaia DR2 (1.7 billion rows) and ZTF (2.9 billion rows) in 25 seconds (with data cached in the file system), and users can further analyze the results with custom Python code within the same framework. AXS’ long-term goal is to become a preferred tool for individual researchers and groups who need to analyze astronomical datasets, whether small or at petabyte scale.
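For readers unfamiliar with the term, positional cross-matching pairs each object in one catalog with the nearest object in another catalog that lies within a small angular radius on the sky. The following is a minimal, brute-force sketch of that idea in plain Python. It is not the AXS API or its algorithm; the function names, the tuple layout (id, RA, Dec in degrees), and the 1 arcsecond default radius are assumptions made for illustration only.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))


def crossmatch(left, right, radius_arcsec=1.0):
    """Return (left_id, right_id, separation_arcsec) for the nearest
    right-hand object within radius_arcsec of each left-hand object.

    Both inputs are iterables of (id, ra_deg, dec_deg) tuples
    (a hypothetical layout chosen for this sketch).
    """
    radius_deg = radius_arcsec / 3600.0
    matches = []
    for lid, lra, ldec in left:
        best = None  # (right_id, separation_deg) of the closest match so far
        for rid, rra, rdec in right:
            sep = angular_separation_deg(lra, ldec, rra, rdec)
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (rid, sep)
        if best is not None:
            matches.append((lid, best[0], best[1] * 3600.0))
    return matches


# Toy catalogs: only "a" has a counterpart within 1 arcsecond.
catalog_a = [("a", 10.0, 20.0), ("b", 50.0, -30.0)]
catalog_b = [("x", 10.0001, 20.0), ("y", 120.0, 5.0)]
matches = crossmatch(catalog_a, catalog_b)
```

This nested loop costs time proportional to the product of the two catalog sizes, which is impractical at the billion-row scale quoted above and is why a distributed engine with a spatially aware data layout is needed for surveys such as Gaia and ZTF.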