Published May 5, 2023 | Version v1

Finding Stellar Streams in the Milky Way with CWoLa

  • 1. Lawrence Berkeley National Lab
  • 2. University of California, Berkeley
  • 3. Rutgers University
  • 4. SLAC; Bosch Research

Description

These datasets accompany the paper "Weakly-Supervised Anomaly Detection in the Milky Way" by M. Pettee, S. Thanvantri, B. Nachman, D. Shih, M. R. Buckley, and J. H. Collins (https://arxiv.org/abs/2305.03761). They illustrate how the Classification Without Labels (CWoLa) technique can be applied in the search for localized anomalies -- in this case, cold stellar streams -- in the Gaia DR2 dataset. We consider both simulated stellar streams and the known stream GD-1. The codebase is located at https://github.com/hep-lbdl/GaiaCWoLa.

These datasets are all formatted as pandas DataFrames and can be loaded using `df = pd.read_hdf(file)`. File names ending in `_cwola.h5` indicate that CWoLa has been applied to the stars in that dataset. For this analysis, we consider circular "patches" of the sky of radius 15° from the Gaia DR2 dataset.
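
As a minimal loading sketch: only `pd.read_hdf` and the `_cwola.h5` naming convention come from the description above. The helper names `has_cwola_suffix` and `load_dataset` are illustrative and not part of the GaiaCWoLa codebase, and the example assumes the file has already been downloaded into the working directory. Note that `pd.read_hdf` relies on the optional PyTables package.

```python
from pathlib import Path


def has_cwola_suffix(filename: str) -> bool:
    """Return True if the file name ends in `_cwola.h5`."""
    return Path(filename).name.endswith("_cwola.h5")


def load_dataset(path):
    """Load one of the HDF5 files as a pandas DataFrame."""
    # Imported here so the naming helper above works without pandas installed.
    import pandas as pd  # pd.read_hdf also needs the optional PyTables package

    return pd.read_hdf(path)


# Only attempt to load if the file is present locally (an assumption).
example = Path("gd1_21_patches_cwola.h5")
if example.exists():
    df = load_dataset(example)
    print(df.columns.tolist())
```

The suffix is a hint rather than a complete rule: a file name without it is not guaranteed to lack CWoLa results, so checking the columns of the loaded DataFrame is the safer test.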

Columns include: 

  • μ_δ: Proper motion (declination)
  • μ_α: Proper motion (right ascension)
  • δ: Declination
  • α: Right ascension
  • α_wrapped: Right ascension, wrapped such that all angles are > 0 
  • b-r: Color
  • g: Magnitude
  • ϕ: Rotated & centered right ascension 
  • λ: Rotated & centered declination 
  • μ_ϕcosλ: Rotated & centered proper motion (right ascension) 
  • μ_λ: Rotated & centered proper motion (declination) 
  • stream: Boolean label (True if part of ground truth labeling for GD-1)
  • patch_id: Index of the 21 patches of the GD-1 scan
  • nn_score: CWoLa classifier score (0 = more background-like; 1 = more signal-like)
  • 5d_distance: For the potential GD-1 member candidates, the 5D Euclidean distance to the nearest labeled star
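
To illustrate what a quantity like `5d_distance` measures, the sketch below computes, for each candidate star, the Euclidean distance to the nearest labeled star in a five-column feature space. Which five features enter the distance, and whether they were rescaled first, is not stated here, so the function takes plain arrays. The name `nearest_labeled_distance` is hypothetical and the toy values are made up for illustration.

```python
import numpy as np


def nearest_labeled_distance(candidates: np.ndarray, labeled: np.ndarray) -> np.ndarray:
    """Distance from each candidate row to its nearest labeled row.

    Both inputs have shape (n_stars, n_features); n_features is 5 here.
    """
    # Pairwise differences via broadcasting:
    # shape (n_candidates, n_labeled, n_features).
    diff = candidates[:, None, :] - labeled[None, :, :]
    # Euclidean norm over the feature axis, then the minimum over labeled stars.
    return np.linalg.norm(diff, axis=-1).min(axis=1)


# Toy 5D points (illustrative values only, not taken from the datasets).
labeled = np.array([[0.0, 0.0, 0.0, 0.0, 0.0],
                    [1.0, 1.0, 1.0, 1.0, 1.0]])
candidates = np.array([[0.0, 0.0, 0.0, 0.0, 2.0],
                       [1.0, 1.0, 1.0, 1.0, 1.5]])
distances = nearest_labeled_distance(candidates, labeled)
print(distances)
```

The broadcast builds an array of size n_candidates × n_labeled × 5, which is fine for small inputs. For large catalogues a spatial index such as `scipy.spatial.cKDTree` would likely be a better fit.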

Files include:

  • gd1_1_patch.h5
    • Example patch of GD-1
  • gd1_21_patches_cwola.h5
    • Results from applying CWoLa to the 21 patches spanning GD-1 
  • promising_stars.h5
    • A list of stars identified by CWoLa that do not belong to the labeled GD-1 stream catalogue, but are close in 5D Euclidean space to labeled stars. Sorted in descending order of CWoLa classifier score.
  • scan_bump_cwola.h5
    • An example of how to apply CWoLa in a scan where the stream location is unknown
  • scan_no_bump_cwola.h5
    • An example of how to apply CWoLa in a scan where the stream location is unknown
  • simulated_100_patches.h5
    • Results from applying CWoLa to 100 simulated streams 
  • simulated_patch.h5
    • A full simulated stream 
  • simulated_patch_cwola.h5
    • Results from applying CWoLa to the example simulated stream
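
The description of `promising_stars.h5` above suggests a selection of roughly this shape: keep stars without a stream label, require them to be close to a labeled star, and sort by classifier score. The sketch below applies that idea to a tiny made-up DataFrame. It assumes a table that carries the `stream`, `nn_score`, and `5d_distance` columns together; the function name `rank_unlabeled_candidates` and the cut `max_distance` are hypothetical and not values taken from the paper.

```python
import pandas as pd


def rank_unlabeled_candidates(df: pd.DataFrame, max_distance: float) -> pd.DataFrame:
    """Unlabeled stars within `max_distance` of a labeled star, best score first."""
    # Drop stars that already carry the ground-truth stream label.
    unlabeled = df[~df["stream"]]
    # Keep only those close to a labeled star in the 5D feature space.
    close = unlabeled[unlabeled["5d_distance"] <= max_distance]
    # Highest classifier score first.
    return close.sort_values("nn_score", ascending=False).reset_index(drop=True)


# Toy table (illustrative values only).
toy = pd.DataFrame({
    "stream": [True, False, False, False],
    "nn_score": [0.95, 0.60, 0.90, 0.99],
    "5d_distance": [0.0, 0.2, 0.1, 5.0],
})
ranked = rank_unlabeled_candidates(toy, max_distance=1.0)
print(ranked)
```

In the toy table, the star with score 0.99 is dropped because it lies far from any labeled star, which shows why a high score alone is not enough under this selection.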

To recreate the full scan across GD-1, one would need to use the original 21 patches of Gaia DR2 located at 10.5281/zenodo.7897935 that were constructed for arXiv:2104.12789 [astro-ph.GA]. GD-1 labels were derived from 10.5281/zenodo.1295543.

Files (3.5 GB)

  • md5:bd0aca74f1b52d6bb2dc6019c028ecc5 (90.2 MB)
  • md5:0d0fce847a4770d02a88e399ca885aaa (763.7 MB)
  • md5:be18aee66b211dbd17b52e66e4ed38e5 (12.6 kB)
  • md5:0028284526d046f013ec70274904ebd1 (19.5 MB)
  • md5:d00584d669250885511614e3c29419ec (120.0 MB)
  • md5:21ed4762f42db7b0610c587776dc6553 (2.3 GB)
  • md5:a8b96fce7f954d9c97fa6cddbe6515e2 (86.0 MB)
  • md5:ac25438fbcae906581ebd55fc6781457 (37.9 MB)