Finding Stellar Streams in the Milky Way with CWoLa
Authors/Creators
- 1. Lawrence Berkeley National Lab
- 2. University of California, Berkeley
- 3. Rutgers University
- 4. SLAC; Bosch Research
Description
These datasets are designed to accompany the paper Weakly-Supervised Anomaly Detection in the Milky Way by M. Pettee, S. Thanvantri, B. Nachman, D. Shih, M. R. Buckley, and J. H. Collins (https://arxiv.org/abs/2305.03761). They illustrate how the Classification Without Labels (CWoLa) technique can be applied in the search for localized anomalies -- in this case, cold stellar streams -- in the Gaia DR2 dataset. We consider both simulated stellar streams and the known stream GD-1. The codebase is located at https://github.com/hep-lbdl/GaiaCWoLa.
These datasets are all stored as Pandas DataFrames and can be loaded using `df = pd.read_hdf(file)`. Filenames ending in `_cwola.h5` indicate that CWoLa has been applied to the stars in that dataset. For this analysis, we consider circular "patches" of the sky of radius 15° from the Gaia DR2 dataset.
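As a rough illustration of what a circular patch of radius 15° means, the sketch below computes the angular separation between each star and a patch center and keeps the stars inside the radius. The toy DataFrame, the patch center at (0°, 0°), and the helper name `angular_separation_deg` are assumptions for illustration only; the column names `α` and `δ` are assumed to match the list in this description, and this is not necessarily how the patches were built by the authors.

```python
import numpy as np
import pandas as pd

def angular_separation_deg(ra, dec, ra0, dec0):
    """Great-circle separation in degrees (spherical law of cosines)."""
    ra, dec, ra0, dec0 = map(np.radians, (ra, dec, ra0, dec0))
    cos_sep = np.sin(dec) * np.sin(dec0) + np.cos(dec) * np.cos(dec0) * np.cos(ra - ra0)
    # Clip to guard against rounding slightly outside [-1, 1]
    return np.degrees(np.arccos(np.clip(cos_sep, -1.0, 1.0)))

# With a real file one would start from, e.g.:
#   df = pd.read_hdf("gd1_1_patch.h5")   # pd.read_hdf requires the PyTables package
# Here a toy stand-in with the same (assumed) column names is used instead.
toy = pd.DataFrame({"α": [10.0, 30.0], "δ": [0.0, 0.0]})

# Hypothetical patch center at (α, δ) = (0°, 0°)
sep = angular_separation_deg(toy["α"], toy["δ"], 0.0, 0.0)
in_patch = sep <= 15.0  # boolean mask: star lies within the 15° patch
```

The same mask can be applied to a loaded DataFrame with `df[in_patch]` once the center of the patch is known.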
Columns include:
- μ_δ: Proper motion (declination)
- μ_α: Proper motion (right ascension)
- δ: Declination
- α: Right ascension
- α_wrapped: Right ascension, wrapped such that all angles are > 0
- b-r: Color
- g: Magnitude
- ϕ: Rotated & centered right ascension
- λ: Rotated & centered declination
- μ_ϕcosλ: Rotated & centered proper motion (right ascension)
- μ_λ: Rotated & centered proper motion (declination)
- stream: Boolean label (True if part of ground truth labeling for GD-1)
- patch_id: Index of the 21 patches of the GD-1 scan
- nn_score: CWoLa classifier score (0 = more background-like; 1 = more signal-like)
- 5d_distance: For the potential GD-1 member candidates, the 5D Euclidean distance to the nearest labeled star
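The `nn_score` and `stream` columns can be combined to check how many of the highest-scoring stars carry the GD-1 label. The sketch below is one simple way to do that, assuming the column names match the list above; the helper `top_n_purity`, the cut at `n` stars, and the toy values are hypothetical and are not taken from the paper.

```python
import pandas as pd

def top_n_purity(df, n=100):
    """Fraction of the n highest-scoring stars that carry the stream label."""
    top = df.sort_values("nn_score", ascending=False).head(n)
    return top["stream"].mean()

# Toy stand-in for a *_cwola.h5 DataFrame (values are made up for illustration)
toy = pd.DataFrame({
    "nn_score": [0.95, 0.90, 0.40, 0.10],
    "stream":   [True, True, False, False],
})

purity_top2 = top_n_purity(toy, n=2)  # both top-scoring toy stars are labeled
purity_top3 = top_n_purity(toy, n=3)  # two of the three top-scoring toy stars are labeled
```

For files that span several patches, grouping by `patch_id` before ranking may be more appropriate than ranking all stars together.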
Files include:
- gd1_1_patch.h5
- Example patch of GD-1
- gd1_21_patches_cwola.h5
- Results from applying CWoLa to the 21 patches spanning GD-1
- promising_stars.h5
- A list of stars identified by CWoLa that do not belong to the labeled GD-1 stream catalogue but are close in 5D Euclidean space to labeled stars, sorted in descending order of CWoLa classifier score.
- scan_bump_cwola.h5
- An example of how to apply CWoLa in a scan where the stream location is unknown
- scan_no_bump_cwola.h5
- An example of how to apply CWoLa in a scan where the stream location is unknown
- simulated_100_patches.h5
- Results from applying CWoLa to 100 simulated streams
- simulated_patch.h5
- A full simulated stream
- simulated_patch_cwola.h5
- Results from applying CWoLa to the example simulated stream
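The `5d_distance` column in `promising_stars.h5` is described as the 5D Euclidean distance to the nearest labeled star. A minimal sketch of such a nearest-neighbor distance is shown below. Which five features enter the distance, and whether they are rescaled first, is not stated in this description, so the arrays here are placeholders and the helper `nearest_labeled_distance` is hypothetical.

```python
import numpy as np

def nearest_labeled_distance(candidates, labeled):
    """For each candidate row, Euclidean distance to the closest labeled row."""
    candidates = np.asarray(candidates, dtype=float)
    labeled = np.asarray(labeled, dtype=float)
    # Pairwise differences via broadcasting: shape (n_candidates, n_labeled, n_features)
    diffs = candidates[:, None, :] - labeled[None, :, :]
    # Distance to every labeled row, then the minimum over labeled rows
    return np.sqrt((diffs ** 2).sum(axis=-1)).min(axis=1)

# Placeholder 5D feature vectors (made-up values, arbitrary units)
labeled = [[0.0, 0.0, 0.0, 0.0, 0.0],
           [10.0, 0.0, 0.0, 0.0, 0.0]]
candidates = [[3.0, 4.0, 0.0, 0.0, 0.0],
              [10.0, 0.0, 0.0, 0.0, 1.0]]

distances = nearest_labeled_distance(candidates, labeled)
```

This brute-force version builds the full pairwise array, so for large catalogues a tree-based nearest-neighbor search would likely be a better fit.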
To recreate the full scan across GD-1, one would need to use the original 21 patches of Gaia DR2 located at 10.5281/zenodo.7897935 that were constructed for arXiv:2104.12789 [astro-ph.GA]. GD-1 labels were derived from 10.5281/zenodo.1295543.
Files
(3.5 GB)
| MD5 checksum | Size |
|---|---|
| bd0aca74f1b52d6bb2dc6019c028ecc5 | 90.2 MB |
| 0d0fce847a4770d02a88e399ca885aaa | 763.7 MB |
| be18aee66b211dbd17b52e66e4ed38e5 | 12.6 kB |
| 0028284526d046f013ec70274904ebd1 | 19.5 MB |
| d00584d669250885511614e3c29419ec | 120.0 MB |
| 21ed4762f42db7b0610c587776dc6553 | 2.3 GB |
| a8b96fce7f954d9c97fa6cddbe6515e2 | 86.0 MB |
| ac25438fbcae906581ebd55fc6781457 | 37.9 MB |