Published January 21, 2025 | Version v1.0.0
Dataset Open

Data, code, and model weights for "Insights on Galaxy Evolution from Interpretable Sparse Feature Networks"

Creators

  • 1. Space Telescope Science Institute

Description

Overview

This repository contains data, code, and model weights for reproducing the main results of the paper, Insights on Galaxy Evolution from Interpretable Sparse Feature Networks (see arXiv preprint). Specifically, we provide data files (images-sdss.tar.gz and galaxies.csv), a snapshot of the code base (sparse-feature-networks v1.0.0), and model weights (resnet18-topk_4-metallicity.pth, resnet18-topk_4-bpt_lines.pth). These are described in detail below.

Data

galaxies.csv is the main galaxy sample after applying the cuts described in the paper (250,224 rows). It includes 30 columns queried from the SDSS galSpecInfo, galSpecLine, and galSpecExtra tables:

objID (int64)
DR7ObjID (int64)
specObjID (int64)
ra (float32)
dec (float32)
z (float32)
zErr (float32)
velDisp (float32)
velDispErr (float32)
modelMag_u (float32)
modelMag_g (float32)
modelMag_r (float32)
modelMag_i (float32)
modelMag_z (float32)
petroMag_r (float32)
petroR50_r (float32)
petroR90_r (float32)
bptclass (int32)
oh_p50 (float32)
lgm_tot_p50 (float32)
sfr_tot_p50 (float32)
nii_6584_flux (float32)
nii_6584_flux_err (float32)
h_alpha_flux (float32)
h_alpha_flux_err (float32)
oiii_5007_flux (float32)
oiii_5007_flux_err (float32)
h_beta_flux (float32)
h_beta_flux_err (float32)
reliable (int32)
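
As a minimal sketch of reading the catalog with the dtypes listed above, the snippet below uses pandas. The function name load_galaxies and the default path are illustrative and not part of the released code. It also assumes that the CSV header matches the column names above and that the integer columns contain no missing values.

```python
import pandas as pd

# Integer columns from the list above; every other listed column is float32.
INT64_COLUMNS = {"objID", "DR7ObjID", "specObjID"}
INT32_COLUMNS = {"bptclass", "reliable"}


def load_galaxies(csv_path="data/galaxies.csv"):
    """Read the catalog with explicit dtypes (hypothetical helper).

    Assumes the header names match the documented column list.
    """
    # Read only the header first so the dtype map covers exactly the columns present.
    header = pd.read_csv(csv_path, nrows=0).columns
    dtypes = {}
    for name in header:
        if name in INT64_COLUMNS:
            dtypes[name] = "int64"
        elif name in INT32_COLUMNS:
            dtypes[name] = "int32"
        else:
            dtypes[name] = "float32"
    return pd.read_csv(csv_path, dtype=dtypes)
```

Declaring the ID columns as int64 up front keeps the 19-digit SDSS identifiers exact, which matters because the image file names are derived from objID.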

images-sdss.tar.gz is a compressed directory containing 250,224 image cutouts from the DESI Legacy Imaging Surveys viewer. Each cutout was generated using the RESTful call http://legacysurvey.org/viewer/cutout.jpg?ra={ra}&dec={dec}&pixscale=0.262&layer=sdss&size=160 where ra and dec are taken directly from galaxies.csv. Each image is named using the format {objID}.jpg, with objID again taken from galaxies.csv.
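
The URL template and file naming scheme described above can be sketched as follows. The helper names cutout_url and image_filename are hypothetical. The snippet only builds strings and does not contact the server.

```python
# URL template copied from the description above.
CUTOUT_URL = (
    "http://legacysurvey.org/viewer/cutout.jpg"
    "?ra={ra}&dec={dec}&pixscale=0.262&layer=sdss&size=160"
)


def cutout_url(ra, dec):
    """Return the cutout URL for one galaxy (hypothetical helper)."""
    return CUTOUT_URL.format(ra=ra, dec=dec)


def image_filename(obj_id):
    """Return the image file name for one galaxy, following {objID}.jpg."""
    # Pass objID as an integer; a float would lose digits of a 19-digit ID.
    return f"{int(obj_id)}.jpg"
```

For example, cutout_url(150.0, 2.5) gives the template with ra=150.0 and dec=2.5 filled in, and image_filename(1237648720693755918) gives 1237648720693755918.jpg.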

Code

The code is a snapshot of https://github.com/jwuphysics/sparse-feature-networks at v1.0.0. After unpacking the images and moving them into the ./data directory, the directory structure should look like:

./
├── data/
│   ├── images-sdss/
│   └── galaxies.csv
├── model/
├── results/
└── src/
    ├── config.py          
    ├── dataloader.py     
    ├── model.py         
    ├── main.py             
    └── trainer.py 

To run the analysis and reproduce the main results of the paper, first install the required packages:

pip install torch fastai numpy pandas matplotlib cmasher tqdm

and then simply run python src/main.py.

Models

The trained model weights (resnet18-topk_4-metallicity.pth, resnet18-topk_4-bpt_lines.pth) are provided here for reproducing the exact results from the paper. These are compatible with the ResNet18TopK class defined in src/model.py, and the weights can be stored in the ./model directory.

Alternatively, you can train your own models (e.g., by using the functions defined in src/trainer.py) and save them natively with PyTorch.

Files

Files (3.5 GB)

Checksum                                Size
md5:d6d943ebfd1b3e91786fe2a68729d129    89.9 MB
md5:5d21a7bd9fe0081d96986838e3fa01f2    3.3 GB
md5:5a5382e205ff46a7542a93b558a5440e    44.8 MB
md5:f5c0ad9c3f22e283a37e72934607f8d1    44.8 MB
md5:86ba255823d54999ea52d45a68bedf75    596.2 kB
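
After downloading, each file can be checked against its md5 checksum listed above. The following is a minimal sketch using Python's hashlib. The helper names md5sum and matches_checksum are hypothetical. The file is read in chunks so that the 3.3 GB archive does not have to fit in memory.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or bare hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5sum(path) == expected_hex
```

For example, matches_checksum("galaxies.csv", "md5:...") returns True only when the downloaded file is byte-for-byte identical to the published one, with the checksum string copied from the corresponding row above.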

Additional details

Software