-----------------
v1.1.0 2025-09-25
-----------------

--------------------
Corresponding author
--------------------
Stephen Thorp

---------
Changelog
---------
v1.0.0
 - Initial upload
v1.0.1
 - Minor tweaks to README
v1.1.0
 - Adding rest-frame NUVrJ photometry from Deger et al. (2025)
 - Adding version key to the HDF5 attributes

-----------
Description
-----------
This Zenodo record contains data products for the paper "pop-cosmos: Insights from generative modeling of a deep, infrared-selected galaxy population" by Thorp et al. (2025). Included in the record are two mock galaxy catalogs of 2 million galaxies each, sampled from the pop-cosmos generative model that was calibrated on COSMOS2020 data with IRAC Ch.1 < 26. The paper contains full details of the analysis that led to these products. A small number of additional data products were added by Deger et al. (2025) and are described therein.

If you use these data please cite:
 J. Alsing et al. (2024). ApJS, 274, 12. [arXiv:2402.00935][doi:10.3847/1538-4365/ad5c69]
 S. Thorp et al. (2025). ApJ, accepted. [arXiv:2506.12122]
 S. Deger et al. (2025). MNRAS, submitted.

----------------------
mock_catalog_Ch1_26.h5
----------------------
This is an HDF5 formatted binary file containing a COSMOS-like mock galaxy catalog with 2 million galaxies sampled subject to an IRAC Ch1<26 magnitude limit. The file can be opened with h5py. The structure of the file is as follows:
 - "mock_catalog_Ch1_26.h5/"
   - "fluxes" | Array of noisy model fluxes, shape (2000000, 26), units Mgy.
   - "fluxes_noiseless" | Array of noiseless model fluxes, shape (2000000, 26), units Mgy.
   - "flux_sigmas" | Array of flux uncertainties, shape (2000000, 26), units Mgy.
   - "magnitudes_log" | Array of noisy model magnitudes (logarithmic, AB system).
   - "magnitudes_asinh" | Array of noisy model magnitudes (asinh, AB system).
   - "magnitudes_rest" | Array of rest-frame absolute magnitudes (logarithmic, AB system).
   - "sps_parameters" | Array of base SPS model parameters, shape (2000000, 16).
   - "derived_parameters" | Array of derived model parameters, shape (2000000, 4).
   - "mass_complete" | Boolean array indicating pass/fail of mass completeness limit.
   - "mass_995" | Boolean array indicating if a galaxy is below the 99.5th percentile in mass.

In addition to these arrays, the HDF5 file has attributes (accessed in h5py via .attrs) with the following keys:
 - "version" | Zenodo version number for this file (v1.1.0 onwards).
 - "created" | Date the mock catalog was created.
 - "band_list" | Filter names in the order of columns in "fluxes", etc.
 - "band_list_rest" | Filter names in the order of columns in "magnitudes_rest".
 - "sps_parameter_list" | Parameter names in the order of columns in "sps_parameters".
 - "derived_parameter_list" | Parameter names in the order of columns in "derived_parameters".
 - "reference_band" | Selection band for the catalog.
 - "reference_band_limit_log" | Limiting log magnitude in selection band.
 - "reference_band_limit_asinh" | Limiting asinh magnitude in selection band.
 - "magnitude_unit" | Magnitude system.
 - "flux_unit" | Flux units.
 - "flux_softening" | Softening parameter used in asinh magnitude conversion.

The "fluxes", "fluxes_noiseless", "magnitudes_log" and "magnitudes_asinh" arrays contain model photometry in 26 photometric passbands used in the COSMOS2020 survey. These arrays have shape (2000000, 26). The order of columns can be found in the attributes dictionary under the key "band_list".
It is also reproduced below:
  0 | u (CFHT/MegaCam)
  1 | g (Subaru/HSC)
  2 | r (Subaru/HSC)
  3 | i (Subaru/HSC)
  4 | z (Subaru/HSC)
  5 | y (Subaru/HSC)
  6 | Y (UltraVISTA)
  7 | J (UltraVISTA)
  8 | H (UltraVISTA)
  9 | Ks (UltraVISTA)
 10 | IB427 (Subaru/Suprime-Cam)
 11 | IB464 (Subaru/Suprime-Cam)
 12 | IA484 (Subaru/Suprime-Cam)
 13 | IB505 (Subaru/Suprime-Cam)
 14 | IA527 (Subaru/Suprime-Cam)
 15 | IB574 (Subaru/Suprime-Cam)
 16 | IA624 (Subaru/Suprime-Cam)
 17 | IA679 (Subaru/Suprime-Cam)
 18 | IB709 (Subaru/Suprime-Cam)
 19 | IA738 (Subaru/Suprime-Cam)
 20 | IA767 (Subaru/Suprime-Cam)
 21 | IB827 (Subaru/Suprime-Cam)
 22 | NB711 (Subaru/Suprime-Cam)
 23 | NB816 (Subaru/Suprime-Cam)
 24 | Ch1 (Spitzer/IRAC)
 25 | Ch2 (Spitzer/IRAC)

The rest-frame absolute magnitudes contained in the "magnitudes_rest" array are for a different subset of photometric passbands, used in the analysis of Deger et al. (2025). This array has shape (2000000, 3). The order of columns is in the "band_list_rest" attribute, and is also given below:
  0 | NUV (GALEX)
  1 | r (Subaru/HSC)
  2 | J (UltraVISTA)

The "sps_parameters" array contains the 16 base SPS parameters of each model galaxy. The array has shape (2000000, 16). The order of columns can be found in the attributes dictionary under the key "sps_parameter_list".
It is also reproduced below:
  0 | log10M_formed (stellar mass formed; units of solar masses)
  1 | log10Z (stellar metallicity; units of solar metallicity)
  2 | log10sfr_ratio1 (SFR ratio between bin 1 and 2 of SFH)
  3 | log10sfr_ratio2 (SFR ratio between bin 2 and 3 of SFH)
  4 | log10sfr_ratio3 (SFR ratio between bin 3 and 4 of SFH)
  5 | log10sfr_ratio4 (SFR ratio between bin 4 and 5 of SFH)
  6 | log10sfr_ratio5 (SFR ratio between bin 5 and 6 of SFH)
  7 | log10sfr_ratio6 (SFR ratio between bin 6 and 7 of SFH)
  8 | dust2 (optical depth of diffuse dust)
  9 | dust_index (power law index of diffuse dust attenuation law)
 10 | dust1_fraction (ratio of birth cloud to diffuse dust attenuation)
 11 | lnfAGN (ratio of AGN bolometric luminosity to stellar luminosity)
 12 | lntauAGN (optical depth of AGN dust torus)
 13 | log10Zgas (gas-phase metallicity; units of solar metallicity)
 14 | log10Ugas (gas ionization)
 15 | z (redshift)

The "derived_parameters" array contains 4 derived parameters of each model galaxy. The array has shape (2000000, 4). The order of columns can be found in the attributes dictionary under the key "derived_parameter_list". It is also reproduced below:
  0 | age (mass weighted age; units of Gyr)
  1 | log10M_remain (stellar mass remaining; units of solar masses)
  2 | log10SFR (star formation rate; units of solar masses per year)
  3 | log10sSFR (specific SFR; units of solar masses per year per unit mass remaining)

--------------------
mock_catalog_r_25.h5
--------------------
As above, but with 2 million model galaxies sampled subject to an r<25 magnitude limit. Note that this catalog currently does not include the rest-frame NUVrJ magnitudes (i.e. it is without the "magnitudes_rest" and "band_list_rest" keys).

--------------------------------
Using h5py and mock_catalog_*.h5
--------------------------------
To work with mock_catalog_*.h5 in Python, we recommend using h5py.
This can be obtained from pip or conda:
    $ pip install h5py
or
    $ conda install h5py

To open an HDF5 file in Python, you can do the following:
    >>> import h5py
    >>> f = h5py.File("mock_catalog_Ch1_26.h5", "r")
This will open the file object, but won't load everything into memory at once. Data are read into memory only when you access them.

We can see the list of available data by doing this:
    >>> print(f.keys())
And we can see the available auxiliary information in the .attrs by doing this:
    >>> print(f.attrs.keys())

Say we want to look at the redshift distribution of the mock catalog. We can work out which column of the SPS parameters is of interest by printing out the order:
    >>> print(f.attrs["sps_parameter_list"])
From this (or from the info in this README), we can see that column 15 is the one we want. We can extract the redshifts like this:
    >>> redshifts = f["sps_parameters"][:,15]
This will load all 2 million model redshifts into memory.

Say we're also interested in the HSC r-band magnitudes of the model galaxies. We can find out the relevant column of the photometry from the list in this README, or by querying the .attrs:
    >>> print(f.attrs["band_list"])
We'll see that column 2 is the one we want. The r-band magnitudes can then be found like this:
    >>> mags_r = f["magnitudes_log"][:,2]
This will load the r-band magnitudes for the 2 million model galaxies into memory.

An interactive demonstration can be found at https://github.com/Cosmo-Pop/pop-cosmos.
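
As an alternative to hard-coding column numbers such as 15 or 2, the column can be looked up by name. The following is a minimal sketch only, not part of the released tooling. The helpers column_index and load_column are hypothetical names introduced here for illustration. The sketch assumes that the "sps_parameter_list" attribute stores the bare parameter names exactly as written in the table above (e.g. "z"), possibly as bytes rather than str; print the attribute first to check what your copy of the file contains.

```python
import os


def column_index(names, target):
    """Return the position of `target` in a sequence of column names.

    Entries may be str or bytes; bytes are decoded as UTF-8. (Assumption:
    depending on how the attribute was written, either may be returned.)
    """
    decoded = [n.decode("utf-8") if isinstance(n, bytes) else str(n) for n in names]
    return decoded.index(target)  # raises ValueError if the name is absent


def load_column(path, dataset, attr_key, name):
    """Hypothetical helper: read one named column of a 2D dataset."""
    import h5py  # imported here so column_index works without h5py installed

    with h5py.File(path, "r") as f:
        col = column_index(f.attrs[attr_key], name)
        return f[dataset][:, col]


# Parameter order copied from the table in this README.
SPS_NAMES = [
    "log10M_formed", "log10Z",
    "log10sfr_ratio1", "log10sfr_ratio2", "log10sfr_ratio3",
    "log10sfr_ratio4", "log10sfr_ratio5", "log10sfr_ratio6",
    "dust2", "dust_index", "dust1_fraction",
    "lnfAGN", "lntauAGN", "log10Zgas", "log10Ugas", "z",
]

z_col = column_index(SPS_NAMES, "z")
print(z_col)  # 15, the same column used for the redshifts above

# Only touch the catalog if it has been downloaded to the working directory.
path = "mock_catalog_Ch1_26.h5"
if os.path.exists(path):
    redshifts = load_column(path, "sps_parameters", "sps_parameter_list", "z")
    print(redshifts.shape)
```

Looking columns up by name keeps analysis code valid if the column order ever changes between versions of the record. The same pattern should apply to "band_list" and "derived_parameter_list", subject to the same caveat about how the names are stored.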