Tutorial for Analyses Based on GWPopulation

A component of the data release for "The Population of Merging Compact Binaries Inferred Using Gravitational Waves Through GWTC-3"

Authors: Bruce Edelman, Amanda Farah, and Jacob Golomb on behalf of the LIGO Scientific Collaboration, Virgo Collaboration and KAGRA Collaboration

This software is provided under license: Creative Commons Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/legalcode)

This notebook is intented to show readers how to load in the data associated with some of the gwpopulation-based models used in "The population of merging compact binaries inferred using gravitational waves through GWTC-3" (herein "the companion paper"). Here, we will show you how to load in hyperposterior samples, plot probability distributions as a function of source parameters, and make less-polished versions of the plots in the paper. The models highlighed in this particular tutorial are:

These are all included in or built off of the open source python package, gwpopulation, which includes several other parameterized distributions as well, many of which are highlighted in "Population Properties of Compact Objects from the Second LIGO-Virgo Gravitational-Wave Transient Catalog." Note that other parameterized (MultiSource and NS mass models) and non-parametric (Binned Gaussian Process, Flexible Mixtures) models not based on gwpopulation exist and are used in the companion paper. There are separate tutorials for those.

This notebook will not show you how to run the population analysis. If you are intersted in this, please look at the documention for gwpopulation: https://colmtalbot.github.io/gwpopulation/

We will begin by reading in the results of the population inference on our default event list for the BBH population. In this example, we will read in the results for the Powerlaw + Peak model. These results are stored as bilby result files in this data repository. A description of the parameters used in this model can be found in Appendix B1 of the companion paper.

Reading in Hyperposterior Samples

There is a lot of information in the Result object. Lets take a look at some useful attributes.

What you are probably most intersted in are the hyperposterior samples, which are saved as a pandas dataframe.

We can pull 1D marginal distributions from this by picking out a specific column. As an example, lets try the powerlaw slope on m1

We can plot all of these samples with their many correlations using a corner plot:

More documentation on how to make these corner plots to fit your needs can be found here (see also: plot_multiple).

Plotting Differential Merger Rates vs Source Parameters

While informative, these corner plots are not very intuitive to discern useful information. Each sample in the above distribution corresponds to a unique mass, spin, and redshift population distribution. A more informative demonstration of the significance of this posterior distribution is to plot the inferred distributions in parameter space; that is, given the above draws from the posterior distribution, what do the resulting inferred mass, spin, and redshift distributions look like?

We have stored these distributions for the some draws from the hyperposterior in the following hdf5 files in this repository:

Mass: o1o2o3_mass_c_iid_mag_iid_tilt_powerlaw_redshift_mass_data.h5

Spin Magnitude: o1o2o3_mass_c_iid_mag_iid_tilt_powerlaw_redshift_orientation_data.h5

Redshift: o1o2o3_mass_c_iid_mag_iid_tilt_powerlaw_redshift_redshift_data.h5

Begin by reading in the mass distribution-make sure the path below points to the above files in the /analyses/PowerLawPeak subdirectory of the data release!

The HDF5 file for the mass distribution contains both the individual merger rate vs mass lines determined by the population parameters in the posterior distribution, as well as the resulting Posterior Predictive Distribution (PPD), the "average" rate distribution as determined by the draws of the population parameters.

Recall that we have defined the Powerlaw + Peak mass model in terms of two separable one-dimensional distributions: a probability distribution of m1 (primary mass) and distribution for q=m2/m1 (mass ratio). Thus, to get the inferred distribution for one variable we must marginalize over the other. We will do exactly this here; plot the inferred primary mass distribution and mass ratio distribution by marginalizing over the other in turn. Note that what we have stored in these files and what we will be plotting is the differential rate, which is closely related to the probability density in terms of the mass, spin, or redshift.

If you would like the following figures saved, set "outdir" to your desired output directory and set save=True when we call the below function. Let's begin by defining the mass distribution plotting function.

Now let's actually call the above plotting script to generate the inferred mass distribution for the primary mass and mass ratio. Note we set the credibility interval we wish to plot via the "limits" argument in the plotting function. Let's plot the 90% credible region.

Looks good! Compare this with Figure 10 in the companion paper. Now we can make a similar plot of the spin magnitude distribution (in terms of $a_1$ and $a_2$, the dimensionless spin magnitude of the primary and secondary black hole in a binary, respectively).

Again, we begin by noticing we have the set of "lines" and the PPD data, as we had with the mass distribution HDF5 file.

Again, we will plot the 90% credible region of the inferred distribution:

Finally, we can plot the inferred redshift distribution: the probability of a merger occuring at a particular redshift (or distance) based on our inferred evolution of merger rate with redshift. This will follow the above process closely. We begin by loading the path:

Compare this with Figure 13 in the companion paper. Note we see an interesting result here: The rate is decidely increasing with z. This means that we infer that the rate of BBH mergers is larger the farther away we look. If we plot a straight line over this plot, corresponding to a merger rate that does not evolve with redshift, we notice it lies well outside the inferred 90% credible region

We can now apply these skills to the other gwpopulation-based models. Lets try Power Law + Dip + Break

Power Law + Dip + Break

Outputs from this model are stored in analyses/PowerlawDipBreak/. Power Law + Peak was labeled as mass_c, and Power Law + Dip + Break runs are labeled as mass_g and mass_h. g is for the independent pairing model (Eq. 10), whereas h is for the power-law-in-mass-ratio pairing model (Eq. 11).

We will start by looking at the full set of hyperparameters for the independent pairing model.

Note that for the sake of comparison with other joint analyses, runs of this model set the spin distribution to be uniform in magnitude and isotropic in orientation, so the spin parameters are not fit and therefore show uninformative posteriors. This is expected, but ugly to look at. We can ignore them by specifying only the mass distribution parameters.

Now for the mass spectra.

PDB is parameterized in terms of primary and secondary mass (m1 and m2) rather than primary mass and mass ratio. Therefore, the plotting script will be a bit different, so we will write a new one.

Also, the contents of the h5 files containing all of the traces are a little different for PDB.

So we see that we have a few options. We can plot the mass distribution as a function of m1 (p_mass_1), the mass distribution as a function of m2 (p_mass_2), the mass distribution as a function of generic component mass (pdf), or the differential merger rate as a function of either m1 or generic component mass. Here, 'mass' is the 1D grid of masses on which the other entries are evaluated. mass is NOT p(m1), unlike for Powerlaw + Peak and other models parameterized by m1 and q.

Compare this with Figure 5.

If we also want to get the locations of the gap edges (or any other feature, for that matter), we can pull those from the hyperposterior.

As an example, let's slap the medians on the above plot. You could also take whichever percentiles you like, or a maximum posterior value. It is up to you.

Now let's look at one of the other, more flexible, gwpopulation-based models used to analyze the BBH mass spectrum, the Power Law + Spline (PS) model. A description of the model and parameters used in these analyses can be found in Appendix B1 of the companion paper.

Power Law + Spline

Outputs from this model are stored in analyses/PowerlawSpline/. Power Law + Spline was labeled as mass_m, for the particular choice of 20 knots used in this analysis.

Lets start by loading in the result file the same as the previous models.

Now lets plot the (LARGE) corner plot. The PS mass model has 18 parameters that control the knot heights for the 18 interior knots (f1-f18). This one takes a while because there are so many parameters...

Again let's load in the mass_data.h5 file to plot the mass spectrum. This should be identical in format to the mass_PP_path file used above.

Now we can use the mass_spectrum_plot from above to overlay both the PS and PP mass distributions on top of each other, to get a good comparison of where they disagree.

Now lets look at specifically what the spline perutbrations is doing to the underlying powerlaw in our PS model. There is a perturbation applied from 2-100 solar masses with cubic splines, lets generate a plot similar to Figure 12 in the companion paper.