Published October 15, 2021 | Version DR1 (v8.0)
Dataset Open

Transient Host Exchange


The First Public Data Release (DR1) of the Transient Host Exchange (THEx) Dataset

Paper describing the dataset: “Linking Extragalactic Transients and their Host Galaxy Properties: Transient Sample, Multi-Wavelength Host Identification, and Database Construction” (Qin et al. 2021)

The data release contains four compressed archives.

“BSON export” is a binary export of the “host_summary” collection, which is the “full version” of the dataset. The schema is presented in the Appendix of the paper.

You need a running MongoDB server to use this version of the dataset. Once the server is set up, you can import the BSON file into your local database as a collection with the “mongorestore” command.
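As a minimal sketch of the import step, the helper below builds a “mongorestore” invocation for a single BSON file and runs it from Python. The database name “thex” and the BSON file path are assumptions chosen for illustration; only the collection name “host_summary” comes from this release.

```python
import subprocess
from typing import List


def mongorestore_command(bson_path: str,
                         db: str = "thex",
                         collection: str = "host_summary") -> List[str]:
    """Build the argument list for restoring one BSON file into db.collection.

    The database name "thex" is a hypothetical choice; use any name you like.
    """
    return ["mongorestore", f"--db={db}", f"--collection={collection}", bson_path]


def restore(bson_path: str, db: str = "thex",
            collection: str = "host_summary") -> None:
    """Run mongorestore against the default local server (localhost:27017)."""
    subprocess.run(mongorestore_command(bson_path, db, collection), check=True)
```

Calling `restore("path/to/export.bson")` requires the MongoDB Database Tools to be installed and a server to be running; the path shown here is a placeholder, not the actual file name in the archive.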

You may find some useful tutorials for setting up the server and importing BSON files into your local database at:

Once the BSON snapshot is imported into your local database, you can run common operations such as queries and aggregations. An official tutorial can be found at:

Other packages (e.g., pymongo for Python) and software can also perform these database operations.
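For example, a query through pymongo might look like the sketch below. It assumes the collection was restored into a local database named “thex” (a hypothetical name) and that the server listens on the default port. The query filter is left to the caller, because the field names are defined by the schema in the paper and are not reproduced here.

```python
from typing import Any, Dict, List, Optional


def open_collection(uri: str = "mongodb://localhost:27017",
                    db_name: str = "thex",
                    name: str = "host_summary"):
    """Return a handle to the restored collection (db_name is an assumption)."""
    from pymongo import MongoClient  # imported lazily; needs `pip install pymongo`
    return MongoClient(uri)[db_name][name]


def fetch_events(collection, query: Optional[Dict[str, Any]] = None,
                 limit: int = 10) -> List[Dict[str, Any]]:
    """Return up to `limit` documents matching `query` (all documents if None)."""
    cursor = collection.find(query or {}).limit(limit)
    return list(cursor)
```

A call such as `fetch_events(open_collection(), limit=5)` returns the first five event documents as Python dictionaries.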

“JSON export” is a compressed archive of JSON files. Each file, named by the unique id and the preferred name of the event, contains the complete host data of a single event. The data schema and contents are identical to those of the “BSON” version.
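Since each JSON file holds one event, the extracted archive can be read without a database. The sketch below walks a directory and yields each event as a dictionary; the directory layout and the “.json” file extension are assumptions about the extracted archive.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator, Tuple


def iter_events(directory: str) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (file stem, parsed document) for every JSON file under `directory`.

    The stem is the file name without its extension, which per the release
    notes encodes the unique id and preferred name of the event.
    """
    for path in sorted(Path(directory).rglob("*.json")):
        with path.open(encoding="utf-8") as fh:
            yield path.stem, json.load(fh)
```

Iterating lazily keeps memory use low, which matters because the full archive is several gigabytes.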

“NumPy export” contains a series of NumPy tables in “npy” format. Rows correspond one-to-one across these files. Except for the “master table” (THEx-v8.0-release-assembled.npy), which contains all the columns, each file contains the host properties cross-matched from a single external catalog. The metadata and ancillary data are summarized in THEx-v8.0-release-assembled-index.npy.
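Because the tables share row order, row i in one file describes the same event as row i in any other. A minimal sketch for loading several tables and checking that they are aligned is given below; it assumes the files load as plain arrays without pickled objects.

```python
from typing import Dict, Iterable

import numpy as np


def load_aligned(paths: Iterable[str]) -> Dict[str, np.ndarray]:
    """Load several .npy tables and verify that they have the same row count."""
    # Assumption: no object-dtype columns, so allow_pickle is not needed.
    tables = {path: np.load(path) for path in paths}
    lengths = {len(table) for table in tables.values()}
    if len(lengths) > 1:
        raise ValueError(f"tables are not row-aligned: lengths {sorted(lengths)}")
    return tables
```

For instance, passing THEx-v8.0-release-assembled-index.npy together with one per-catalog table lets you look up metadata and catalog properties for the same event by a shared row index.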

There is also a THEx-v8.0-release-typerowmask.npy file, whose rows are co-indexed with the other files and whose columns are named after the transient types. The “rowmask” file allows you to select the subset of events of a specific transient type.
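The row mask can be applied as in the sketch below, which assumes the “rowmask” file loads as a structured array with one boolean-like column per transient type. The exact type names are defined by the file itself, so the name used in the usage note is a placeholder.

```python
import numpy as np


def select_by_type(table: np.ndarray, rowmask: np.ndarray,
                   type_name: str) -> np.ndarray:
    """Return the rows of `table` flagged for `type_name` in `rowmask`."""
    names = rowmask.dtype.names or ()
    if type_name not in names:
        raise KeyError(f"{type_name!r} is not a column of the rowmask table")
    if len(table) != len(rowmask):
        raise ValueError("table and rowmask must have the same number of rows")
    mask = rowmask[type_name].astype(bool)
    return table[mask]
```

After loading both files with `np.load`, a call like `select_by_type(index, rowmask, "SomeType")` returns only the matching events; inspect `rowmask.dtype.names` to see the available type names.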

Note that this version includes only the cataloged properties of the confirmed hosts or primary candidates. If the confirmed host (or primary candidate) is cross-matched to multiple sources in a specific catalog, only the representative source is used for the host properties. Properties of other cross-matched groups are not included.

The table THEx-v8.0-release-MWExt.npy contains the calculated foreground extinction (in magnitudes) at the host positions. These extinction values have not been applied to the magnitude columns in our dataset. You need to apply this correction yourself if desired.
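Applying the foreground-extinction correction amounts to subtracting the extinction from the observed magnitude, since extinction makes a source appear fainter (larger magnitude). The sketch below assumes you have already picked a magnitude column and the extinction column for the same band; the column names in the two tables are not listed here, so none are hard-coded.

```python
import numpy as np


def correct_extinction(observed_mag, extinction_mag) -> np.ndarray:
    """Return extinction-corrected magnitudes: m_corrected = m_observed - A.

    Both inputs must refer to the same band and be row-aligned.
    """
    observed = np.asarray(observed_mag, dtype=float)
    extinction = np.asarray(extinction_mag, dtype=float)
    return observed - extinction
```

Missing values stored as NaN propagate through the subtraction, so rows without a measurement stay flagged as missing.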

“FITS export” includes the same individual tables as “NumPy export”. However, the FITS standard limits the number of columns in a table (to at most 999). Therefore, we do not include the “master table” in “FITS export.”


In the BSON and JSON versions, cross-matched groups (under the “groups” key) are ordered by the default ranking function. Even if the first group in this list (namely, the confirmed host or primary host candidate) is mismatched or misidentified, we keep it in its original position. The result of visual inspection, including our manual reassignments, is summarized under the “vis_insp” key.
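In practice this means the first entry of “groups” should be read together with “vis_insp” before it is trusted as the host. A minimal sketch, taking an event document as a dictionary (from either the BSON or the JSON version), is shown below; the internal structure of “vis_insp” is described in the paper and is simply passed through here.

```python
from typing import Any, Dict, Optional, Tuple


def primary_group_and_inspection(
        event: Dict[str, Any]) -> Tuple[Optional[Any], Optional[Any]]:
    """Return (first cross-matched group, visual-inspection record) for an event.

    The first group is the top-ranked candidate under the default ranking,
    which is not necessarily the host adopted after manual reassignment.
    """
    groups = event.get("groups") or []
    primary = groups[0] if groups else None
    return primary, event.get("vis_insp")
```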

For the NumPy and FITS versions, if we have manually reassigned the host of an event, the data presented in these tables are updated accordingly. You may use the “case_code” column in the “index” file to find the result of visual inspection and manual reassignment; the flags used in this column are listed in case-code.txt. Generally, codes “A1” and “F1” mark known and new hosts, respectively, that passed our visual inspection, while codes “B1” and “G1” mark mismatched known hosts and possibly misidentified new hosts, respectively, that have been manually reassigned.
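Selecting events by inspection outcome can be sketched as follows, assuming the “index” file loads as a structured array whose “case_code” column holds short strings. Only the four codes named above are used; the full list is in case-code.txt.

```python
from typing import Iterable

import numpy as np

PASSED_CODES = ("A1", "F1")      # hosts that passed visual inspection
REASSIGNED_CODES = ("B1", "G1")  # hosts that were manually reassigned


def rows_with_codes(index: np.ndarray, codes: Iterable[str]) -> np.ndarray:
    """Return a boolean mask of rows whose case_code is one of `codes`."""
    # astype(str) also handles columns stored as bytes rather than text.
    case_codes = np.asarray(index["case_code"]).astype(str)
    return np.isin(case_codes, list(codes))
```

Because all tables are row-aligned, the mask returned for the “index” file can be applied directly to any per-catalog table, for example `catalog_table[rows_with_codes(index, PASSED_CODES)]`.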



Files (10.0 GB)

Name Size
3.2 kB
4.2 GB
198.3 MB
5.2 GB
401.5 MB