There is a newer version of the record available.

Published May 21, 2025 | Version v3.4.0
Dataset Open

The CRyPTIC Consortium Dataset

  • 1. University of Oxford

Description

In this dataset, all the raw genetic (FASTQ) files were processed using a Mycobacterial pipeline implemented in an online cloud platform. Whilst the bioinformatics components are similar (e.g. Clockwork remains the variant caller), there are some differences. This version includes all samples for which we expect to have WGS and pDST data.

This dataset contains the high-level data tables produced by the CRyPTIC Consortium. It contains information on a large number of M. tuberculosis complex samples that were collected and collated by the project. In total:

  • 53,897 samples have both WGS and pDST data.
  • An additional 11,945 samples only have pDST data.

Due to the size of some of the data tables, the larger ones are stored as PyArrow parquet files. These can be loaded using, for example, pandas, but one ordinarily needs to install pyarrow first (e.g. using pip).
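As a minimal sketch of loading one of the parquet tables with pandas, something like the following should work once pandas and pyarrow are installed (e.g. `pip install pandas pyarrow`). The helper name `load_table`, the file name `EXAMPLE_TABLE.parquet` and the column names in the usage comment are hypothetical placeholders, not names taken from this record; DATA_SCHEMA.pdf is the place to look for the real table and column names.

```python
from pathlib import Path


def load_table(path, columns=None):
    """Read one parquet table into a pandas DataFrame.

    `columns` optionally restricts the read to a subset of columns,
    which can reduce memory use for the largest tables.
    """
    # Imported here so that a missing dependency only fails when a
    # table is actually loaded; requires pandas and pyarrow.
    import pandas as pd

    return pd.read_parquet(Path(path), engine="pyarrow", columns=columns)


# Hypothetical usage (file and column names are assumptions):
# df = load_table("EXAMPLE_TABLE.parquet", columns=["COLUMN_A", "COLUMN_B"])
# print(df.shape)
# print(df.head())
```

Reading only the columns that are needed is optional, but it may help with the tables that are over a gigabyte in size.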

Files

DATA_SCHEMA.pdf

Files (2.6 GB)

Checksum                               Size
md5:aa8b8eb4ac63d5473b2d8cc82a6f8a69   3.7 MB
md5:079413ecb6bc172ab97a0445f8556db5   162.1 MB
md5:693cbffe95d305499779e09d7bb903e6   4.6 kB
md5:8b11bfbb9255da6dfc1b8b7aede767d0   102.7 kB
md5:923d3a193df21698bd6a00f857ab337e   385 Bytes
md5:45b4501ea7c3925af565dbbc6188dec0   2.7 MB
md5:afa5d4d67e0e9ea9e4f325f06d815e9a   820.5 kB
md5:15bc4e893c0dbcf417cd6e7c94e34517   5.9 MB
md5:ebd82e85f71e36de5da10e776b6afe4e   2.6 MB
md5:d5feeeae14304006ba67aaaef84cff03   1.1 GB
md5:cb403c7517ec847467b7980cbc3e5389   5.9 kB
md5:a9eff6b9c527b87a075f2973d052963c   2.4 MB
md5:dfac1d2007ad30bb749f7bb3bcb9645b   18.8 kB
md5:c24c882c8988b5af9940232ada27fb60   1.3 kB
md5:403ba6d904a3846b8d2535be2de4785a   15.1 MB
md5:020b6c0af6c05e19610a59f5ef97b832   1.6 MB
md5:7b2880f45079c74e88aa4d12c1c18167   2.4 MB
md5:45dcd2628e147f90391fb37cdebb845b   1.3 GB
md5:ea798f4cfc28525cf394ff9196c93021   9.4 MB
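
The md5 checksums listed above can be used to confirm that a file downloaded intact, which is worth doing for the larger tables. The sketch below uses only the Python standard library; the function names and the file name in the usage comment are hypothetical, while the checksum shown there is the one listed for the 1.1 GB file.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in chunks
    so that multi-gigabyte files do not need to fit in memory."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Hypothetical usage (the file name is an assumption):
# ok = matches_listed_checksum(
#     "EXAMPLE_TABLE.parquet",
#     "md5:d5feeeae14304006ba67aaaef84cff03",
# )
# print("checksum OK" if ok else "checksum mismatch, download again")
```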

Additional details

Funding

Wellcome Trust
The CRyPTIC Consortium 200205/Z/15/Z
Bill & Melinda Gates Foundation
The CRyPTIC Consortium OPP1133541