The CRyPTIC Consortium Dataset
Authors/Creators
Description
This version of the dataset was produced by processing all the raw sequencing (FASTQ) files with a Mycobacterial pipeline implemented on an online cloud platform. Whilst the bioinformatics components are similar (e.g. Clockwork remains the variant caller), there are some differences. This version includes all samples for which we expect to have whole-genome sequencing (WGS) and phenotypic drug susceptibility testing (pDST) data.
This dataset contains the high-level data tables produced by the CRyPTIC Consortium. It contains information on a large number of M. tuberculosis complex samples that were collected and collated by the project. In total:
- 53,897 samples have both WGS and pDST data.
- An additional 11,945 samples only have pDST data.
Because some of the data tables are large, the larger ones are stored as Parquet files written with PyArrow. They can be loaded with pandas, although pyarrow usually has to be installed first (e.g. with pip).
Files (2.6 GB in total)
DATA_SCHEMA.pdf
| MD5 checksum | Size |
|---|---|
| aa8b8eb4ac63d5473b2d8cc82a6f8a69 | 3.7 MB |
| 079413ecb6bc172ab97a0445f8556db5 | 162.1 MB |
| 693cbffe95d305499779e09d7bb903e6 | 4.6 kB |
| 8b11bfbb9255da6dfc1b8b7aede767d0 | 102.7 kB |
| 923d3a193df21698bd6a00f857ab337e | 385 Bytes |
| 45b4501ea7c3925af565dbbc6188dec0 | 2.7 MB |
| afa5d4d67e0e9ea9e4f325f06d815e9a | 820.5 kB |
| 15bc4e893c0dbcf417cd6e7c94e34517 | 5.9 MB |
| ebd82e85f71e36de5da10e776b6afe4e | 2.6 MB |
| d5feeeae14304006ba67aaaef84cff03 | 1.1 GB |
| cb403c7517ec847467b7980cbc3e5389 | 5.9 kB |
| a9eff6b9c527b87a075f2973d052963c | 2.4 MB |
| dfac1d2007ad30bb749f7bb3bcb9645b | 18.8 kB |
| c24c882c8988b5af9940232ada27fb60 | 1.3 kB |
| 403ba6d904a3846b8d2535be2de4785a | 15.1 MB |
| 020b6c0af6c05e19610a59f5ef97b832 | 1.6 MB |
| 7b2880f45079c74e88aa4d12c1c18167 | 2.4 MB |
| 45dcd2628e147f90391fb37cdebb845b | 1.3 GB |
| ea798f4cfc28525cf394ff9196c93021 | 9.4 MB |
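The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the function names and the file name in the usage comment are hypothetical, and the expected checksum should be copied from the entry for the file actually downloaded.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for the gigabyte-scale files.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with the expected value (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Hypothetical usage (the file name is an assumption; the checksum must be
# the one listed for that file):
# ok = matches_checksum("EXAMPLE_TABLE.parquet", "<md5 from the file list>")
```

MD5 is used here only as an integrity check against accidental corruption, which is how the record publishes it; it is not a guarantee against deliberate tampering.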
Additional details
Funding
- Wellcome Trust: The CRyPTIC Consortium (grant 200205/Z/15/Z)
- Bill & Melinda Gates Foundation: The CRyPTIC Consortium (grant OPP1133541)