Journal article Open Access

Collection of codes and annotated matrix for the paper "A cell atlas of human thymic development defines T cell repertoire formation"

Jong-Eun Park; Sarah Teichmann; Muzlifah Haniffa; Tom Taghon

This record collects the code and annotated matrices described in the paper “A cell atlas of human thymic development defines T cell repertoire formation”.

This repository contains:

  • 'scjp' package to assist single-cell data analysis
  • jupyter notebooks which show the process of analysis for all figures
  • annotated, normalised matrix in h5ad format
  • csv files containing the metadata
  • raw count matrix in h5ad format
  • VDJ files (cellranger output)

Each item is described below:

"sample_metadata_fix.xlsx" is:

  • metadata for all samples generated
  • contains the file keys for matching gene expression and VDJ data (the errors in previous versions are fixed here)

"thymus_code_package.zip" contains:

  • *.ipynb: jupyter notebooks describing the analysis
  • F00_global_variables.py: contains global variables shared across multiple notebooks
  • scjp: Python package supporting the single-cell analysis (not intended for distribution: some dependency issues still need to be fixed, and a final version is in preparation)

"thymus_annotated_matrix_files.zip" contains:

  • *.csv: metadata including annotation per cell (.obs in scanpy anndata)
  • *.h5ad: AnnData files containing the normalised read-count matrix and per-cell metadata, including annotation (navigate them with the Python scanpy package; see 'Data_navigator.ipynb' for a tutorial)
  • Data_navigator.ipynb: jupyter notebook describing each dataset
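
A quick way to get an overview of the per-cell metadata is to tabulate one annotation column of a *.csv file. The following is a minimal sketch using only the Python standard library. The column name `cell_type` and the miniature table are hypothetical stand-ins; the real column names should be taken from the files themselves or from 'Data_navigator.ipynb'.

```python
import csv
import io
from collections import Counter


def count_labels(handle, column):
    """Count how many cells carry each value of `column` in a per-cell metadata CSV."""
    reader = csv.DictReader(handle)
    if column not in (reader.fieldnames or []):
        raise KeyError(f"column {column!r} not found; available: {reader.fieldnames}")
    return Counter(row[column] for row in reader)


# Hypothetical miniature table standing in for one of the *.csv files.
# The header names "index" and "cell_type" are assumptions for illustration.
example = io.StringIO(
    "index,cell_type\n"
    "sampleA-AAAC,DN\n"
    "sampleA-AAAG,DP\n"
    "sampleB-AAAT,DP\n"
)
counts = count_labels(example, "cell_type")
print(counts.most_common())
```

For a real file, pass an open file object instead of the in-memory example, for instance `open(path, newline="")`, where `path` points to one of the extracted *.csv files.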

"HTA07.A01.v02.entire_data_raw_count.h5ad" is:

  • raw count matrix for all human data generated in the study
  • can be matched with the annotated datasets (the *.h5ad files described above) by observation name
  • 'adata.obs_names' has the format '{filename}-{cellbarcode}'
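
The matching by observation name can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the regular expression assumes that a cell barcode is a run of A/C/G/T characters, optionally followed by a numeric suffix such as "-1", and the example names are made up.

```python
import re

# Assumption: barcode = nucleotides, optionally followed by "-<digits>".
# Everything before the final "-<barcode>" is treated as the filename.
_OBS_NAME = re.compile(r"^(?P<filename>.+)-(?P<barcode>[ACGT]+(?:-\d+)?)$")


def split_obs_name(obs_name):
    """Split '{filename}-{cellbarcode}' into (filename, cellbarcode)."""
    match = _OBS_NAME.match(obs_name)
    if match is None:
        raise ValueError(f"unexpected observation name: {obs_name!r}")
    return match.group("filename"), match.group("barcode")


def shared_obs_names(raw_names, annotated_names):
    """Return names present in both datasets, in the order of the annotated one."""
    raw_set = set(raw_names)
    return [name for name in annotated_names if name in raw_set]


# Made-up names for illustration only.
raw_names = ["s1-AAAC", "s1-AAAG", "s2-AAAT"]
annotated_names = ["s2-AAAT", "s1-AAAC"]

shared = shared_obs_names(raw_names, annotated_names)
print(shared)
print(split_obs_name("s1-AAAC"))
```

With both files loaded as AnnData objects, the lists would come from `adata.obs_names`, and the resulting list of shared names could then be used to subset the raw count matrix.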

"thymus_vdj.zip" contains:

  • cellranger output *.csv files for VDJ data analysis
  • VDJ files can be matched to gene expression files using the information in 'sample_metadata_fix.xlsx'
    • matched files should share the same cell barcodes
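
Once a VDJ file has been paired with its gene expression file through the sample metadata, contigs can be grouped per cell barcode and given names in the '{filename}-{cellbarcode}' format. The sketch below is hedged: the column names `barcode` and `chain`, the filename `sampleA`, and the example rows are assumptions, and whether a numeric barcode suffix such as "-1" has to be stripped depends on how the observation names were built, so it is left as an option.

```python
import csv
import io
from collections import defaultdict


def contigs_by_barcode(handle, barcode_column="barcode"):
    """Group the rows of a VDJ contig CSV by cell barcode."""
    grouped = defaultdict(list)
    for row in csv.DictReader(handle):
        grouped[row[barcode_column]].append(row)
    return dict(grouped)


def to_obs_name(gex_filename, barcode, strip_suffix=False):
    """Build '{filename}-{cellbarcode}'; optionally drop a '-<n>' barcode suffix."""
    if strip_suffix:
        barcode = barcode.split("-")[0]
    return f"{gex_filename}-{barcode}"


# Made-up contig table; column names are assumptions for illustration.
example = io.StringIO(
    "barcode,chain,cdr3\n"
    "AAAC-1,TRA,CAVF\n"
    "AAAC-1,TRB,CASSF\n"
    "AAAG-1,TRB,CASSL\n"
)
grouped = contigs_by_barcode(example)

# "sampleA" is a hypothetical gene expression file key from the metadata sheet.
chains = {
    to_obs_name("sampleA", barcode, strip_suffix=True): sorted(r["chain"] for r in rows)
    for barcode, rows in grouped.items()
}
print(chains)
```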

Please also check https://github.com/Teichlab/thymusatlas for future updates.

Any additional materials not covered here will be added to that GitHub repository.

For any questions, please contact jp24@sanger.ac.uk or jepark87@gmail.com.

Files (4.7 GB)

  • HTA07.A01.v02.entire_data_raw_count.h5ad (1.6 GB), md5:496cfaeb6d227b2bb32d52abd2249085
  • sample_metadata_fix.xlsx (24.0 kB), md5:16fcb558c27016665f0c946d29c2b9aa
  • thymus_annotated_matrix_files.zip (2.5 GB), md5:5eb2cb0ccc7f87dcddb68d785517de9d
  • thymus_code_package.zip (137.1 MB), md5:57d189b2c87501a63d2771776df4f2c8
  • thymus_vdj.zip (493.3 MB), md5:ef8067b153673f5edab89a73c0a189da
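
The md5 values listed above can be used to check that a download completed without corruption. A minimal sketch with Python's `hashlib` follows; the local file path in the usage comment is an assumption about where the file was saved.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file's md5 digest equals the expected value."""
    return md5_of_file(path) == expected_md5.lower()


# Usage, assuming the archive was saved in the current directory:
# verify("thymus_vdj.zip", "ef8067b153673f5edab89a73c0a189da")
```

Reading in chunks matters here because the largest files are several gigabytes in size.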