Published April 14, 2025 | Version v2
Dataset Open

Murine datasets and computational analysis: Prdm16_pos antigen-presenting cells

Authors/Creators

  • 1. New York University School of Medicine

Description

Prdm16-dependent antigen-presenting cells induce tolerance to gut antigens

This repository contains all code and computational analysis for the murine datasets associated with the above paper. Items are organized as follows:

  • 2024-04-26/ contains Delta+7kb mouse model raw sequencing data.
  • 2024-06-21/ contains Rorc(t)_DeltaCD11c mouse model raw sequencing data.
  • 2024-10-16/ contains Multiome (RNA+ATAC) raw sequencing data.
  • 2022AkagbosuEtAl/ contains the public Multiome dataset from PMID 36070798, which is integrated alongside our own similar data. (It was generated by downloading from GEO repository GSE174405, followed by initial processing through CellRanger with default parameters.)
  • 2014-11-26/ contains bulk RNA sequencing data in bigWig format, as input for the Broad IGV tool, to ultimately generate Figure 1b.
  • Delta7kB_Murine_Analysis/ contains all code in R to process finalized datasets, ultimately generating Figures 2a-c, and ExtData Fig 4a-e.
  • Cd11c_Murine_Analysis/ contains all code in R to process finalized datasets, ultimately generating Figure 2d.
  • Multiome_Murine_Analysis/ contains all code in R to process finalized datasets, ultimately generating Figure 2e, ExtData Fig 4f-g, and ExtData Fig 5a-h.
  • OUTPUT/ contains the final RDS files as output by each workflow, to use as input for each Analysis script. A reader may choose to skip most of the above workflow and download only these files to generate the Figures immediately and/or explore the data further.

In general, a reader can follow the A_import scripts to create initial Seurat objects, the B_integrate scripts to integrate separate sequencing runs, the C_annotate scripts to assign finalized cell types, and the D_analysis scripts to generate all plots. In principle, each workflow only requires installing the library dependencies and setting the file paths for the INPUT and OUTPUT directories.
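The stage order described above can be sketched as a small helper that sorts script files by their stage prefix. This is only an illustration: the prefixes A_import, B_integrate, C_annotate, and D_analysis come from the description, but the example file names and the function name order_scripts are hypothetical and may not match the actual file names in the repository.

```python
from pathlib import Path

# Stage prefixes named in the description, in the order they are meant to run.
STAGES = ["A_import", "B_integrate", "C_annotate", "D_analysis"]


def order_scripts(paths):
    """Return script paths sorted by workflow stage.

    Files whose names do not start with one of the stage prefixes are dropped.
    Within a stage, files are sorted alphabetically by path.
    """
    def stage_index(path):
        name = Path(path).name
        for i, stage in enumerate(STAGES):
            if name.startswith(stage):
                return i
        return None

    staged = [(stage_index(p), str(p)) for p in paths]
    return [p for i, p in sorted((i, p) for i, p in staged if i is not None)]


if __name__ == "__main__":
    # Hypothetical file names, used only to show the ordering.
    example = ["D_analysis.R", "B_integrate.R", "README.md", "A_import.R", "C_annotate.R"]
    for script in order_scripts(example):
        print(script)
```

A reader who starts from the RDS files in OUTPUT/ would only need the last stage.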

Please note that several of the Integration scripts are particularly computationally intensive. These were successfully run by allocating 512 GB of memory within our university’s supercomputing environment.

As emphasized within the code annotations, this entire workflow is run with R 4.3.2 and Seurat 5.1.0. Our computation relies on that version's FindClusters implementation, a shared nearest neighbor modularity optimization based clustering algorithm; Seurat 5.2 is not compatible. At these steps we specify the Leiden algorithm rather than the default Louvain algorithm, because Leiden offers certain computational advantages (see PMID 30914743). While results with Louvain would likely reach the same biological conclusions, the final clusters and annotations we provide here are only useful with the Leiden implementation. Finally, we are grateful to our institution’s supercomputer admins, who helped establish a virtual environment with R-Reticulate, which allows R to call the Leiden algorithm (implemented in Python). Readers will similarly need to establish this dependency based on their local environment.
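Before running the clustering steps, it may help to confirm that the Python environment used by R-Reticulate can see the Leiden implementation. The sketch below assumes the relevant Python modules are leidenalg (the module the Seurat FindClusters documentation names as a requirement for Leiden) and igraph, which leidenalg builds on. The function name missing_modules is hypothetical, and the exact module set needed in a given environment may differ.

```python
import importlib.util
import sys


def missing_modules(names=("leidenalg", "igraph")):
    """Return the top-level module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Run this with the same Python interpreter that reticulate is configured to use.
    print("Python executable:", sys.executable)
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All assumed Leiden dependencies were found.")
```

The check only reports whether the modules can be located; it does not verify versions or that R itself is pointed at this interpreter.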

 

Files

2024-04-26.zip

Files (20.6 GB)

Checksum (MD5)                          Size
md5:3ee11e2862cf09da8d9b2ee2a83638fd    773.4 MB
md5:3dad04327cf7ff29e9d50b15de2a052b    2.4 GB
md5:20757be05d7f7474188529827b176cfb    94.6 MB
md5:cf627e75e4a0a9e9d9a763c072a0bba3    103.1 MB
md5:1cb81734fc9198d35bfaa30ca4446604    4.1 GB
md5:b85e0be8828c2a4e0009ebdf7b04ccac    9.4 kB
md5:c612d6605672e458d515c10c627cd20d    11.0 kB
md5:69d11d64df63155c05a8b1920a74e4af    12.8 kB
md5:023a34e3854d39e2f3b068eefdb4d8fc    13.1 GB
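
Because several of these files are multiple gigabytes, it can be worth checking a download against the md5 values listed above before starting a workflow. The following is a minimal sketch using only the Python standard library. The function names and the file path in the usage example are hypothetical placeholders, since this listing does not show which file name belongs to which checksum.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Return True if the file's MD5 matches `expected`.

    `expected` may be given with or without the "md5:" prefix used in the listing.
    """
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

For example, verify_md5("path/to/downloaded_file.zip", "md5:3ee11e2862cf09da8d9b2ee2a83638fd") would return True only if the placeholder path points to the file carrying that checksum.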