There is a newer version of the record available.

Published April 14, 2023 | Version v2
Dataset Open

Simulated datasets for comparison between fastBCR and other SOTA methods.

  • 1. Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China

Description

Simulated datasets for comparing fastBCR with other state-of-the-art (SOTA) methods for BCR clonal family inference. The simulation began by generating an ancestor cell through random V(D)J recombination and then simulated antigen activation, which drives multiple rounds of proliferation, mutation in the junction region, and elimination, ultimately producing a B cell clonal family. We set a baseline mutation rate of approximately 10⁻³ mutations per base pair per cell division, as reported previously (Kleinstein, S. H., Louzoun, Y. & Shlomchik, M. J. Estimating hypermutation rates from clonal tree data. The Journal of Immunology 171, 4639-4649 (2003)). To account for low cell capture efficiency and potential sequencing errors, we additionally used three higher mutation rates (0.002, 0.005, and 0.01). To investigate how the methods perform across clonal family densities and mutation rates, we used this framework to generate simulated datasets comprising 10, 30, 50, and 100 clonal families. Each dataset was further augmented with 20,000 randomly simulated ancestor sequences as noise.
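The process described above can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions, not the code used to produce the datasets: the ancestor is a random nucleotide string standing in for real V(D)J recombination, and the sequence length, junction coordinates, number of division rounds, and survival probability are all hypothetical values chosen for illustration. Only the mutation rates, the family counts, and the 20,000 noise sequences come from the description.

```python
import random

BASES = "ACGT"
SEQ_LEN = 54          # hypothetical total sequence length
JUNCTION_START = 27   # hypothetical junction coordinates
JUNCTION_END = 42


def random_ancestor(rng):
    """Toy stand-in for V(D)J recombination: a uniformly random sequence."""
    return "".join(rng.choice(BASES) for _ in range(SEQ_LEN))


def mutate_junction(seq, rate, rng):
    """Substitute each junction base independently with probability `rate`."""
    bases = list(seq)
    for i in range(JUNCTION_START, JUNCTION_END):
        if rng.random() < rate:
            bases[i] = rng.choice([b for b in BASES if b != bases[i]])
    return "".join(bases)


def simulate_family(rng, rate, rounds=5, survival=0.8):
    """Grow one clonal family by repeated division, mutation, and elimination."""
    cells = [random_ancestor(rng)]
    for _ in range(rounds):
        # Proliferation: each cell divides into two mutated daughters.
        daughters = [mutate_junction(c, rate, rng) for c in cells for _ in range(2)]
        # Elimination: each daughter survives with probability `survival`.
        survivors = [c for c in daughters if rng.random() < survival]
        cells = survivors or daughters[:1]  # never let the family die out
    return cells


def simulate_dataset(n_families, rate, n_noise, seed=0):
    """Return shuffled (family_id, sequence) pairs; noise has family_id -1."""
    rng = random.Random(seed)
    records = []
    for family_id in range(n_families):
        for seq in simulate_family(rng, rate):
            records.append((family_id, seq))
    for _ in range(n_noise):
        records.append((-1, random_ancestor(rng)))
    rng.shuffle(records)
    return records


if __name__ == "__main__":
    # Values taken from the description: baseline rate, 10 families, 20,000 noise.
    dataset = simulate_dataset(n_families=10, rate=0.001, n_noise=20000)
    print(len(dataset), "sequences simulated")
```

Calling `simulate_dataset` with each of the four mutation rates (0.001, 0.002, 0.005, 0.01) and each of the four family counts (10, 30, 50, 100) would give one dataset per combination. Whether the published archive contains every combination is an assumption to check against its contents.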

Files

Files (69.4 MB)

Name: simulated_data.zip
Size: 69.4 MB
md5:3c0293d2f79f680d08c1b25b70a9a081
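
After downloading, the archive can be checked against the md5 checksum listed above. The following is a minimal Python sketch using only the standard library. The local path `simulated_data.zip` is an assumption about where the file was saved, and the helper names `md5_of_file` and `verify_download` are hypothetical.

```python
import hashlib

# Checksum as listed on this record for simulated_data.zip.
EXPECTED_MD5 = "3c0293d2f79f680d08c1b25b70a9a081"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected md5 digest."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_download("simulated_data.zip")` should return `True` for an intact download of this version of the record. A `False` result suggests a corrupted file or a different version.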