Published May 19, 2021 | Version 1.0.initial_submission
Dataset Open

Additional supplementary data for "Uncertainty-aware and interpretable evaluation of Cas9-gRNA and Cas12a-gRNA specificity for fully matched and partially mismatched targets with Deep Kernel Learning"

  • 1. Skolkovo Institute of Science and Technology
  • 2. National Center for Biotechnology Information

Description

Additional supplementary data for the paper.

ClusterLogos.pdf - The sequence logos for each cluster of gRNAs from the LOC440792 gene.

gRNA.xlsx - The set of gRNAs found in chromosome 22 of the HG38 human reference genome, with the corresponding estimates of cleavage efficiency and its variance. It is a single Excel file with 16 sheets, one for each trained on-target model (DeepHF and DeepCpf1 datasets). Each sheet contains the following fields:

1. gene - the identifier of the gene;

2. position - starting position of the gRNA target;

3. gRNA - the sequence of gRNA target;

4. strand - the strand of DNA;

5. mean - the mean cleavage efficiency estimated by a model;

6. variance - the variance of cleavage efficiency prediction.
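The six fields above can be used together to shortlist guides that a model predicts to be efficient and is also confident about. The following is a minimal sketch under stated assumptions, not the authors' procedure: the `GuideRecord` class, the `rank_guides` function, the `max_variance` threshold, the column types, and the "+"/"-" strand encoding are all hypothetical, and the example rows are made up rather than taken from gRNA.xlsx.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class GuideRecord:
    """One row of a sheet; field names follow the list above, types are assumed."""
    gene: str
    position: int
    gRNA: str
    strand: str       # assumed to be "+" or "-"
    mean: float       # predicted mean cleavage efficiency
    variance: float   # variance of that prediction


def rank_guides(records: Iterable[GuideRecord], max_variance: float) -> List[GuideRecord]:
    """Keep guides whose predictive variance is at most max_variance,
    sorted by predicted mean efficiency, highest first."""
    confident = [r for r in records if r.variance <= max_variance]
    return sorted(confident, key=lambda r: r.mean, reverse=True)


# Made-up rows for illustration only; these values are NOT from the dataset.
example_rows = [
    GuideRecord("GENE_A", 100, "ACGTACGTACGTACGTACGT", "+", 0.62, 0.010),
    GuideRecord("GENE_A", 250, "TTGCATGCATGCATGCATGC", "-", 0.81, 0.200),
    GuideRecord("GENE_B", 900, "GGCATTCAGGCATTCAGGCA", "+", 0.74, 0.015),
]

ranked = rank_guides(example_rows, max_variance=0.05)
print([(r.gene, r.position) for r in ranked])
```

In this toy input the guide with the highest mean is dropped because its variance exceeds the threshold. To apply the same idea to the real file, the rows of each sheet would first have to be loaded with a spreadsheet reader; for example, `pandas.read_excel(path, sheet_name=None)` returns a dictionary of DataFrames keyed by sheet name.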

Files

Files (87.1 MB)

md5:6e4734a7227160a56961b4fe85696d2c (785.8 kB)

md5:fb41a8d5b658092a62fda59423929195 (86.3 MB)
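The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `matches_listed_checksum` are hypothetical, and the demonstration uses a temporary file with made-up contents rather than the dataset files themselves.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Demonstration on a temporary file (not one of the dataset files).
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.bin")
    with open(sample, "wb") as handle:
        handle.write(b"example bytes")
    listed = "md5:" + hashlib.md5(b"example bytes").hexdigest()
    demo_ok = matches_listed_checksum(sample, listed)

print(demo_ok)
```

For a real check, the path of the downloaded file and the corresponding `md5:` string from the listing would be passed to `matches_listed_checksum`.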