Published September 26, 2022 | Version 0.0.0
Dataset Open

Sei whole-genome sequence class annotations

  • 1. Princeton University

Description

Sei sequence class whole-genome annotations are available in the following files:

  • sorted.hg38.tiling.bed.ipca_randomized_300.labels.merged.bed - Sorted, merged sequence class assignments from Louvain community clustering of the 30 million sequences that uniformly tile the whole human genome. The fourth column is the sequence class number. Sequence classes numbered 40-61 were excluded from the analyses in the publication. Sequence classes 0-39 can be mapped to the following labels: https://github.com/FunctionLab/sei-framework/blob/main/model/seqclass.names

  • sorted.hg19.tiling.bed.ipca_randomized_300.labels.merged.bed - A lifted-over (hg19) version of the hg38 BED file.
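
As a minimal sketch of how the annotations might be read, the Python snippet below parses BED lines and keeps only intervals whose sequence class falls in the 0-39 range used in the publication. It assumes the files are tab-separated, that the first three columns are the standard BED chromosome, start, and end fields, and that the fourth column holds an integer class number. The names Interval and parse_seqclass_bed are hypothetical and are not part of the Sei framework.

```python
import csv
from typing import Iterable, Iterator, NamedTuple


class Interval(NamedTuple):
    """One annotated region (hypothetical container, not a Sei type)."""
    chrom: str
    start: int
    end: int
    seq_class: int


def parse_seqclass_bed(lines: Iterable[str], max_class: int = 39) -> Iterator[Interval]:
    """Yield intervals whose sequence class is in 0..max_class.

    Assumes tab-separated BED lines with an integer class number in
    the fourth column; header-like lines and blank lines are skipped.
    """
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith(("#", "track", "browser")):
            continue
        seq_class = int(row[3])
        if 0 <= seq_class <= max_class:
            yield Interval(row[0], int(row[1]), int(row[2]), seq_class)


# Small in-memory example; the coordinates and classes here are made up.
sample = [
    "chr1\t0\t100\t5\n",
    "chr1\t100\t200\t45\n",  # class 45 is in the excluded 40-61 range
    "chr2\t300\t400\t39\n",
]
kept = list(parse_seqclass_bed(sample))
print(kept)
```

To apply the same function to one of the downloaded files, pass an open text file handle (for example, the result of open(path, newline="")) in place of the sample list. The function reads lazily, so the whole file does not need to fit in memory.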

Files (524.6 MB)