Published November 27, 2023 | Version v3
Conference paper Open

Polygraph: A Software Framework for the Systematic Assessment of Synthetic Regulatory DNA Elements

  • 1. Department of Biology Research | AI Development, Genentech Research and Early Development
  • 2. College of Computing, Georgia Institute of Technology

Description

The design of regulatory elements is pivotal in gene and cell therapy, where DNA sequences are engineered to drive elevated and cell-type-specific expression. However, systematic assessment of synthetic DNA sequences remains challenging without robust metrics and easy-to-use software. Here, we introduce Polygraph, a Python framework that evaluates synthetic DNA elements based on features such as diversity, motif and k-mer composition, similarity to endogenous sequences, and screening with predictive and foundational models. Polygraph is the first instrument for assessing synthetic regulatory sequences, enabling faster progress in therapeutic interventions and improving our understanding of gene regulatory mechanisms.

Code can be found at the following repository: github.com/Genentech/polygraph
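
To make the k-mer composition feature named in the description more concrete, the following is a minimal sketch in plain Python. It is not taken from the Polygraph repository and does not use its API. The function names (kmer_frequencies, euclidean_distance), the choice of k=2, and the toy sequences are assumptions made only for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=2):
    """Return normalized k-mer frequencies for one DNA sequence.

    Hypothetical helper for illustration; not part of Polygraph.
    """
    seq = seq.upper()
    # Enumerate all 4**k possible k-mers so every sequence shares one feature space.
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing characters other than A, C, G, T (e.g. N) are ignored.
    total = sum(counts[km] for km in all_kmers)
    return {km: (counts[km] / total if total else 0.0) for km in all_kmers}


def euclidean_distance(freqs_a, freqs_b):
    """Euclidean distance between two k-mer frequency profiles with the same keys."""
    return sum((freqs_a[km] - freqs_b[km]) ** 2 for km in freqs_a) ** 0.5


# Toy sequences, invented for this example.
synthetic = "ACGTACGTGG"
reference = "ACGTTTGACA"
distance = euclidean_distance(kmer_frequencies(synthetic), kmer_frequencies(reference))
print(f"2-mer profile distance: {distance:.3f}")
```

A profile of this kind places synthetic and endogenous sequences in the same fixed-length feature space, which is one simple way to compare their composition; the framework itself may compute and compare these features differently.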

Files

human_seqs.txt

Files (363.1 MB)

Checksum                                  Size
md5:b9d725df369863522c2f3721e7a7c616      61.8 MB
md5:f79ccc6d03ad131b23c5977743bc0e3c      288.4 MB
md5:7afe4fcc52265fc7d22c58d8e442b4d9      2.7 MB
md5:74bb6bb09a482b0dae70cac229f0edaf      137.0 kB
md5:2df42fb2f2771cc961f182cd40ec38d9      10.0 MB
md5:1075d7849c9effc8f86af8f8addb7035      27.6 kB
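
The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The sketch below uses only Python's standard hashlib module; the helper names (md5_of_file, matches_listed_md5) and the placeholder path are assumptions for illustration, and the listing does not show which checksum belongs to which file name.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:<hex digest>'."""
    # Drop the optional 'md5:' prefix used in the file listing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Example call (placeholder path; substitute the checksum shown for that file):
# matches_listed_md5("path/to/downloaded_file", "md5:<checksum from the listing>")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.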

Additional details

Related works

Is referenced by
https://github.com/Genentech/polygraph (URL)
Is supplement to
10.1101/2023.11.27.568764 (DOI)

Dates

Updated
2023-11-28