Published January 8, 2026 | Version v1
Preprint | Open access

RVAT: a unified framework to discover & interpret rare variant associations in large DNA sequencing datasets

  • 1. Department of Translational Neuroscience, UMC Utrecht Brain Center, University Medical Center Utrecht, 3584 CG Utrecht, the Netherlands.
  • 2. Department of Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht, 3584 CX Utrecht, the Netherlands.
  • 3. Genetic Epidemiology, Department of Psychiatry, Amsterdam UMC, location University of Amsterdam, Amsterdam, the Netherlands.

Description

The proliferation of whole-genome sequencing has transformed our ability to study how rare variants contribute to health and disease. This creates new opportunities to map disease-modifying genes, resolve variants of unknown significance, and discover the aggregate effects of hidden rare variant associations on biological pathways and cell types. As these datasets grow, so does the need for accessible, user-friendly data infrastructures and software tools that efficiently store, query, analyze and interpret them. We developed RVAT (Rare Variant Association Toolkit) as a one-stop solution that addresses these needs and performs a comprehensive, customizable range of rare variant analyses and visualizations. RVAT is embedded in the Bioconductor ecosystem and uses a compressed out-of-memory data structure based on SQLite to integrate large sequencing datasets efficiently with variant and sample annotations. The file format is complemented by object types and functions that support single-variant, gene-level, gene-partitioning and gene-set analyses through both R and command-line interfaces. We demonstrate how RVAT bridges the gap between the discovery and interpretation of rare variant associations in case studies in which we recover mutation hotspots linked to amyotrophic lateral sclerosis (ALS) and reveal biologically relevant gene sets and cell types associated with health-related traits in UK Biobank sequencing data.
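To make the storage idea concrete, the following is a minimal sketch of how compressed genotype vectors could sit next to variant annotations in a single SQLite database and be queried one gene at a time. It is written in Python for illustration only and is not RVAT's actual schema or API: the table names (`geno`, `var_anno`), the function names (`build_store`, `gene_burden`), the zlib compression and the toy data are all assumptions introduced here.

```python
import sqlite3
import zlib


def build_store(conn, genotypes, annotations):
    """Store one compressed dosage vector per variant, plus variant annotations.

    Hypothetical layout: `geno` holds zlib-compressed bytes (one 0/1/2 dosage
    per sample) and `var_anno` maps each variant to a gene.
    """
    conn.execute("CREATE TABLE geno (var_id TEXT PRIMARY KEY, dosages BLOB)")
    conn.execute("CREATE TABLE var_anno (var_id TEXT, gene TEXT)")
    conn.executemany(
        "INSERT INTO geno VALUES (?, ?)",
        [(var_id, zlib.compress(bytes(d))) for var_id, d in genotypes.items()],
    )
    conn.executemany("INSERT INTO var_anno VALUES (?, ?)", annotations)
    conn.commit()


def gene_burden(conn, gene, n_samples):
    """Sum dosages per sample over all variants annotated to `gene`.

    Only the variants selected by the SQL join are read and decompressed,
    so the full genotype matrix never has to be held in memory.
    """
    burden = [0] * n_samples
    rows = conn.execute(
        "SELECT g.dosages FROM geno AS g "
        "JOIN var_anno AS a ON a.var_id = g.var_id "
        "WHERE a.gene = ?",
        (gene,),
    )
    for (blob,) in rows:
        for i, dosage in enumerate(zlib.decompress(blob)):
            burden[i] += dosage
    return burden


# Toy data: 3 variants x 4 samples. An in-memory database keeps the demo
# self-contained; a file path would be used for a real on-disk store.
conn = sqlite3.connect(":memory:")
genotypes = {"v1": [0, 1, 0, 0], "v2": [0, 0, 2, 0], "v3": [1, 0, 0, 0]}
annotations = [("v1", "GENE_A"), ("v2", "GENE_A"), ("v3", "GENE_B")]
build_store(conn, genotypes, annotations)
print(gene_burden(conn, "GENE_A", 4))  # [0, 1, 2, 0]
```

The per-sample sums returned by `gene_burden` correspond to a simple burden score, one of several ways to aggregate rare variants within a gene; the same join could filter on any other annotation column to define a different variant set.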

Files

ms_rvat_122025.pdf

Files (3.7 MB)

Name: ms_rvat_122025.pdf
Size: 3.7 MB
Checksum: md5:875d4f2b8ede76d0a1c31c79072ec5bf

Additional details

Software