Published September 2, 2020 | Version v1
Dataset Open

The influence of human genetic variation on Epstein-Barr virus sequence diversity

Authors/Creators

Description

This project is the first attempt to apply a "genome-to-genome" approach to investigate the impact of host genetic pressure on the genome of a member of the Herpesviridae family. Specifically, 285 pairs of human and EBV genomes were sequenced, and multiple GWASes were performed between human and EBV variants. This repository contains the results of the downstream analysis on the pathogen side. Variant calls were produced from BWA mem read alignments using GATK HC, SNVer, VarScan2, BCFtools and freebayes; an additional call set is the intersection of the variant sets from the first three callers.
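As an illustration of the intersection step, the sketch below keeps only the variants reported by every call set. It is a minimal example under stated assumptions, not the pipeline's actual implementation: the function names `parse_vcf_lines` and `intersect_callsets`, the choice of (chromosome, position, REF, ALT) as the matching key, and the toy records are all hypothetical.

```python
from typing import Iterable, Set, Tuple

# Assumption: two calls are "the same variant" when chromosome, position,
# reference allele and alternate allele all match. The real pipeline may
# normalise or match variants differently.
Variant = Tuple[str, int, str, str]


def parse_vcf_lines(lines: Iterable[str]) -> Set[Variant]:
    """Collect (chrom, pos, ref, alt) keys from uncompressed VCF text lines."""
    variants: Set[Variant] = set()
    for line in lines:
        # Skip blank lines and VCF header lines ("##meta" and "#CHROM").
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # Multi-allelic records list several ALT alleles separated by commas.
        for alt in alts.split(","):
            variants.add((chrom, pos, ref, alt))
    return variants


def intersect_callsets(*callsets: Set[Variant]) -> Set[Variant]:
    """Return the variants present in every call set (empty if none given)."""
    if not callsets:
        return set()
    return set.intersection(*callsets)


if __name__ == "__main__":
    # Toy records only; these are not taken from the dataset.
    caller_a = parse_vcf_lines([
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
        "toy\t100\t.\tA\tG\t.\t.\t.\n",
        "toy\t200\t.\tC\tT\t.\t.\t.\n",
    ])
    caller_b = {("toy", 100, "A", "G"), ("toy", 300, "G", "A")}
    caller_c = {("toy", 100, "A", "G"), ("toy", 200, "C", "T")}
    print(sorted(intersect_callsets(caller_a, caller_b, caller_c)))
```

Matching on the exact allele pair is the strictest simple choice; a looser key such as position alone would retain more sites but could merge different alleles.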

  • The "covstats" files contain statistics about the read alignment.
  • The tarball SHCS_EBV_variant_call.tar.gz contains all compressed VCF files.
  • The tarball SHCS_EBV_variant_matrices_stats.tar.gz contains all matrices used as traits in the GWASes, as well as a variety of statistics.
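To see what a downloaded tarball contains before extracting it, something like the following sketch may help. The internal layout of the archives is not described here, so the helper `list_vcf_members` (a hypothetical name) simply assumes that VCF files end in `.vcf` or `.vcf.gz` and lists whatever matches.

```python
import tarfile
from typing import List


def list_vcf_members(tar_path: str) -> List[str]:
    """Return the sorted names of VCF-like files inside a gzipped tarball.

    Assumption: VCF files are recognisable by a ".vcf" or ".vcf.gz" suffix.
    """
    with tarfile.open(tar_path, "r:gz") as tar:
        return sorted(
            member.name
            for member in tar.getmembers()
            if member.isfile() and member.name.endswith((".vcf", ".vcf.gz"))
        )
```

For example, `list_vcf_members("SHCS_EBV_variant_call.tar.gz")` would list the compressed VCF files once the tarball has been downloaded to the working directory.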

The pipeline used to generate this data is publicly available here: https://gitlab.com/ezlab/vir_var_calling/

Files

Files (151.6 MB)

Checksum                                Size
md5:067e1daf8b7ba3b45f1ce0e771095653    22.6 kB
md5:e5321fe784d841c70a6ff9e3861e3eeb    22.6 kB
md5:315af3b134fb42af5fe397a7236288b1    130.7 MB
md5:2a4f2a2b79e88c7fc155010998aaaf23    20.9 MB
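
The md5 checksums listed above can be used to confirm that a download is intact. The sketch below is one way to do that with the Python standard library; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and which local file corresponds to which checksum has to be determined by the user.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the hexadecimal md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large tarballs are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, listed: str) -> bool:
    """Compare a file's md5 with a listed value such as "md5:067e...".

    The optional "md5:" prefix is stripped and case is ignored.
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as `matches_checksum(local_path, "md5:315af3b134fb42af5fe397a7236288b1")` returns `True` only if the local file has that digest.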