Published April 16, 2025 | Version v1
Dataset Open

Relate example for quantifying positive selection

Description

This repository contains a self-contained example of running Relate on the CEU samples from the 1000 Genomes Project dataset. We include data spanning a few Mb around the LCT gene, construct Relate trees, and then compute selection p-values using the RelateSelection module.

The output is a p-value for every SNP in the dataset. These p-values are computed by first extracting the number of lineages remaining in the tree at the time the mutation was carried by two lineages. We then compute the probability of the mutation spreading to at least its present-day observed frequency. The p-value therefore quantifies the extent to which the mutation has outcompeted other lineages in the tree. We can also condition on other time points in the tree, such as when around half of the remaining lineages carry the focal mutation.
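To make the calculation concrete, here is a minimal Python sketch of one way such a p-value could be computed. It is not the RelateSelection implementation. It assumes a neutral coalescent, under which the descendant counts of the k lineages present at a given time are uniformly distributed over all ways of splitting the N present-day samples into k non-empty groups. The function names and the arguments `f_obs`, `k` and `n` are hypothetical and chosen for illustration only.

```python
from math import comb


def prob_frequency(f, k, n):
    """Probability that exactly f of n present-day samples carry the mutation,
    given that it was carried by 2 of the k lineages remaining at that time.

    Assumption (not taken from Relate's source code): neutral coalescent, so
    descendant counts of the k lineages are uniform over the compositions of
    n into k positive parts.
    """
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    if k == 2:
        # Both remaining lineages carry the mutation, so it must be fixed.
        return 1.0 if f == n else 0.0
    if f < 2 or f > n - k + 2:
        return 0.0
    # (f - 1) ways to split f carriers between the two carrier lineages,
    # comb(n - f - 1, k - 3) ways to split the rest among the other k - 2.
    return (f - 1) * comb(n - f - 1, k - 3) / comb(n - 1, k - 1)


def selection_pvalue(f_obs, k, n):
    """Probability of reaching at least the observed carrier count f_obs."""
    return sum(prob_frequency(f, k, n) for f in range(max(f_obs, 2), n - k + 3))


if __name__ == "__main__":
    # Toy numbers, purely illustrative: 3 lineages remaining, 4 samples today,
    # 3 of which carry the mutation. The neutral probability is 2/3.
    print(selection_pvalue(3, 3, 4))
```

A small p-value under this sketch means the two carrier lineages left far more descendants than expected by chance among the k lineages present at that time. Conditioning on a different time point would change the starting number of carrier lineages, which this sketch fixes at two.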

Running the scripts

Please download Relate binaries from https://myersgroup.github.io/relate/ and then run the script run.sh included in this repository.

Files

plot_manh.pdf

Files (87.7 MB)

Size Checksum
2.0 MB md5:8610255bd14febd788cef7f5c2d22499
85.7 MB md5:1a7116b701a2243025b48552606dfbc1
