Projection test for query interval overlap.
bed_projection(x, y, genome, by_chrom = FALSE)
Argument | Description |
---|---|
x | |
y | |
genome | |
by_chrom | compute test per chromosome |
A tbl_interval() with the following columns:
chrom
the name of the chromosome tested if by_chrom = TRUE, otherwise whole_genome
p.value
p-value from a binomial test. p-values > 0.5 are converted to 1 - p-value and lower_tail is FALSE.
obs_exp_ratio
ratio of observed to expected overlap frequency
lower_tail
TRUE indicates that the observed overlaps are in the lower tail of the distribution (i.e., less overlap than expected). FALSE indicates that the observed overlaps are in the upper tail of the distribution (i.e., more overlap than expected).
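The relationship between the three statistic columns can be illustrated with a minimal base R sketch. It is not the valr implementation: the helper name fold_projection, the input counts, and the null probability are all hypothetical, and the sketch assumes the binomial test is a cumulative probability from stats::pbinom(). It returns lower_tail as a logical, whereas the function's output shows it as a character column.

```r
# Hypothetical counts, chosen only for illustration (not valr output).
n_query <- 1000  # number of query intervals tested
n_hits  <- 130   # query intervals that overlap the reference
p_null  <- 0.10  # assumed chance of an overlap under the null

# `fold_projection` is a hypothetical helper; it is not part of valr.
fold_projection <- function(n_hits, n_query, p_null) {
  # Cumulative binomial probability of observing n_hits or fewer overlaps.
  p_raw <- pbinom(n_hits, size = n_query, prob = p_null)

  # p-values above 0.5 are reported from the upper tail as 1 - p.
  lower_tail <- p_raw <= 0.5
  p_value <- if (lower_tail) p_raw else 1 - p_raw

  # Observed overlap frequency divided by the expected frequency.
  obs_exp_ratio <- (n_hits / n_query) / p_null

  data.frame(
    p.value = p_value,
    obs_exp_ratio = obs_exp_ratio,
    lower_tail = lower_tail
  )
}

res <- fold_projection(n_hits, n_query, p_null)
print(res)
```

With these made-up inputs the observed overlap exceeds the expectation, so the result falls in the upper tail; passing a count below the expectation (for example 70) flips lower_tail to TRUE.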
Interval statistics can be used in combination with dplyr::group_by() and dplyr::do() to calculate statistics for subsets of data. See vignette('interval-stats') for examples.
http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002529
Other interval statistics: bed_absdist, bed_fisher, bed_jaccard, bed_reldist
genome <- read_genome(valr_example('hg19.chrom.sizes.gz'))
x <- bed_random(genome, seed = 1010486)
y <- bed_random(genome, seed = 9203911)

bed_projection(x, y, genome)
#> # A tibble: 1 x 4
#>   chrom        p.value obs_exp_ratio lower_tail
#>   <chr>          <dbl>         <dbl> <chr>
#> 1 whole_genome  0.0714          1.00 FALSE

bed_projection(x, y, genome, by_chrom = TRUE)
#> # A tibble: 25 x 4
#>    chrom p.value obs_exp_ratio lower_tail
#>    <chr>   <dbl>         <dbl> <chr>
#>  1 chr1   0.302          1.00  FALSE
#>  2 chr10  0.305          1.00  FALSE
#>  3 chr11  0.306          0.996 TRUE
#>  4 chr12  0.0314         1.01  FALSE
#>  5 chr13  0.449          1.00  FALSE
#>  6 chr14  0.125          1.01  FALSE
#>  7 chr15  0.317          1.00  FALSE
#>  8 chr16  0.248          1.01  FALSE
#>  9 chr17  0.381          1.00  FALSE
#> 10 chr18  0.190          0.991 TRUE
#> # ... with 15 more rows