Cutoffs dividing the range of a distribution into continuous intervals with equal probabilities.
You can take a sample of numbers and divide them into N equally sized groups. Let’s use these 12 numbers as an example:
dat <- data.frame(
  x = c(1, 1, 2, 2, 3, 4, 4, 5, 7, 7, 7, 10)
)
The `quantile()` function gives you the cutoffs for each quantile from the data. Set the argument `probs` to `seq(0, 1, 1/N)` for any N-tile.
The function `dplyr::ntile()` tells you which quantile each number is in for any N-tile.
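The `probs` pattern works for any N, so it can be wrapped in a small helper. The sketch below is only an illustration: `ntile_cutoffs()` is a hypothetical name, not a function from base R or dplyr, and it uses `seq(0, 1, length.out = n + 1)`, which should give the same probabilities as `seq(0, 1, 1/n)` while fixing the number of cutoffs at N + 1.

```r
x <- c(1, 1, 2, 2, 3, 4, 4, 5, 7, 7, 7, 10)

# Hypothetical helper (not part of base R or dplyr): cutoffs for any N-tile
ntile_cutoffs <- function(x, n) {
  # n groups need n + 1 evenly spaced probabilities from 0 to 1
  quantile(x, probs = seq(0, 1, length.out = n + 1))
}

# quintile
ntile_cutoffs(x, 5)
## 0% 20% 40% 60% 80% 100%
## 1.0 2.0 3.4 4.6 7.0 10.0
```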
# median
quantile(dat$x, probs = seq(0, 1, 1/2))
## 0% 50% 100%
## 1 4 10
dat$`2-tile` <- dplyr::ntile(dat$x, 2)
# tertile
quantile(dat$x, probs = seq(0, 1, 1/3))
## 0% 33.33333% 66.66667% 100%
## 1.000000 2.666667 5.666667 10.000000
dat$`3-tile` <- dplyr::ntile(dat$x, 3)
# quartile
quantile(dat$x, probs = seq(0, 1, 1/4))
## 0% 25% 50% 75% 100%
## 1 2 4 7 10
dat$`4-tile` <- dplyr::ntile(dat$x, 4)
x | 2-tile | 3-tile | 4-tile |
---|---|---|---|
1 | 1 | 1 | 1 |
1 | 1 | 1 | 1 |
2 | 1 | 1 | 1 |
2 | 1 | 1 | 2 |
3 | 1 | 2 | 2 |
4 | 1 | 2 | 2 |
4 | 2 | 2 | 3 |
5 | 2 | 2 | 3 |
7 | 2 | 3 | 3 |
7 | 2 | 3 | 4 |
7 | 2 | 3 | 4 |
10 | 2 | 3 | 4 |
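In the table, tied values can land in different groups: the two 4s are split across the 2-tile groups, and the two 2s across the 4-tile groups. That is because the groups are filled by rank so that they have equal sizes. Binning by the `quantile()` cutoffs instead keeps ties together but gives unequal group sizes. The sketch below contrasts the two using base R only. `rank_group` is a simple rank-based stand-in that reproduces the 4-tile column for this sample; it is an assumption for illustration, not dplyr's implementation.

```r
x <- c(1, 1, 2, 2, 3, 4, 4, 5, 7, 7, 7, 10)
n <- 4

# Rank-based grouping: order the values, then fill n equal-sized groups in turn
# (illustrative stand-in for the 4-tile column; assumes length(x) is divisible by n)
rank_group <- ceiling(rank(x, ties.method = "first") * n / length(x))

# Cutoff-based grouping: bin each value by the quartile cutoffs
breaks <- quantile(x, probs = seq(0, 1, length.out = n + 1))
cutoff_group <- cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)

data.frame(x, rank_group, cutoff_group)

# group sizes: equal for ranks, unequal for cutoffs
table(rank_group)
table(cutoff_group)
```

Which approach fits depends on whether equal group sizes or consistent treatment of tied values matters more for the analysis.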
See Q-Q plots.