Randomly sample documents or features from a dfm object.

dfm_sample(x, size = ndoc(x), replace = FALSE, prob = NULL,
  margin = c("documents", "features"))

Arguments

x

the dfm object whose documents or features will be sampled

size

a positive number, the number of documents or features to select; defaults to ndoc(x), including when margin = "features"

replace

logical; should sampling be with replacement?

prob

a vector of probability weights for selecting the documents or features being sampled

margin

dimension of the dfm to sample: either "documents" (the default) or "features"
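The way these arguments interact can be sketched with base R alone. The snippet below is not quanteda's implementation: a small dense matrix stands in for a dfm, and sample_margin() is a hypothetical helper written only for illustration. It assumes the arguments are applied to row or column indices in the same way as in base R's sample().

```r
# A small dense matrix standing in for a dfm (rows = documents, columns = features).
m <- matrix(1:12, nrow = 3,
            dimnames = list(docs = c("d1", "d2", "d3"),
                            features = c("a", "b", "c", "d")))

# Hypothetical helper, not part of quanteda: sample rows or columns of a matrix.
sample_margin <- function(x, size = nrow(x), replace = FALSE, prob = NULL,
                          margin = c("documents", "features")) {
  margin <- match.arg(margin)
  # Number of units available along the chosen margin.
  n <- if (margin == "documents") nrow(x) else ncol(x)
  # Draw indices; replace and prob are passed straight to sample.int().
  idx <- sample.int(n, size = size, replace = replace, prob = prob)
  if (margin == "documents") x[idx, , drop = FALSE] else x[, idx, drop = FALSE]
}

set.seed(1)
by_doc  <- sample_margin(m)                                 # permutes the 3 rows
by_feat <- sample_margin(m, size = 2, margin = "features")  # keeps 2 of the 4 columns
```

Note that the default size in this sketch is the number of rows even when columns are sampled, mirroring the size = ndoc(x) default in the usage above.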

Value

A dfm object with number of documents or features equal to size, drawn from the dfm x.

See also

sample
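Because size, replace, and prob carry the same names as the arguments of base R's sample(), its behaviour is a useful guide to what to expect. The following sketch uses only sample() on a character vector of document names; whether dfm_sample handles each edge case identically is an assumption, not something this page states.

```r
docs <- c("1789-Washington", "1793-Washington", "1797-Adams")

set.seed(42)
# Without replacement, each name appears at most once.
no_rep <- sample(docs, size = 3, replace = FALSE)

# With replacement, size may exceed the population and names can repeat.
with_rep <- sample(docs, size = 5, replace = TRUE)

# Weights need not sum to one; a zero weight excludes an element.
weighted <- sample(docs, size = 2, replace = FALSE, prob = c(0, 1, 3))

# Asking for more elements than exist without replacement is an error.
too_many <- tryCatch(sample(docs, size = 5), error = function(e) "error")
```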

Examples

set.seed(10)
myDfm <- dfm(data_corpus_inaugural[1:10])
head(myDfm)
#> Document-feature matrix of: 10 documents, 3,366 features (78.7% sparse).
#> (showing first 6 documents and first 6 features)
#>                  features
#> docs              fellow-citizens  of the senate and house
#>   1789-Washington               1  71 116      1  48     2
#>   1793-Washington               0  11  13      0   2     0
#>   1797-Adams                    3 140 163      1 130     0
#>   1801-Jefferson                2 104 130      0  81     0
#>   1805-Jefferson                0 101 143      0  93     0
#>   1809-Madison                  1  69 104      0  43     0
head(dfm_sample(myDfm))
#> Document-feature matrix of: 10 documents, 3,366 features (78.7% sparse).
#> (showing first 6 documents and first 6 features)
#>                  features
#> docs              fellow-citizens  of the senate and house
#>   1809-Madison                  1  69 104      0  43     0
#>   1797-Adams                    3 140 163      1 130     0
#>   1801-Jefferson                2 104 130      0  81     0
#>   1805-Jefferson                0 101 143      0  93     0
#>   1789-Washington               1  71 116      1  48     2
#>   1793-Washington               0  11  13      0   2     0
head(dfm_sample(myDfm, replace = TRUE))
#> Document-feature matrix of: 10 documents, 3,366 features (80.9% sparse).
#> (showing first 6 documents and first 6 features)
#>                  features
#> docs              fellow-citizens  of the senate and house
#>   1813-Madison                  1  65 100      0  44     0
#>   1809-Madison                  1  69 104      0  43     0
#>   1793-Washington               0  11  13      0   2     0
#>   1809-Madison                  1  69 104      0  43     0
#>   1801-Jefferson                2 104 130      0  81     0
#>   1805-Jefferson                0 101 143      0  93     0
head(dfm_sample(myDfm, margin = "features"))
#> Document-feature matrix of: 10 documents, 10 features (81% sparse).
#> (showing first 6 documents and first 6 features)
#>                  features
#> docs              reciprocated savage accomplishing anxious representative vows
#>   1789-Washington            0      0             0       0              0    0
#>   1793-Washington            0      0             0       0              0    0
#>   1797-Adams                 0      0             0       0              0    0
#>   1801-Jefferson             0      0             0       1              1    0
#>   1805-Jefferson             0      0             0       0              0    0
#>   1809-Madison               0      1             0       0              0    0