The dfm class of object is a type of Matrix-class object with
additional slots, described below. quanteda uses two subclasses of the
dfm class, depending on whether the object can be represented by a
sparse matrix, in which case it is a dfm class object, or if dense,
then a dfmDense object. See Details.
    # S4 method for dfm
    t(x)

    # S4 method for dfm
    colSums(x, na.rm = FALSE, dims = 1, ...)

    # S4 method for dfm
    rowSums(x, na.rm = FALSE, dims = 1, ...)

    # S4 method for dfm
    colMeans(x, na.rm = FALSE, dims = 1, ...)

    # S4 method for dfm
    rowMeans(x, na.rm = FALSE, dims = 1, ...)

    # S4 method for dfm,numeric
    Arith(e1, e2)

    # S4 method for numeric,dfm
    Arith(e1, e2)

    # S4 method for dfm,index,index,missing
    [(x, i, j, ..., drop = TRUE)

    # S4 method for dfm,index,index,logical
    [(x, i, j, ..., drop = TRUE)

    # S4 method for dfm,missing,missing,missing
    [(x, i, j, ..., drop = TRUE)

    # S4 method for dfm,missing,missing,logical
    [(x, i, j, ..., drop = TRUE)

    # S4 method for dfm,index,missing,missing
    [(x, i, j, ..., drop = TRUE)

    # S4 method for dfm,index,missing,logical
    [(x, i, j, ..., drop = TRUE)

    # S4 method for dfm,missing,index,missing
    [(x, i, j, ..., drop = TRUE)

    # S4 method for dfm,missing,index,logical
    [(x, i, j, ..., drop = TRUE)

| Argument | Description |
|---|---|
| x | the dfm object |
| na.rm | if `TRUE`, omit missing values (including `NaN`) from the calculations |
| dims | ignored |
| ... | additional arguments not used here |
| e1 | first quantity in "+" operation for dfm |
| e2 | second quantity in "+" operation for dfm |
| i | index for documents |
| j | index for features |
| drop | always set to `FALSE` |
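
As a rough illustration of the methods and arguments listed above, the sketch below builds a tiny dfm and applies the marginal sums, the transpose, and an arithmetic operation. It assumes the quanteda package is installed; the two toy documents and the object names `toks`, `x`, and `x2` are made up for this example.

```r
# Assumes the quanteda package is installed.
library(quanteda)

# Two toy documents (hypothetical data, for illustration only)
toks <- tokens(c("a b b c", "b c c d"))
x <- dfm(toks)

colSums(x)   # total count of each feature across all documents
rowSums(x)   # total number of feature occurrences in each document
dim(t(x))    # the transpose has features as rows and documents as columns

# Arith methods combine a dfm with a numeric value
x2 <- x * 2

# Subsetting by document index (i) and feature index (j)
x[1, 1:2]
```

Here `i` selects documents and `j` selects features, matching the argument table above.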
The dfm class is a virtual class that will contain
dgCMatrix-class.

settings: settings that govern corpus handling and subsequent downstream
operations, including the settings used to clean and tokenize the texts,
and to create the dfm. See settings.

weighting: the feature weighting applied to the dfm. Default is
"frequency", indicating that the values in the cells of the dfm are
simple feature counts. To change this, use the dfm_weight
method.

smooth: a smoothing parameter, defaults to zero. Can be changed using
the dfm_smooth method.

Dimnames: These are inherited from Matrix-class but are
named docs and features respectively.

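
The weighting and smooth slots are changed through dfm_weight and dfm_smooth rather than by editing the object directly. The following is a minimal sketch of that workflow. It assumes a quanteda version in which dfm_weight accepts a `scheme` argument and dfm_smooth accepts a `smoothing` argument; the toy documents and the names `x_prop` and `x_smooth` are hypothetical.

```r
# Assumes the quanteda package is installed.
library(quanteda)

# Two toy documents (hypothetical data, for illustration only)
x <- dfm(tokens(c("a b b c", "b c c d")))

# Replace raw counts with within-document proportions
# (assumes dfm_weight() accepts scheme = "prop" in the installed version)
x_prop <- dfm_weight(x, scheme = "prop")
rowSums(x_prop)

# Add a constant to every cell, including cells that were zero
x_smooth <- dfm_smooth(x, smoothing = 1)
sum(x_smooth) - sum(x)
```

Note that smoothing fills in the zero cells, so the result is no longer sparse in practice.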
    # dfm subsetting
    x <- dfm(tokens(c("this contains lots of stopwords",
                      "no if, and, or but about it: lots",
                      "and a third document is it"),
                    remove_punct = TRUE))
    x[1:2, ]
    #> Document-feature matrix of: 2 documents, 16 features (59.4% sparse).
    #> 2 x 16 sparse Matrix of class "dfm"
    #>        features
    #> docs    this contains lots of stopwords no if and or but about it a third
    #>   text1    1        1    1  1         1  0  0   0  0   0     0  0 0     0
    #>   text2    0        0    1  0         0  1  1   1  1   1     1  1 0     0
    #>        features
    #> docs    document is
    #>   text1        0  0
    #>   text2        0  0
    x[1:2, 1:5]
    #> Document-feature matrix of: 2 documents, 5 features (40.0% sparse).
    #> 2 x 5 sparse Matrix of class "dfm"
    #>        features
    #> docs    this contains lots of stopwords
    #>   text1    1        1    1  1         1
    #>   text2    0        0    1  0         0