Construct a compressed version of a corpus.
corpuszip(x, docnames = NULL, docvars = NULL, text_field = "text", metacorpus = NULL, ...)
text_field
for data.frame input, the field indicating the variable to be read in as text, which must be a character vector.
All other variables in the data.frame will be imported as docvars. This argument
is only used for data.frame objects (including those created by readtext).
metacorpus
corpus-level metadata for the corpus. The fields recognized by summary.corpus
are:
source
a description of the source of the texts, used for
referencing;
citation
information on how to cite the corpus; and
notes
any additional information about who created the text, warnings,
to-do lists, etc.
# create a compressed corpus from texts
corpuszip(data_char_inaugural)
#> Corpus consisting of NULL document (compressed 65.8%).

# create a compressed corpus from texts and assign meta-data and document variables
cop <- corpus(data_char_ukimmig2010,
              docvars = data.frame(party = names(data_char_ukimmig2010)))
cop_zip <- corpuszip(data_char_ukimmig2010,
                     docvars = data.frame(party = names(data_char_ukimmig2010)))
object.size(cop)
#> 45136 bytes
object.size(cop_zip)
#> 21424 bytes
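The size difference reported by object.size() above comes from storing the texts in compressed form. The following is a minimal base R sketch of that general idea, using memCompress() and memDecompress(). It is not taken from the corpuszip source and may differ from how the package stores its data. The texts vector, the doc_sep separator, and all variable names are hypothetical and chosen only for illustration.

```r
# Hypothetical stand-in texts (not the quanteda data objects used above).
texts <- c(
  doc1 = paste(rep("We the people of the United States.", 200), collapse = " "),
  doc2 = paste(rep("Immigration policy is a recurring theme.", 200), collapse = " ")
)

# Assumed separator: any string that never occurs inside the texts.
doc_sep <- "\n###DOC###\n"

# Join the documents into one string and convert it to raw bytes.
raw_bytes <- charToRaw(paste(texts, collapse = doc_sep))

# Compress the bytes in memory with gzip.
compressed <- memCompress(raw_bytes, type = "gzip")

# Share of bytes saved by compression, as a percentage.
pct_saved <- 100 * (1 - length(compressed) / length(raw_bytes))

# Decompress and split back into the original documents.
restored <- strsplit(
  rawToChar(memDecompress(compressed, type = "gzip")),
  doc_sep,
  fixed = TRUE
)[[1]]
names(restored) <- names(texts)

# The round trip should reproduce the input exactly.
identical(restored, texts)
```

The saving depends heavily on how repetitive the texts are, so the percentage for a real corpus will differ from what this artificial input produces. The trade-off is that the texts must be decompressed before they can be read or processed.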