These functions provide quanteda methods for spacyr objects, and also extend spacy_parse and spacy_tokenize to work directly with corpus objects.
# S3 method for spacyr_parsed
docnames(x)

# S3 method for spacyr_parsed
ndoc(x)

# S3 method for spacyr_parsed
ntoken(x, ...)

# S3 method for spacyr_parsed
ntype(x, ...)

# S3 method for spacyr_parsed
nsentence(x, ...)
Argument | Description |
---|---|
x | a spacyr_parsed object; for spacy_parse() and spacy_tokenize(), a quanteda corpus object |
... | not used for these functions |
spacy_parse(x, ...) and spacy_tokenize(x, ...) work directly on quanteda corpus objects.
docnames() returns the document names

ndoc() returns the number of documents

ntoken() returns the number of tokens by document

ntype() returns the number of types (unique tokens) by document

nsentence() returns the number of sentences by document
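
To make these return values concrete, the following sketch reproduces the same counts in base R on a small hand-built data frame. It is only an illustration under stated assumptions: the data frame is a hypothetical stand-in for parsed output (one row per token, with doc_id, sentence_id and token columns), the values are invented, and the actual methods may differ in details such as case handling.

```r
# Hypothetical stand-in for parsed output: one row per token.
# The column layout is an assumption; the values are hand-written, not real spaCy output.
parsed <- data.frame(
  doc_id      = c(rep("doc1", 6), rep("doc2", 4)),
  sentence_id = c(1, 1, 1, 1, 2, 2, 1, 1, 1, 1),
  token       = c("now", ",", "now", ".", "Go", ".",
                  "Jack", "and", "Jill", "."),
  stringsAsFactors = FALSE
)

# Document names and number of documents
doc_names <- unique(parsed$doc_id)
n_docs <- length(doc_names)

# Tokens per document: count the rows belonging to each doc_id
n_tokens <- tapply(parsed$token, parsed$doc_id, length)

# Types per document: count the distinct token strings in each document
n_types <- tapply(parsed$token, parsed$doc_id,
                  function(tok) length(unique(tok)))

# Sentences per document: count the distinct sentence ids in each document
n_sentences <- tapply(parsed$sentence_id, parsed$doc_id,
                      function(id) length(unique(id)))

print(n_tokens)
print(n_types)
print(n_sentences)
```

Each result is a named vector with one entry per document, which mirrors the "by document" wording used above.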
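if (FALSE) {
# Not run: spacy_initialize() needs a working spaCy installation
library("quanteda")  # provides corpus() and the generics used below
library("spacyr")
spacy_initialize()

corp <- corpus(c(doc1 = "And now, now, now for something completely different.",
                 doc2 = "Jack and Jill are children."))

# tokenize and parse the corpus directly
spacy_tokenize(corp)
(parsed <- spacy_parse(corp))

# summary methods for the parsed object
ntype(parsed)
ntoken(parsed)
ndoc(parsed)
docnames(parsed)
}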