Published December 7, 2021 | Version v3.2.1
Software
Open
explosion/spaCy: v3.2.1: doc_cleaner component, new Matcher attributes, bug fixes and more
Creators
- Ines Montani¹
- Matthew Honnibal¹
- Sofie Van Landeghem²
- Adriane Boyd
- Henning Peters
- Paul O'Leary McCann³
- Maxim Samsonov
- Jim Geovedi
- Jim O'Regan
- György Orosz⁴
- Duygu Altinok⁵
- Søren Lind Kristiansen
- Roman⁶
- Explosion Bot⁵
- Leander Fiedler⁷
- Grégory Howard
- Wannaphong Phatthiyaphaibun⁸
- Yohei Tamura
- Sam Bozek
- murat
- Mark Amery
- Björn Böing⁹
- Pradeep Kumar Tippa
- Leif Uwe Vogelsang
- Bram Vanroy¹⁰
- Ramanan Balakrishnan¹¹
- Vadim Mazaev
- GregDubbin

1. Founder @explosion
2. Explosion & OxyKodit
3. Cotonoha
4. LogMeIn, Meltwater
5. @explosion
6. @kouchtv
7. Nord/LB
8. @PyThaiNLP
9. @codecentric
10. @UGent
11. @Semantics3
Description
✨ New features and improvements
- NEW: `doc_cleaner` component for removing `doc.tensor`, `doc._.trf_data` or other `Doc` attributes at the end of the pipeline to reduce the size of output docs.
- NEW: `ENT_ID` and `ENT_KB_ID` added to the `Matcher` pattern attributes.
- Support `kb_id` for entities in displaCy from `Doc` input.
- Add `Span.sents` property for spans that cover more than one sentence.
- Add `EntityRuler.remove` to remove patterns by `id`.
- Make the `Tagger` `neg_prefix` configurable.
- Use `Language.pipe` in `Language.evaluate` for more efficient processing.
- Test suite updates: move regression tests into core test modules with pytest markers for issue numbers, extend tests for languages with alpha support.
🔴 Bug fixes
- Fix issue #9638: Make `JsonlCorpus` path optional again.
- Fix issue #9654: Fix `spancat` for empty docs and zero suggestions.
- Fix issue #9658: Improve error message for incorrect `.jsonl` paths in `EntityRuler`.
- Fix issue #9674: Fix language-specific factory handling in package CLI.
- Fix issue #9694: Convert labels to strings for README in package CLI.
- Fix issue #9697: Exclude strings from source vector checks.
- Fix issue #9701: Allow `Scorer.score_spans` to handle predicted docs with missing annotation.
- Fix issue #9722: Initialize `parser` from reference parse rather than aligned example.
- Fix issue #9764: Set annotations more efficiently in `tagger` and `morphologizer`.
📖 Documentation and examples
- Various documentation updates: `init_tok2vec` after pretraining, batch contract for listeners.
- New additions to the spaCy universe:
  - `eng-spacysentiment`: Sentiment analysis for English.
  - Applied Language Technology course: NLP for newcomers using spaCy and Stanza.
👥 Contributors
@adrianeboyd, @danieldk, @DuyguA, @honnibal, @ines, @ljvmiranda921, @narayanacharya6, @nrodnova, @Pantalaymon, @polm, @richardpaulhudson, @svlandeg, @thiippal, @Vishnunkumar
Files
| Name | Size |
|---|---|
| explosion/spaCy-v3.2.1.zip (md5:88d0df7c527b86ca4a3f5af3501adf2e) | 10.8 MB |
Additional details
Related works
- Is supplement to: https://github.com/explosion/spaCy/tree/v3.2.1 (URL)