Published November 21, 2019 | Version v2.2.3
Software
Open
explosion/spaCy: v2.2.3: Tokenizer.explain, Korean base support, dependency scores per label and bug fixes
Creators
- Ines Montani1
- Matthew Honnibal1
- Sofie Van Landeghem2
- Henning Peters
- Maxim Samsonov
- adrianeboyd
- Jim Geovedi
- Jim Regan
- György Orosz3
- Paul O'Leary McCann
- Søren Lind Kristiansen
- Duygu Altinok4
- Roman5
- Grégory Howard
- Wannaphong Phatthiyaphaibun6
- Sam Bozek
- Explosion Bot7
- Björn Böing
- Mark Amery
- Leif Uwe Vogelsang
- Pradeep Kumar Tippa
- jeannefukumaru
- GregDubbin
- Vadim Mazaev
- Ramanan Balakrishnan8
- Jens Dahl Møllerhøj9
- wbwseeker
- Magnus Burton
- Avadh Patel10
- 1. Founder @explosion
- 2. OxyKodit
- 3. LogMeIn, Meltwater
- 4. German Autolabs
- 5. @kouchtv
- 6. @PyThaiNLP
- 7. @explosion
- 8. @Semantics3
- 9. mollerhoj
- 10. SUNY Binghamton - Computer Science
Description
✨ New features and improvements
- NEW: Tokenizer.explain method to see which rule or pattern was matched.
    tok_exp = nlp.tokenizer.explain("(don't)")
    assert [t[0] for t in tok_exp] == ["PREFIX", "SPECIAL-1", "SPECIAL-2", "SUFFIX"]
    assert [t[1] for t in tok_exp] == ["(", "do", "n't", ")"]
- NEW: Official Python 3.8 wheels for spaCy and its dependencies.
- Base language support for Korean.
- Add Scorer.las_per_type (labelled dependency scores per label).
- Rework Chinese language initialization and tokenization.
- Improve language data for Luxembourgish.
- Fix issue #4573, #4645: Improve tokenizer usage docs.
- Fix issue #4575: Add error in debug-data if no dev docs are available.
- Fix issue #4582: Make as_tuples=True in Language.pipe work with multiprocessing.
- Fix issue #4590: Correctly call on_match in DependencyMatcher.
- Fix issue #4593: Build wheels for Python 3.8.
- Fix issue #4604: Fix realloc in Retokenizer.split.
- Fix issue #4656: Fix conllu2json converter when -n > 1.
- Fix issue #4662: Fix Language.evaluate for components without .pipe method.
- Fix issue #4670: Ensure EntityRuler is deserialized correctly from disk.
- Fix issue #4680: Raise error if non-string labels are added to Tagger or TextCategorizer.
- Fix issue #4691: Make Vectors.find return keys in correct order.
- Fix various typos and inconsistencies.
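To illustrate what "labelled dependency scores per label" means, here is a minimal pure-Python sketch that computes precision, recall and F-score separately for each dependency label. It is not spaCy's implementation: the function name `las_per_label`, the `(child_index, head_index, label)` arc format and the toy data are assumptions made for this example only.

```python
from collections import defaultdict


def las_per_label(gold_arcs, pred_arcs):
    """Return {label: {"p": ..., "r": ..., "f": ...}} as percentages.

    Each arc is a hypothetical (child_index, head_index, label) tuple.
    A predicted arc counts as correct only if child, head and label
    all match a gold arc.
    """
    gold_by_label = defaultdict(set)
    pred_by_label = defaultdict(set)
    for arc in gold_arcs:
        gold_by_label[arc[2]].add(arc)
    for arc in pred_arcs:
        pred_by_label[arc[2]].add(arc)

    scores = {}
    for label in sorted(set(gold_by_label) | set(pred_by_label)):
        gold, pred = gold_by_label[label], pred_by_label[label]
        true_pos = len(gold & pred)
        p = true_pos / len(pred) if pred else 0.0
        r = true_pos / len(gold) if gold else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        scores[label] = {"p": 100 * p, "r": 100 * r, "f": 100 * f}
    return scores


# Toy parse of a four-token sentence; the last arc is mislabelled.
gold = [(0, 1, "nsubj"), (1, 1, "ROOT"), (2, 3, "det"), (3, 1, "dobj")]
pred = [(0, 1, "nsubj"), (1, 1, "ROOT"), (2, 3, "det"), (3, 1, "nsubj")]
scores = las_per_label(gold, pred)
for label, score in scores.items():
    print(label, score)
```

Breaking the score down by label shows where the parser goes wrong: in the toy data, the single mislabelled arc lowers precision for nsubj and recall for dobj, while det and ROOT are unaffected.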
Thanks to @yash1994, @walterhenry, @prilopes, @f11r, @questoph, @erip, @richardpaulhudson and @GuiGel for the pull requests and contributions.
Files
Name | Size |
---|---|
explosion/spaCy-v2.2.3.zip (md5:4a60c10af9150ff3a0da33eb863fe69f) | 5.8 MB |
Additional details
Related works
- Is supplement to
- https://github.com/explosion/spaCy/tree/v2.2.3 (URL)