Published October 19, 2021 | Version v0.4.0
Software Open

adbar/simplemma: simplemma-0.4.0

Authors/Creators

  • 1. Berlin-Brg. Academy of Sciences (BBAW)

Description

  • new languages: Armenian, Greek, Macedonian, Norwegian (Bokmål), and Polish
  • language data reviewed for: Dutch, Finnish, German, Hungarian, Latin, Russian, and Swedish
  • Urdu removed from the language list due to issues with the data
  • added support for Python 3.10 and dropped support for Python 3.4
  • improved decomposition and tokenization algorithms

Files

adbar/simplemma-v0.4.0.zip (63.7 MB)
md5:1f6c9c5cbf6389ce8b8c27a702c8063d

Additional details

Related works