LanguageMachines/ticcltools: v0.6

Ko van der Sloot; Maarten van Gompel

doi:10.5281/zenodo.1283145

Published June 5, 2018 | Version v0.6

Software Open

1. Radboud University
2. Centre of Language and Speech Technology, Radboud University Nijmegen

Intermediate release with a lot of new code to handle n-grams. A lot of refactoring has also been done to make the code clearer and more maintainable. This is still a work in progress.

TICCL-unk:
- more extensive acronym detection
- fixed artifreq problems in 'clean' punctuated words
- added filters for 'unwanted' characters
- added a ligature filter to convert evil ligatures
- normalize all hyphens to a 'normal' one (-)
- use a better definition of punctuation (the Unicode character class is not good enough to decide)
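
As an illustration of the ligature and hyphen items above, here is a minimal Python sketch of this kind of character folding. It is not taken from TICCL-unk: the release notes do not say which characters are affected, so the two tables and the function name normalize_word are assumptions chosen for the example.

```python
# Hypothetical set of hyphen-like code points that are folded to '-'.
# The actual set used by TICCL-unk is not given in the release notes.
HYPHEN_LIKE = {
    "\u2010",  # HYPHEN
    "\u2011",  # NON-BREAKING HYPHEN
    "\u2012",  # FIGURE DASH
    "\u2013",  # EN DASH
    "\u2014",  # EM DASH
    "\u2212",  # MINUS SIGN
}

# Hypothetical ligature table: each ligature maps to its plain letters.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
}


def normalize_word(word: str) -> str:
    """Replace hyphen-like characters by '-' and expand ligatures."""
    out = []
    for ch in word:
        if ch in HYPHEN_LIKE:
            out.append("-")
        else:
            # Characters without a table entry are kept unchanged.
            out.append(LIGATURES.get(ch, ch))
    return "".join(out)
```

Under these assumptions, a word containing the "fi" ligature and an en dash comes out with a plain "fi" and an ASCII hyphen.
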
TICCL-lexstat:
- the 'separator' symbol should get freq=0, so it isn't counted
- the clip value is added to the output filename
TICCL-indexer:
- indexer and indexerNT now produce the same output, using different strategies when a --foci file is used.
TICCL-LDcalc: major overhaul for n-grams
- added an n-gram point column to the output (so NOT backward compatible!)
- produce a '.short' list for short word corrections
- produce a '.ambi' file with a list of n-grams related to short words
- prune a lot of n-grams from the output
TICCL-rank:
- output is sorted now
- honor the n-gram points from the new LDcalc (so NOT backward compatible!)
TICCL-chain: new module to chain ranked files
TICCL-lexclean:
- added a -x option for 'inverse' alphabet
TICCL-anahash:
- added a --list option to produce a list of words and anagram values
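
To make "anagram values" concrete, the following Python sketch computes an order-independent value per word, so that words built from the same characters share one value. It is only an illustration: the use of Unicode code points as character values, the power of 5, and both function names are assumptions, and TICCL-anahash may assign character values differently. The real --list output format is not described in these notes.

```python
def anagram_value(word: str, power: int = 5) -> int:
    """Sum a fixed power of each character's numeric value.

    Assumption: the character value is the Unicode code point.
    Addition is commutative, so character order does not matter.
    """
    return sum(ord(ch) ** power for ch in word)


def list_anagram_values(words):
    """Pair each word with its anagram value, keeping input order."""
    return [(word, anagram_value(word)) for word in words]
```

With this definition "listen" and "silent" receive the same value, while "listed" receives a different one.
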
added metadata file: codemeta.json

Files

Files (142.1 MB)

Name	Size
LanguageMachines/ticcltools-v0.6.zip (md5:6618336635efa1456b8a081df4df65f9)	142.1 MB

Additional details

Is supplement to: https://github.com/LanguageMachines/ticcltools/tree/v0.6 (URL)

	All versions	This version
Views	327	43
Downloads	63	5
Data volume	8.3 GB	710.4 MB
