Published October 25, 2018 | Version v0.7.1
Software Open

bootphon/wordseg: wordseg-0.7.1

  • 1. INRIA, @bootphon

Description

  • New evaluation metrics in wordseg-eval:

    • adjusted rand index:

      Computing it requires the prepared text (the other metrics rely only on the segmented and gold texts), so it is implemented as the option --rand-index <prep-file> in wordseg-eval.

      An easier implementation would have been to change the specification of wordseg-eval to take the prepared text instead of the gold one, but we preferred the optional --rand-index for backward compatibility.

    • segmentation errors summary:

      A detailed report of segmentation errors, each classified as undersegmentation, oversegmentation or missegmentation. Implemented as the option --summary <json-file> in wordseg-eval.

  • In wordseg-dibs, the baseline algorithm is renamed to gold to avoid confusion with wordseg-baseline. See #48.

  • tools/wordseg-qsub.sh is renamed to tools/wordseg-sge.sh, and the new tools/wordseg-{slurm, bash}.sh submit jobs on SLURM-based clusters and run them locally using bash.

  • Bugfix in tools/wordseg-{sge, slurm, bash}.sh: wordseg-dibs is now correctly handled (there was a problem with the train file). Those tools now include the full pipeline, including statistics and text preparation.
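
To illustrate why the adjusted rand index needs the prepared text, here is a minimal Python sketch. It is not the wordseg implementation: the function names word_labels and adjusted_rand_index, the labelling scheme (each unit is labelled with the index of the word containing it) and the toy utterance are all assumptions made for this example. The prepared text supplies the list of units, which cannot be recovered from the segmented and gold texts alone when a unit spans several characters.

```python
from collections import Counter


def pairs(count):
    """Number of unordered pairs that can be drawn from `count` items."""
    return count * (count - 1) // 2


def word_labels(units, words):
    """Label each unit with the index of the word that contains it.

    `units` is one prepared utterance as a list of units (e.g. phones).
    `words` is a segmentation of that utterance, each word being the
    concatenation of consecutive units. Both names are hypothetical.
    """
    labels, pos = [], 0
    for index, word in enumerate(words):
        built = ""
        while built != word:
            # No more units, or already too long to ever match the word.
            if pos >= len(units) or len(built) >= len(word):
                raise ValueError("words do not match the prepared units")
            built += units[pos]
            labels.append(index)
            pos += 1
    if pos != len(units):
        raise ValueError("words do not cover all the prepared units")
    return labels


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total = pairs(len(labels_a))
    if total == 0:
        return 1.0  # fewer than two items: the partitions trivially agree
    # Pair counts from the contingency table and from its two margins.
    together = sum(pairs(c) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(pairs(c) for c in Counter(labels_a).values())
    sum_b = sum(pairs(c) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / total
    maximum = (sum_a + sum_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both partitions are identical
    return (together - expected) / (maximum - expected)


# Toy utterance (assumed data): five units, two of them two characters long.
units = ["dh", "ax", "d", "ao", "g"]
gold = ["dhax", "daog"]
segmented = ["dh", "axdaog"]

score = adjusted_rand_index(word_labels(units, gold),
                            word_labels(units, segmented))
print(score)
```

The index is 1.0 when the two segmentations group the units identically and is close to 0 when they agree no more than chance would predict.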

Files

bootphon/wordseg-v0.7.1.zip (273.7 kB)
md5:4b010f189daabeade2b7249c8970e015

Additional details

Related works