Julien Barnier; Florian Privé; Kenneth Benoit
rainette2 have been renamed to
0, which means that no merging is done between segments by default. Results could then be different from previous package versions when min_uc_size was not specified.
min_segment_size was not handled correctly in previous versions with regard to segment sources, as segments from different documents could be merged together. This should now be fixed.
clusters_by_doc_table which gives the number of segments of each cluster for each document.
docs_by_cluster_table which gives, for each cluster, the number of documents with at least one segment in this cluster.
split_segments should now be about 4 times faster.
When rainette is called with min_segment_size > 0, a doc_id argument must be given, which is the name of a dtm docvar identifying the segments source. If the corpus has been produced by split_segments, the added segment_source docvar is used by default.
When rainette_explor is called on a rainette2 result, rainette2_explor is launched automatically.
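To make the two table descriptions above concrete, here is a minimal sketch of the counts they describe. It is written in plain Python and is not the package's code or API: the toy data and the helper names segments_per_cluster_by_doc and docs_per_cluster are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: one (document id, cluster) pair per segment.
segments = [
    ("doc1", 1), ("doc1", 1), ("doc1", 2),
    ("doc2", 2), ("doc2", 2),
    ("doc3", 1), ("doc3", 3),
]


def segments_per_cluster_by_doc(segments):
    """For each document, count how many of its segments fall in each cluster."""
    table = defaultdict(Counter)
    for doc, cluster in segments:
        table[doc][cluster] += 1
    return {doc: dict(counts) for doc, counts in table.items()}


def docs_per_cluster(segments):
    """For each cluster, count the documents with at least one segment in it."""
    docs = defaultdict(set)
    for doc, cluster in segments:
        docs[cluster].add(doc)
    return {cluster: len(members) for cluster, members in docs.items()}


by_doc = segments_per_cluster_by_doc(segments)
by_cluster = docs_per_cluster(segments)
print(by_doc)
print(by_cluster)
```

The first table is a per-document count of segments, while the second counts each document at most once per cluster, so a document with many segments in the same cluster does not inflate that cluster's total.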