Software Open Access

Pie Model for Classical French -- Part-of-Speech and Morphology (CATTEX2009-max)

Camps, Jean-Baptiste; Gabay, Simon; Clérice, Thibault; Cafiero, Florian


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Camps, Jean-Baptiste</dc:creator>
  <dc:creator>Gabay, Simon</dc:creator>
  <dc:creator>Clérice, Thibault</dc:creator>
  <dc:creator>Cafiero, Florian</dc:creator>
  <dc:date>2020-03-04</dc:date>
  <dc:description>Pie model for Classical French that predicts part-of-speech and morphology tags (CATTEX2009-max).

Trained on a corpus of Classical French Theatre.

More information:

- corpus: Camps, Jean-Baptiste, &amp; Cafiero, Florian. (2019). Stylometric Analysis of Classical French Theatre [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3353421.

- F. Cafiero and J.B. Camps, Why Molière most likely did write his plays, Science Advances, 27 Nov 2019: Vol. 5, no. 11, eaax5489, DOI: 10.1126/sciadv.aax5489, https://advances.sciencemag.org/content/5/11/eaax5489/.

- J.B. Camps, S. Gabay, Th. Clérice and F. Cafiero, Corpus and Models for Lemmatisation and POS-tagging of Classical French Theatre, to be published.

Current results on test data:

::: Evaluation report for task: pos :::

all:
  accuracy: 0.9701
  precision: 0.92
  recall: 0.8964
  support: 4181
ambiguous-tokens:
  accuracy: 0.9229
  precision: 0.9203
  recall: 0.9175
  support: 934
unknown-tokens:
  accuracy: 0.8165
  precision: 0.4798
  recall: 0.4904
  support: 218

::: Evaluation report for task: MODE :::

all:
  accuracy: 0.9818
  precision: 0.8765
  recall: 0.8517
  support: 4181
ambiguous-tokens:
  accuracy: 0.84
  precision: 0.8483
  recall: 0.7612
  support: 125
unknown-tokens:
  accuracy: 0.8211
  precision: 0.7256
  recall: 0.658
  support: 218


::: Classification report :::

| target      | precision | recall | f1-score | support |
|-------------|-----------|--------|----------|---------|
| MODE=con    | 0.81      | 0.94   | 0.87     | 18      |
| MODE=imp    | 0.83      | 0.78   | 0.80     | 68      |
| MODE=ind    | 0.91      | 0.92   | 0.92     | 341     |
| MODE=sub    | 0.84      | 0.62   | 0.71     | 60      |
| MODE=x      | 0.99      | 1.00   | 1.00     | 3694    |
| avg / total | 0.88      | 0.85   | 0.86     | 4181    |


::: Evaluation report for task: TEMPS :::

all:
  accuracy: 0.9871
  precision: 0.9305
  recall: 0.9259
  support: 4181
ambiguous-tokens:
  accuracy: 0.9135
  precision: 0.623
  recall: 0.6072
  support: 104
unknown-tokens:
  accuracy: 0.8394
  precision: 0.8693
  recall: 0.5399
  support: 218


::: Classification report :::

| target      | precision | recall | f1-score | support |
|-------------|-----------|--------|----------|---------|
| TEMPS=fut   | 0.98      | 0.85   | 0.91     | 47      |
| TEMPS=ipf   | 0.93      | 0.88   | 0.90     | 16      |
| TEMPS=psp   | 0.80      | 1.00   | 0.89     | 4       |
| TEMPS=pst   | 0.95      | 0.91   | 0.93     | 334     |
| TEMPS=x     | 0.99      | 1.00   | 0.99     | 3780    |
| avg / total | 0.93      | 0.93   | 0.92     | 4181    |


::: Evaluation report for task: PERS :::

all:
  accuracy: 0.9859
  precision: 0.9821
  recall: 0.9668
  support: 4181
ambiguous-tokens:
  accuracy: 0.942
  precision: 0.9178
  recall: 0.9188
  support: 362
unknown-tokens:
  accuracy: 0.8394
  precision: 0.9426
  recall: 0.6344
  support: 218


::: Classification report :::

| target      | precision | recall | f1-score | support |
|-------------|-----------|--------|----------|---------|
| PERS.=1     | 0.98      | 0.96   | 0.97     | 429     |
| PERS.=2     | 0.97      | 0.97   | 0.97     | 258     |
| PERS.=3     | 0.99      | 0.94   | 0.96     | 410     |
| PERS.=x     | 0.99      | 1.00   | 0.99     | 3084    |
| avg / total | 0.98      | 0.97   | 0.97     | 4181    |


::: Evaluation report for task: NOMB :::

all:
  accuracy: 0.9797
  precision: 0.9809
  recall: 0.9733
  support: 4181
ambiguous-tokens:
  accuracy: 0.7865
  precision: 0.7511
  recall: 0.6884
  support: 192
unknown-tokens:
  accuracy: 0.8349
  precision: 0.7918
  recall: 0.7729
  support: 218


::: Classification report :::

| target      | precision | recall | f1-score | support |
|-------------|-----------|--------|----------|---------|
| NOMB.=p     | 0.98      | 0.95   | 0.97     | 545     |
| NOMB.=s     | 0.98      | 0.98   | 0.98     | 1831    |
| NOMB.=x     | 0.98      | 0.99   | 0.98     | 1805    |
| avg / total | 0.98      | 0.97   | 0.98     | 4181    |

::: Evaluation report for task: GENRE :::

all:
  accuracy: 0.9749
  precision: 0.969
  recall: 0.9685
  support: 4181
ambiguous-tokens:
  accuracy: 0.9118
  precision: 0.9063
  recall: 0.9208
  support: 465
unknown-tokens:
  accuracy: 0.7385
  precision: 0.7097
  recall: 0.6977
  support: 218


::: Classification report :::

| target      | precision | recall | f1-score | support |
|-------------|-----------|--------|----------|---------|
| GENRE=f     | 0.92      | 0.94   | 0.93     | 387     |
| GENRE=m     | 0.97      | 0.94   | 0.96     | 940     |
| GENRE=n     | 1.00      | 1.00   | 1.00     | 45      |
| GENRE=x     | 0.98      | 0.99   | 0.99     | 2809    |
| avg / total | 0.97      | 0.97   | 0.97     | 4181    |


::: Evaluation report for task: CAS :::

all:
  accuracy: 0.9983
  precision: 0.9957
  recall: 0.9901
  support: 4181
ambiguous-tokens:
  accuracy: 0.9648
  precision: 0.9796
  recall: 0.9692
  support: 199
unknown-tokens:
  accuracy: 1.0
  precision: 1.0
  recall: 1.0
  support: 218


::: Classification report :::

| target      | precision | recall | f1-score | support |
|-------------|-----------|--------|----------|---------|
| CAS=i       | 1.00      | 1.00   | 1.00     | 46      |
| CAS=n       | 1.00      | 1.00   | 1.00     | 190     |
| CAS=r       | 0.98      | 0.96   | 0.97     | 128     |
| CAS=x       | 1.00      | 1.00   | 1.00     | 3817    |
| avg / total | 1.00      | 0.99   | 0.99     | 4181    |

 </dc:description>
  <dc:identifier>https://zenodo.org/record/3701320</dc:identifier>
  <dc:identifier>10.5281/zenodo.3701320</dc:identifier>
  <dc:identifier>oai:zenodo.org:3701320</dc:identifier>
  <dc:language>fra</dc:language>
  <dc:relation>doi:10.5281/zenodo.3243486</dc:relation>
  <dc:relation>doi:10.5281/zenodo.3696675</dc:relation>
  <dc:relation>url:https://zenodo.org/communities/natural-language-processing</dc:relation>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>https://creativecommons.org/licenses/by/4.0/legalcode</dc:rights>
  <dc:subject>Natural language processing</dc:subject>
  <dc:subject>Part-of-speech tagging</dc:subject>
  <dc:subject>Classical French</dc:subject>
  <dc:subject>French Language</dc:subject>
  <dc:subject>Deep Learning</dc:subject>
  <dc:title>Pie Model for Classical French -- Part-of-Speech and Morphology (CATTEX2009-max)</dc:title>
  <dc:type>info:eu-repo/semantics/other</dc:type>
  <dc:type>software</dc:type>
</oai_dc:dc>
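
The evaluation reports in the description do not say how the "all" precision and recall figures, or the "avg / total" rows, are averaged over classes. The sketch below is one way to check this for the MODE task. It copies the per-class values from the MODE classification report and compares an unweighted (macro) mean with a support-weighted mean. The names `MODE_REPORT`, `macro_average`, and `weighted_average` are illustrative and are not part of Pie. Because the per-class values are rounded to two decimals, the comparison is only approximate.

```python
from statistics import fmean

# Per-class (precision, recall, support), copied from the MODE classification report.
MODE_REPORT = {
    "MODE=con": (0.81, 0.94, 18),
    "MODE=imp": (0.83, 0.78, 68),
    "MODE=ind": (0.91, 0.92, 341),
    "MODE=sub": (0.84, 0.62, 60),
    "MODE=x": (0.99, 1.00, 3694),
}


def macro_average(report, column):
    """Unweighted mean over classes; column 0 is precision, column 1 is recall."""
    return fmean(row[column] for row in report.values())


def weighted_average(report, column):
    """Mean over classes, each class weighted by its support (column 2)."""
    total_support = sum(row[2] for row in report.values())
    weighted_sum = sum(row[column] * row[2] for row in report.values())
    return weighted_sum / total_support


macro_p = macro_average(MODE_REPORT, 0)
macro_r = macro_average(MODE_REPORT, 1)
weighted_p = weighted_average(MODE_REPORT, 0)

print(f"macro precision:    {macro_p:.3f}")
print(f"macro recall:       {macro_r:.3f}")
print(f"weighted precision: {weighted_p:.3f}")
```

The macro means come out at about 0.876 (precision) and 0.852 (recall), close to the reported 0.8765 and 0.8517, while the support-weighted precision is about 0.98. This suggests the reported precision and recall are macro-averaged, so small classes such as MODE=sub (support 60) pull the averages down far more than they affect accuracy.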