Published November 21, 2019 | Version 1.0
Dataset Open

French Word Sense Disambiguation with Princeton WordNet Identifiers

Creators

  • 1. Univ. Grenoble Alpes

Description

This dataset supports Word Sense Disambiguation of French using Princeton WordNet identifiers. It contains two training corpora, SemCor and the WordNet Gloss Corpus, both automatically translated from their original English versions, with sense tags automatically aligned. It also contains a test corpus: Task 12 of SemEval 2013, originally sense-annotated with BabelNet identifiers and converted to Princeton WordNet 3.0 identifiers.

Files

semcor.fr.xml

Files (271.6 MB)

Checksum                                Size
md5:87f0a390ccfda959de51063aad08082d    86.0 MB
md5:17530c037c658dd56eccb00bd8e66b7b    834.6 kB
md5:8a8d772b920667ab375c2460c519c9f0    184.7 MB
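
The file listing gives an MD5 checksum for each file, which can be used to confirm that a download completed without corruption. The following is a minimal sketch in Python using only the standard library. The function names `md5_of_file` and `matches_listing` are hypothetical helpers introduced here for illustration, and the listing does not show which checksum belongs to which file name, so the path passed in is up to the reader.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for the larger corpus files.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a local file against a checksum written as 'md5:<hex>'.

    The 'md5:' prefix follows the form used in the file listing;
    a bare hex string is accepted as well.
    """
    expected = listed_checksum.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listing(local_path, "md5:87f0a390ccfda959de51063aad08082d")` would return `True` only if `local_path` points to an intact copy of the 86.0 MB file.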