There is a newer version of the record available.

Published February 20, 2023 | Version v1
Dataset Open

Reducing the Underrepresentation of Transnational Writers through Biographical Event Extraction

  • 1. University of Turin

Description

Wikidata is an important source of literary knowledge, collaboratively created and curated by a large community of users. The archive contains hundreds of thousands of pages about writers and their works. However, as recently demonstrated, Wikidata is affected by the underrepresentation of Transnational authors. The issue is present at different levels: not only are Transnational writers fewer in number, but there is also less biographical information about them on their pages. In this paper we present an approach for reducing this form of underrepresentation by automatically extracting biographical information from Wikipedia through transformers and lexico-semantic patterns, and encoding it into the Wikidata semantic model. Results show that our approach increases the number of biographical triples on Wikidata for all writers while rebalancing the knowledge base in favour of Transnational writers.
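As a rough illustration of what a lexico-semantic pattern feeding a Wikidata-style triple could look like, the sketch below maps two surface patterns to property identifiers. It is not the authors' pipeline: the regular expressions, the `extract_triples` helper, and the subject identifier are assumptions made for the example, and the transformer component is left out entirely. Only the property IDs P19 (place of birth) and P20 (place of death) are taken from Wikidata.

```python
import re

# Illustrative patterns only (an assumption, not the patterns used in the paper).
# Each pattern is paired with a Wikidata property ID:
#   P19 = place of birth, P20 = place of death.
PLACE = r"(?P<obj>[A-Z][\w-]*(?: [A-Z][\w-]*)*)"
PATTERNS = [
    (re.compile(r"\bwas born in " + PLACE), "P19"),
    (re.compile(r"\bdied in " + PLACE), "P20"),
]


def extract_triples(subject, sentence):
    """Return (subject, property, object) tuples matched in one sentence."""
    triples = []
    for pattern, prop in PATTERNS:
        for match in pattern.finditer(sentence):
            triples.append((subject, prop, match.group("obj")))
    return triples


if __name__ == "__main__":
    # "Q_WRITER" is a placeholder subject, not a real Wikidata item.
    text = "She was born in Buenos Aires and later died in Paris."
    for triple in extract_triples("Q_WRITER", text):
        print(triple)
```

A real system would also need entity linking to resolve the extracted place strings to Wikidata items before any triple could be added to the knowledge base.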

Files

event_extraction_evaluation.csv

Files (25.1 MB)

Name Size
md5:ddc4cf52c6db1604d9c782a8112c6471 24.6 MB
md5:bd1db95c9eba157c0d0ef5b01ad54034 405.1 kB
md5:59a74683be92f756d3df961dcfb4b3e8 28.0 kB
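The MD5 checksums listed above can be used to check that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `verify` and the local path in the usage line are assumptions, since the listing does not show which file name belongs to which checksum.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Compare a file's MD5 digest with a value such as 'md5:<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `verify("downloaded_file", "md5:ddc4cf52c6db1604d9c782a8112c6471")` would return `True` if the local copy (here a placeholder path) matches the 24.6 MB file in the record.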