Published May 1, 2020 | Version v1

Current Challenges in Web Corpus Building

  • 1. Lexical Computing

Description

In this paper we discuss some of the current challenges in web corpus building that we have faced in recent years while expanding the corpora in Sketch Engine. The purpose of the paper is to provide an overview and prompt discussion of possible solutions, rather than to present ready-made ones. For each issue we try to assess its severity and briefly discuss possible mitigation options.

Files

Name: 2020.wac-1.0.pdf
Size: 67.5 kB
md5: 70bbac451a02d8b2e3c91be0f3b01f9d

Additional details

Funding

European Commission
ELEXIS - European Lexicographic Infrastructure (grant agreement 731015)