Natas: A library for normalizing historical English

doi:10.5281/zenodo.3451858

Published September 20, 2019 | Version 1.0.2

Software Open


Hämäläinen, Mika¹

1. University of Helsinki

Natas is a Python 3 library for processing historical English, covering normalization and OCR correction.

1. Cite

If you use the library, please cite one of the following publications depending on whether you used it for normalization or OCR correction.

1.1 Normalization

Mika Hämäläinen, Tanja Säily, Jack Rueter, Jörg Tiedemann, and Eetu Mäkelä. 2019. Revisiting NMT for Normalization of Early English Letters. In Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature.

1.2 OCR correction

Mika Hämäläinen and Simon Hengchen. 2019. From the Paft to the Fiiture: a Fully Automatic NMT and Word Embeddings Method for OCR Post-Correction. In Proceedings of Recent Advances in Natural Language Processing.

Files

Name	Checksum	Size
mikahama/natas-1.0.2.zip	md5:fc56b802b37bc0acd3fb888057785019	134.0 MB
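
The MD5 checksum listed for the archive can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_expected`, and the local filename in the usage note, are illustrative assumptions and not part of Natas.

```python
import hashlib

# Checksum as listed for mikahama/natas-1.0.2.zip in this record.
EXPECTED_MD5 = "fc56b802b37bc0acd3fb888057785019"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a 134 MB archive is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 equals the expected hex digest."""
    return md5_of_file(path) == expected.lower()
```

Assuming the archive was saved locally as `natas-1.0.2.zip`, calling `matches_expected("natas-1.0.2.zip")` should return `True` for an uncorrupted download.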

Additional details

Is supplement to: https://github.com/mikahama/natas/tree/1.0.2 (URL)

