Creating a Corpus and Chained Bigrams for Spanish Keyboard Development and Evaluation
Authors/Creators
Description
This record describes a process for creating a corpus suitable for evaluating computer keyboard layouts optimised for typing Spanish. After suitable texts are sourced, sampled and cleaned, they are processed to extract bigrams, which are then chained to create sample input texts of a desired length. Although these texts look random, their character distribution and letter sequences closely match those of Spanish, which makes them well suited to evaluating keyboard layouts. An analysis of the corpus is included.
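As a rough illustration of the bigram step, the following Python sketch counts character bigrams in a cleaned text and chains them into a sample of a chosen length. It is not the author's implementation: the function names (`count_bigrams`, `build_transitions`, `generate`), the weighted random choice of each next character, and the restart rule for dead ends are all assumptions made for this example, and the short `sample` string only stands in for a real corpus.

```python
import random
from collections import Counter, defaultdict


def count_bigrams(text):
    """Count every pair of adjacent characters in the text."""
    return Counter(zip(text, text[1:]))


def build_transitions(bigrams):
    """Group bigram counts by first character: table[a][b] = count of 'ab'."""
    table = defaultdict(Counter)
    for (first, second), n in bigrams.items():
        table[first][second] += n
    return table


def generate(table, length, seed=None):
    """Chain bigrams into a text of `length` characters.

    Each next character is drawn with probability proportional to how
    often it followed the current character in the source text
    (an assumed chaining rule, not taken from the original work).
    """
    if length <= 0 or not table:
        return ""
    rng = random.Random(seed)
    starts = list(table)  # characters that have at least one follower
    current = rng.choice(starts)
    out = [current]
    while len(out) < length:
        followers = table.get(current)
        if followers:
            chars = list(followers)
            weights = list(followers.values())
            current = rng.choices(chars, weights=weights)[0]
        else:
            # Dead end (character only seen at the very end of the text):
            # restart from a random character. This can introduce one
            # pair that never occurred in the source.
            current = rng.choice(starts)
        out.append(current)
    return "".join(out)


# Placeholder text standing in for the cleaned Spanish corpus.
sample = "el niño come la manzana y la niña come el pan"
table = build_transitions(count_bigrams(sample))
print(generate(table, 60, seed=1))
```

With a large corpus, the relative frequencies of characters and character pairs in the generated text should approach those of the source, which is the property the description relies on. The `seed` argument is included so that a given test text can be reproduced when comparing layouts.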