Inputlog Copy Task Corpus: Exploring and defining typing skills
University of Antwerp
Description
Context
The keystroke logging program Inputlog (https://www.inputlog.net) includes a Copy Task component. It consists of a multi-layered set of tasks that measure a person's typing skill:
| Task | Description |
|---|---|
| Tapping task | press the ‘d’ and ‘k’ keys alternately for 15 s |
| Sentence | copy a sentence for 30 s |
| Word combination 1 | copy a combination of three words seven times |
| Word combination 2 | copy a combination of three words seven times |
| Word combination 3 | copy a combination of three words seven times |
| Word combination 4 | copy a combination of three words seven times |
| Consonant groups | copy four blocks of six consonants once |
The task is currently available in twelve languages.
For more information: https://doi.org/10.5334/jors.234
Interactive Dashboard
Visit the webpage with an interactive dashboard to explore, filter, and download the +5K copy task corpus.
website: https://www.inputlog.net/copy-task/
dashboard: https://inputlog-analysis.uantwerpen.be/expert
Corpus
We are happy to make a multilingual corpus available (open access) that currently consists of more than 5000 copy tasks.
- The +5K corpus is carefully cleaned and fully anonymized.
- The Shiny interface allows users to filter the corpus based on about 10 variables.
- The selection can be downloaded in different formats and levels of aggregation (from raw idfx to synthesized analysis).
- The selection can be explored using different interactive graph visualizations.
- Researchers can upload their own corpus (or single copy task file) and compare it to the (selected) corpus.
- A separate webpage lets laypersons take a copy task to test their typing skills. They receive dashboard feedback in a user-friendly and attractive form and can compare their performance with that of participants of a similar age in the corpus. (This page is specifically designed to further expand the corpus.)
Facts and Figures
Some facts and figures about the corpus' composition:
Languages:
- Dutch 3130 files
- English 1163 files
- German 281 files
- French 201 files
- Other 378 files
Gender
- Female: 3495 files
- Male: 1276 files
- X or missing: 382 files
Age
- 15- 439 files
- 16-20 1591 files
- 21-25 2427 files
- 26-35 478 files
- 36-45 126 files
- 46+ 230 files
A subset of the total corpus has been uploaded here. It contains about 500 tests (English, 21-25-year-olds).
Files
| Name | Size | Checksum |
|---|---|---|
| sub-dataset_EN_21-25year.zip | 336.6 kB | md5:096cdbd402d4f9ba784a3f335f9fbc1a |
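The MD5 checksum listed for the archive can be used to confirm that a download is intact. The following is a minimal sketch, assuming the zip file has been saved under its original name in the current working directory; the helper name `md5_of_file` and the local path are illustrative choices, not part of the corpus or of Inputlog.

```python
import hashlib
from pathlib import Path

# Checksum as listed for sub-dataset_EN_21-25year.zip in the file table.
EXPECTED_MD5 = "096cdbd402d4f9ba784a3f335f9fbc1a"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Assumed local path: the archive saved under its original file name.
    archive = Path("sub-dataset_EN_21-25year.zip")
    if not archive.exists():
        print(f"{archive} not found; download it first.")
    elif md5_of_file(archive) == EXPECTED_MD5:
        print("Checksum matches the listed value.")
    else:
        print("Checksum mismatch; the download may be incomplete or corrupted.")
```

MD5 serves here only as an integrity check against transfer errors, which matches how the checksum is listed for the file.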
Additional details
References
- https://www.inputlog.net/copy-task/
- Van Waes, L., Leijten, M., Roeser, J., Olive, T., & Grabowski, J. (2021). Measuring and assessing typing skills in writing research. Journal of Writing Research, 13(1), 107-153. https://doi.org/10.17239/jowr-2021.13.01.04
- Van Waes, L., Leijten, M., Pauwaert, T., & Van Horenbeeck, E. (2019). A multilingual copy task: Measuring typing and motor skills in writing with Inputlog. Journal of Open Research Software, 7(1:30), 1-8. https://doi.org/10.5334/jors.234