Dataset Open Access

HWRT database of handwritten symbols

Thoma, Martin

The HWRT database of handwritten symbols contains on-line data of handwritten symbols, including all alphanumeric characters, arrows, Greek characters and mathematical symbols such as the integral symbol.

The database can be downloaded as bzip2-compressed tar files. Each tar file contains:

  • symbols.csv: A CSV file with the columns symbol_id, latex, training_samples and test_samples. The symbol_id is an integer, the column latex contains the LaTeX code of the symbol, and the columns training_samples and test_samples contain integers giving the number of labeled recordings.
  • train-data.csv: A CSV file with the columns symbol_id, user_id, user_agent and data.
  • test-data.csv: A CSV file with the columns symbol_id, user_id, user_agent and data.

All CSV files use ";" as the delimiter and "'" as the quote character. The data column is given in YAML format as a list of lists of dictionaries. Each dictionary has the keys "x", "y" and "time", where (x, y) are coordinates and time is the UNIX time.
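The file layout described above could be read along the following lines. This is a minimal sketch, not the author's own loader. It assumes that each CSV file starts with a header row naming the columns, and that the content of the data column is also valid JSON, so that the standard library can parse it. If that does not hold for the real files, pass a YAML parser such as PyYAML's yaml.safe_load as parse_data instead. The function name read_hwrt_csv and the sample row are hypothetical.

```python
import csv
import io
import json


def read_hwrt_csv(fileobj, parse_data=json.loads):
    """Yield one dict per row of train-data.csv or test-data.csv.

    Assumption: the first line of the file is a header row.
    Assumption: the data column is JSON-compatible YAML; otherwise pass
    a YAML parser (for example yaml.safe_load from PyYAML) as parse_data.
    """
    reader = csv.DictReader(fileobj, delimiter=";", quotechar="'")
    for row in reader:
        # The data column holds a list of lists of {"x", "y", "time"} dicts.
        row["data"] = parse_data(row["data"])
        yield row


# Hypothetical sample in the documented layout, not real data from the set.
SAMPLE = (
    "symbol_id;user_id;user_agent;data\n"
    "31;10;'Mozilla/5.0 (X11; Linux x86_64)';"
    "'[[{\"x\": 1, \"y\": 2, \"time\": 1421000000}, "
    "{\"x\": 3, \"y\": 4, \"time\": 1421000001}]]'\n"
)

rows = list(read_hwrt_csv(io.StringIO(SAMPLE)))
for row in rows:
    points = [p for point_list in row["data"] for p in point_list]
    print(row["symbol_id"], len(row["data"]), len(points))
```

The quote character matters here: user agent strings often contain ";", so they only parse correctly when "'" is set as the quote character. For real files, open them with open(path, newline="", encoding="utf-8"). If a data field is longer than the csv module's default field size limit, the limit can be raised with csv.field_size_limit.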

 

About 90% of the data was made available by Daniel Kirsch via github.com/kirel/detexify-data. Thank you very much, Daniel!

Files (140.8 MB)
Name: 2015-01-28-data.tar
Size: 140.8 MB
md5: 2bf1d089ce65c0a39e57064516f1bd1c
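
The md5 checksum listed for the archive can be used to check that a download is complete. The following is a minimal sketch using only the Python standard library. The helper names md5_of_file and verify_download are hypothetical, and the local file path is whatever name the archive was saved under.

```python
import hashlib

# Checksum as listed for 2015-01-28-data.tar on this page.
EXPECTED_MD5 = "2bf1d089ce65c0a39e57064516f1bd1c"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at path has the expected md5 checksum."""
    return md5_of_file(path) == expected
```

Calling verify_download("2015-01-28-data.tar") returns True only if the local file matches the listed checksum. Once verified, the archive can be opened with tarfile.open(path, "r:*"), which detects the compression automatically.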