Dataset Open Access

HWRT database of handwritten symbols

Thoma, Martin


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Thoma, Martin</dc:creator>
  <dc:date>2015-01-28</dc:date>
  <dc:description>The HWRT database of handwritten symbols contains on-line data of handwritten symbols, including all alphanumeric characters, arrows, Greek characters and mathematical symbols such as the integral symbol.

The database can be downloaded as bzip2-compressed tar files. Each tar file contains:


	symbols.csv: A CSV file with the columns symbol_id, latex, training_samples and test_samples. The symbol id is an integer, the column latex contains the LaTeX code of the symbol, and the columns training_samples and test_samples contain integers giving the number of labeled recordings.
	train-data.csv: A CSV file with the columns symbol_id, user_id, user_agent and data.
	test-data.csv: A CSV file with the columns symbol_id, user_id, user_agent and data.


All CSV files use ";" as delimiter and "'" as quotechar. The data column is given in YAML format as a list of lists of dictionaries. Each dictionary has the keys "x", "y" and "time", where (x, y) are coordinates and time is the UNIX time.

About 90% of the data was made available by Daniel Kirsch via github.com/kirel/detexify-data. Thank you very much, Daniel!</dc:description>
  <dc:identifier>https://zenodo.org/record/50022</dc:identifier>
  <dc:identifier>10.5281/zenodo.50022</dc:identifier>
  <dc:identifier>oai:zenodo.org:50022</dc:identifier>
  <dc:relation>url:http://www.martin-thoma.de/write-math/data/</dc:relation>
  <dc:relation>url:https://zenodo.org/record/259444</dc:relation>
  <dc:relation>url:https://zenodo.org/communities/computer-vision</dc:relation>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>http://www.opendatacommons.org/licenses/odbl/1.0/</dc:rights>
  <dc:subject>symbol</dc:subject>
  <dc:subject>LaTeX</dc:subject>
  <dc:subject>mathematics</dc:subject>
  <dc:subject>pattern recognition</dc:subject>
  <dc:subject>machine learning</dc:subject>
  <dc:subject>on-line recognition</dc:subject>
  <dc:title>HWRT database of handwritten symbols</dc:title>
  <dc:type>info:eu-repo/semantics/other</dc:type>
  <dc:type>dataset</dc:type>
</oai_dc:dc>
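
The CSV layout described above (";" as delimiter, "'" as quotechar, YAML in the data column) can be read roughly as follows. This is a minimal sketch, not the author's tooling: the function names and the sample row are made up for illustration, it assumes the first line of each file is a header with the column names, and it falls back to the json module only under the assumption that the YAML happens to be JSON-compatible when PyYAML is not installed.

```python
import csv
import io
import json

try:
    import yaml  # PyYAML (third-party); the description says the data column is YAML
    _load_strokes = yaml.safe_load
except ImportError:
    # Fallback assumption: the YAML is JSON-compatible flow style.
    _load_strokes = json.loads


def read_hwrt_csv(text_stream):
    """Yield one dict per record, using ';' as delimiter and "'" as quotechar."""
    reader = csv.DictReader(text_stream, delimiter=";", quotechar="'")
    for row in reader:
        yield row


def parse_recording(data_field):
    """Turn the 'data' column into a list of strokes (lists of point dicts)."""
    strokes = _load_strokes(data_field)
    for stroke in strokes:
        for point in stroke:
            missing = {"x", "y", "time"} - set(point)
            if missing:
                raise ValueError(f"point is missing keys: {sorted(missing)}")
    return strokes


# Made-up sample in the described layout (not taken from the real dataset).
SAMPLE = (
    "symbol_id;user_id;user_agent;data\n"
    "31;10;'Mozilla/5.0; test';"
    "'[[{\"x\": 1, \"y\": 2, \"time\": 100}, {\"x\": 3, \"y\": 4, \"time\": 110}]]'\n"
)

rows = list(read_hwrt_csv(io.StringIO(SAMPLE)))
strokes = parse_recording(rows[0]["data"])
print(rows[0]["symbol_id"], len(strokes), strokes[0][1]["x"])
```

For the real files, the same reader could be pointed at a member of the downloaded archive opened with the standard tarfile module in "r:bz2" mode. Long recordings may exceed the csv module's default field size limit, in which case csv.field_size_limit can be raised.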
                   All versions   This version
Views              1,151          1,153
Downloads          203            203
Data volume        28.6 GB        28.6 GB
Unique views       1,095          1,097
Unique downloads   164            164