Crowdsourcing Online Handwriting Acquisition to Develop and Deploy a Unicode Character Classifier
Description
There are thousands of Unicode characters, so it can be hard to find a particular one visually. For this reason, we set out to develop a tool that lets users handwrite a character and receive a list of the candidates most similar to that input. This tool will be integrated into a math editor that handles more than 5,000 different Unicode characters. Since we found no public datasets that fit our needs, we crowdsourced the acquisition of online handwritten data for training purposes. We developed a neural network that combines convolutional layers with shape-based features to classify online handwritten Unicode characters. To make the model more robust to input variability, we used data augmentation in the form of affine transformations. We achieved a top-20 error rate of 12.64% on validation data and received positive feedback from users, which supports crowdsourcing as a suitable method for online handwriting acquisition. Finally, we deployed the model wrapped in a JSON-based REST API and released a public demo that uses it. In this way, we present the full development cycle of a Unicode character classifier.
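Two ideas in the description, affine data augmentation and the top-20 error rate, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`random_affine`, `apply_affine`, `top_k_error`), the parameter ranges, and the representation of a character as a list of strokes of `(x, y)` points are all assumptions made for illustration. Only the linear part of the affine map (rotation, scale, shear) is sampled here; translation is left out on the assumption that inputs are re-centered before classification.

```python
import math
import random


def random_affine(rng, max_rotation_deg=15.0, max_scale=0.2, max_shear=0.2):
    """Sample a 2x2 matrix (a, b, c, d) = rotation * shear * scale.

    The ranges are illustrative defaults, not values from the paper.
    """
    theta = math.radians(rng.uniform(-max_rotation_deg, max_rotation_deg))
    sx = 1.0 + rng.uniform(-max_scale, max_scale)
    sy = 1.0 + rng.uniform(-max_scale, max_scale)
    sh = rng.uniform(-max_shear, max_shear)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # shear * scale = [[sx, sh * sy], [0, sy]]; then rotate by theta.
    a = cos_t * sx
    b = cos_t * sh * sy - sin_t * sy
    c = sin_t * sx
    d = sin_t * sh * sy + cos_t * sy
    return (a, b, c, d)


def apply_affine(strokes, matrix):
    """Apply the 2x2 matrix to every (x, y) point of every stroke."""
    a, b, c, d = matrix
    return [[(a * x + b * y, c * x + d * y) for x, y in stroke]
            for stroke in strokes]


def top_k_error(scores, labels, k=20):
    """Fraction of samples whose true label is not among the k best scores."""
    misses = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


if __name__ == "__main__":
    rng = random.Random(0)
    # A hypothetical two-stroke character (e.g. a plus sign).
    character = [[(-1.0, 0.0), (1.0, 0.0)], [(0.0, -1.0), (0.0, 1.0)]]
    augmented = apply_affine(character, random_affine(rng))
    print(augmented)
```

Under this definition, a top-20 error rate of 12.64% means that for 12.64% of validation samples the correct character was absent from the 20 highest-ranked candidates.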
Files
icfhr2018_final.pdf
Name | Size |
---|---|
icfhr2018_final.pdf (md5:d7239396b328b0ac37c52a40afd14e03) | 909.2 kB |