Dzongkha Handwritten Digit Dataset

Tawmo; Prottay Kumar Adhikary; Pankaj Dadure; Partha Pakray

doi:10.5281/zenodo.6271560

Published February 25, 2022 | Version 1.0.0

Dataset Open

1. National Institute of Technology Silchar

Dzongkha, the national language of Bhutan, is relatively understudied and has limited resources available for Natural Language Processing (NLP) tasks. In particular, no publicly available benchmark dataset previously existed for handwritten Dzongkha digit recognition. This dataset contains 1000 images of handwritten Dzongkha digits, captured using Google Jamboard and stored in JPG format. The images were collected from a total of 100 indigenous and non-indigenous people of Bhutan, irrespective of age, gender, educational background, and similar attributes. The dataset has 10 classes, one for each Dzongkha digit from 0 to 9. The class labels are: 0 (༠), 1 (༡), 2 (༢), 3 (༣), 4 (༤), 5 (༥), 6 (༦), 7 (༧), 8 (༨), 9 (༩).
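The label-to-glyph correspondence listed above can be sketched in Python as follows. This is a minimal illustration, not part of the dataset: the names `DZONGKHA_DIGITS`, `GLYPH_TO_LABEL`, and `label_of` are hypothetical, and the sketch assumes the glyphs are the Tibetan-script digits at Unicode code points U+0F20 to U+0F29, which matches the characters shown in the description.

```python
# Map each integer class label (0-9) to its Dzongkha digit glyph.
# Assumption: the glyphs are the Tibetan-script digits U+0F20..U+0F29,
# which is consistent with the characters listed in the description.
DZONGKHA_DIGITS = {label: chr(0x0F20 + label) for label in range(10)}

# Reverse lookup: glyph -> integer class label.
GLYPH_TO_LABEL = {glyph: label for label, glyph in DZONGKHA_DIGITS.items()}


def label_of(glyph: str) -> int:
    """Return the integer class label for a single Dzongkha digit glyph."""
    return GLYPH_TO_LABEL[glyph]


for label, glyph in DZONGKHA_DIGITS.items():
    print(label, glyph)
```

A mapping like this may be useful when displaying predictions from a classifier trained on the 10 classes, since the model would typically output the integer label rather than the glyph.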

Files (73.1 MB)

Name	Size
Dataset.zip md5:6f6d82413a6de6bbdefa99c33502b3df	73.1 MB
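The MD5 checksum listed for Dataset.zip can be used to confirm that a download is intact. The following is a minimal sketch, assuming the archive has been saved as `Dataset.zip` in the current working directory; the helper names `md5_of_file` and `matches_expected` are hypothetical and chosen only for illustration.

```python
import hashlib
import os

# Assumption: the archive was downloaded to the current working directory.
ARCHIVE_PATH = "Dataset.zip"
# Checksum as listed in the file table for Dataset.zip.
EXPECTED_MD5 = "6f6d82413a6de6bbdefa99c33502b3df"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()


# Only run the check if the archive is actually present.
if os.path.exists(ARCHIVE_PATH):
    print("checksum OK" if matches_expected(ARCHIVE_PATH) else "checksum mismatch")
```

Reading in chunks keeps memory use low for the 73.1 MB archive. MD5 is suitable here only as an integrity check against accidental corruption, not as a security guarantee.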

	All versions	This version
Views	603	597
Downloads	77	74
Data volume	6.4 GB	6.1 GB

Resource type: Dataset

Publisher: Zenodo

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows redistribution and reuse of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: February 25, 2022
Modified: February 25, 2022
