There is a newer version of the record available.

Published July 9, 2024 | Version v1
Dataset Open

Dataset of Vocabulary in Uzbek Primary Education

  • 1. Urgench State University
  • 2. University of Primorska

Description

This dataset draws on two sources: the "Explanatory Vocabulary of the Uzbek Language" (EDUL) and the textbooks used in grades 1-4 of Uzbek primary schools (UPSC). The EDUL.txt file contains 29,190 words compiled by Urgench State University between 2019 and 2023. The UPSC dataset contains 208,204 words extracted from the primary school textbooks, split into a separate file for each grade. The dataset also lists the lemma vocabulary of each grade, summarized below, to support Uzbek language education and the development of natural language processing tools.

  • Grade 1 lemma vocabulary: 3,188 words (all new words)

  • Grade 2 lemma vocabulary: 4,630 words (including 1,997 new words)

  • Grade 3 lemma vocabulary: 5,700 words (including 1,578 new words)

  • Grade 4 lemma vocabulary: 6,397 words (including 1,356 new words)
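The "new words" counts above can be read as the lemmas of a grade that do not appear in any earlier grade. That reading is an assumption, since the record does not spell out the method. Under it, a minimal Python sketch of the computation looks like this; the function name and the toy word lists are hypothetical and are not taken from the dataset files:

```python
from typing import Dict, Iterable, Set


def new_words_by_grade(grade_vocab: Dict[int, Iterable[str]]) -> Dict[int, Set[str]]:
    """Return, for each grade in ascending order, the lemmas not seen in earlier grades.

    Assumption: a word is "new" in a grade if no lower grade contains it.
    """
    seen: Set[str] = set()
    result: Dict[int, Set[str]] = {}
    for grade in sorted(grade_vocab):
        vocab = set(grade_vocab[grade])
        result[grade] = vocab - seen  # lemmas first introduced in this grade
        seen |= vocab                 # remember everything seen so far
    return result


# Toy lemma lists for illustration only (not real dataset content).
toy_vocab = {
    1: ["kitob", "maktab", "ona"],
    2: ["kitob", "daftar", "ona", "qalam"],
    3: ["maktab", "qalam", "daraxt"],
}
toy_new = new_words_by_grade(toy_vocab)
```

With real data, each grade's lemma list would first be read from its file in the archive; the file names and line format are not given in this record, so that loading step is left out.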

All files are packaged into a single ZIP archive for distribution.

Files

Files (4.8 MB)

Checksum (MD5)                        Size
md5:1d23d3f63404c008dd44911d73a67bef  292.9 kB
md5:99d68080df1fd80d0525be6a52a0ce5e  414.4 kB
md5:8ed08e467caf6417c12992178b26cac0  125.3 kB
md5:fb0d1c2b320bfb5fca31583f7698a513  449.2 kB
md5:9428624e2b8c66b8215aeced6485df1e  389.8 kB
md5:be58c6fc174e006770f51392162815bf  705.3 kB
md5:25ca3e31e11ca7439ec6ac82fb5cf3a9  685.6 kB
md5:2540e0a26a922252ae6fe2e90cd1f306  905.9 kB
md5:da8024dccd734b68229abe5a4df890a8  868.5 kB
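A downloaded file can be checked against the MD5 values listed above. The following is a minimal Python sketch using the standard `hashlib` module. The helper names are hypothetical, and this record does not show which checksum belongs to which file, so the pairing has to be taken from the download page itself:

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file's MD5 with an expected value such as 'md5:1d23...'.

    The optional 'md5:' prefix is stripped and the comparison ignores case.
    """
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Example call (assumes EDUL.txt was downloaded to the working directory and
# that the checksum shown here is the one listed for that file):
# matches_checksum("EDUL.txt", "md5:<value from the file list>")
```

MD5 is adequate here for detecting a corrupted or incomplete download; it is not a guarantee against deliberate tampering.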

Additional details

Dates

Available
2024-07