A STATISTICAL INDEX CALCULATED USING THE TF-IDF FOR TEXTS IN THE UZBEK LANGUAGE CORPUS

doi:10.5281/zenodo.7440059

"Science and innovation" international scientific journal

Published December 15, 2022 | Version v1

Journal article | Open access

B. Elov, Z. Xusainova, N. Xudayberganov

TF-IDF (term frequency-inverse document frequency) is one of the most common methods for processing textual data. Google's search engine has used the TF-IDF method for many years to rank content relevant to user queries. According to the results of the research conducted, the Google system paid more attention to the frequency of terms than to the calculation of keywords. The value produced by the TF-IDF method represents the relevance of a keyword within the language corpus. Applying the method generates a numeric vector for each document in the corpus. This vector is a measure used in information retrieval (IR) and machine learning (ML) to represent the importance of string units (words, phrases, lemmas, etc.) to a document. In this article, we consider the process of sorting documents in the Uzbek language corpus by keyword using the TF-IDF method.
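
The keyword-based sorting described above can be illustrated with a minimal Python sketch. The abstract does not state the exact weighting the authors use, so the sketch assumes the common textbook form: term frequency (count divided by document length) multiplied by log(N / df), where N is the number of documents and df is the number of documents containing the term. The three toy documents, the whitespace tokenizer, and the function names `tokenize`, `tf_idf_vectors`, and `rank_by_keyword` are hypothetical and are not taken from the article.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; a real Uzbek pipeline
    # would likely add lemmatization and punctuation handling.
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document (a sparse numeric vector)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def rank_by_keyword(docs, keyword):
    """Sort document indices by the TF-IDF weight of `keyword`, highest first."""
    vectors = tf_idf_vectors(docs)
    key = keyword.lower()
    scored = [(i, vec.get(key, 0.0)) for i, vec in enumerate(vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy corpus, for illustration only.
docs = [
    "kitob foydali",
    "men kitob oldim kitob qiziqarli",
    "bugun havo issiq",
]
ranking = rank_by_keyword(docs, "kitob")
for index, score in ranking:
    print(index, round(score, 4))
```

In this toy corpus the short first document outranks the second even though the second mentions the keyword twice, because term frequency is normalized by document length; the third document, which lacks the keyword, receives a weight of zero. Other TF-IDF variants (smoothed IDF, sublinear TF) would give different scores.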

Files

Name	Size	Checksum
B-363.pdf	1.0 MB	md5:e3fc047281db1002be7ee814645903ee


Resource type: Journal article
Publisher: Zenodo
Languages: Uzbek

Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: December 15, 2022
Modified: July 15, 2024
