Published December 27, 2022 | Version 0.0.1
Dataset Open

Persian Speech to Text dataset

  • 1. Master's student at SRU | ML engineer at System Groupe

Description

The Persian Speech to Text dataset is a collection of audio files and their corresponding transcripts, intended for training machine learning models that transcribe Persian speech into text. It contains 60 GB of data: audio files in WAV format, packaged in ZIP or RAR archives, and transcripts in CSV files. Each CSV file corresponds to a single ZIP or RAR archive and has the same name as that archive. The CSV files contain the following columns:

  • wav_filename: The name of the WAV file within the ZIP or RAR file.
  • wav_filesize: The size of the audio file.
  • transcript: The text transcription of the audio file.
  • confidence_level: A measure of the accuracy of the transcription.
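
As a rough illustration of how the columns above might be consumed, the following Python sketch reads rows from one of the CSV files and keeps only those at or above a confidence threshold. It assumes the files are comma-delimited with a header row, that confidence_level parses as a number where larger means more reliable, and that wav_filesize is an integer; none of this is stated in the description. The function name read_rows and the sample rows are hypothetical and not taken from the dataset.

```python
import csv
import io
from typing import Iterable, Iterator


def read_rows(lines: Iterable[str], min_confidence: float = 0.0) -> Iterator[dict]:
    """Yield one dict per CSV row whose confidence_level is >= min_confidence."""
    reader = csv.DictReader(lines)
    for row in reader:
        # Assumption: confidence_level is numeric and larger means more reliable.
        confidence = float(row["confidence_level"])
        if confidence >= min_confidence:
            yield {
                "wav_filename": row["wav_filename"],
                # Assumption: wav_filesize is an integer (units are not documented).
                "wav_filesize": int(row["wav_filesize"]),
                "transcript": row["transcript"],
                "confidence_level": confidence,
            }


# Made-up sample rows for illustration only; not taken from the dataset.
sample = io.StringIO(
    "wav_filename,wav_filesize,transcript,confidence_level\n"
    "clip_0001.wav,48044,سلام,0.95\n"
    "clip_0002.wav,51200,خداحافظ,0.40\n"
)
kept = list(read_rows(sample, min_confidence=0.9))
print([row["wav_filename"] for row in kept])
```

For a real file, the same function can be passed an open file handle, for example open(path, newline="", encoding="utf-8"); the UTF-8 encoding is also an assumption and may need adjusting.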

This dataset is the largest open-source dataset of its kind and a valuable resource for researchers and developers working on natural language processing tasks involving the Persian language. Because it is open source, anyone can freely use and modify it, which makes it an important resource for advancing research and development in the field.

Files

Tehran_Aghrabe_79.csv

Files (152.1 MB)

MD5 checksum | Size
md5:b656d64a8c8714eb175c7571b7cfc03a | 184.8 kB
md5:9b7d216a2f0df1f44d9c35dc3b885699 | 151.9 MB
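
The md5 values listed above can be used to check that a download completed without corruption. The sketch below shows one way to do that in Python with the standard hashlib module. The helper names md5_of_file and matches_listed_md5 are hypothetical, and the demonstration uses a temporary file rather than an actual dataset file.

```python
import hashlib
import os
import tempfile


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path: str, listed: str) -> bool:
    """Compare a file against a listed value such as 'md5:b656d6...'."""
    # Accept the value with or without the 'md5:' prefix used in the file listing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Demonstration on a temporary file; replace with the path of a downloaded file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example bytes")
    demo_path = tmp.name

demo_listed = "md5:" + hashlib.md5(b"example bytes").hexdigest()
demo_ok = matches_listed_md5(demo_path, demo_listed)
demo_bad = matches_listed_md5(demo_path, "md5:" + "0" * 32)
os.remove(demo_path)
print(demo_ok, demo_bad)
```

Reading in chunks keeps memory use flat for large files such as the 151.9 MB one listed above.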