Published January 23, 2026 | Version v1

TAPS: Throat and Acoustic Paired Speech Dataset (Korean)

  • 1. Pohang University of Science and Technology

Description

TAPS (Throat and Acoustic Paired Speech) is a paired speech corpus for deep learning–based speech enhancement, providing synchronized recordings from an accelerometer-based throat microphone and an acoustic microphone.

 

Contents:
- 60 native Korean speakers, gender-balanced (50/50)
- Total: 6,000 utterances, ~15.3 hours
- Splits: train (4,000 utt, 40 speakers), dev (1,000 utt, 10 speakers), test (1,000 utt, 10 speakers)
- No speaker overlap across splits
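The no-overlap property can be checked on an extracted copy of the data. The following is a minimal sketch, not part of the official tooling: it assumes the archives have been extracted under one root directory using the <split>/<speaker_id>/ layout this record describes, and that the split directories are literally named train, dev, and test (an assumption). The function names are hypothetical.

```python
from itertools import combinations
from pathlib import Path


def speakers_in_split(root, split):
    """Return the set of speaker directory names found under root/split."""
    split_dir = Path(root) / split
    return {p.name for p in split_dir.iterdir() if p.is_dir()}


def overlapping_speakers(root, splits=("train", "dev", "test")):
    """Map each pair of splits to the speaker IDs they share.

    The split names are an assumption about the extracted layout.
    An empty result means no speaker appears in more than one split.
    """
    speakers = {split: speakers_in_split(root, split) for split in splits}
    overlaps = {}
    for a, b in combinations(splits, 2):
        shared = speakers[a] & speakers[b]
        if shared:
            overlaps[(a, b)] = shared
    return overlaps
```

An empty dictionary from overlapping_speakers indicates that the extracted splits are speaker-disjoint.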

 

Files:
This Zenodo record contains ZIP archives for each split. The training split is provided as four ZIP files due to upload size constraints:
- TAPS_data_train_1.zip
- TAPS_data_train_2.zip
- TAPS_data_train_3.zip
- TAPS_data_train_4.zip
- TAPS_data_dev.zip
- TAPS_data_test.zip

To use the full training set, download and extract all four training ZIP files into the same target directory.
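As an illustration of that step, here is a minimal sketch using only the Python standard library. It assumes the four training archives have already been downloaded into a single local directory; the function name and the directory arguments are hypothetical.

```python
import zipfile
from pathlib import Path

# Archive names as listed in this record.
TRAIN_ARCHIVES = [f"TAPS_data_train_{i}.zip" for i in range(1, 5)]


def extract_archives(archive_dir, target_dir, names=TRAIN_ARCHIVES):
    """Extract every named ZIP from archive_dir into the same target_dir."""
    archive_dir, target_dir = Path(archive_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # Fail early if a part is missing, so a partial training set is not used by accident.
    missing = [n for n in names if not (archive_dir / n).is_file()]
    if missing:
        raise FileNotFoundError(f"missing archives: {missing}")

    for name in names:
        with zipfile.ZipFile(archive_dir / name) as zf:
            zf.extractall(target_dir)
    return target_dir
```

For example, extract_archives("downloads", "TAPS") would merge all four parts under TAPS/, where both directory names are placeholders.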

 

Inside each ZIP:
<split>/<speaker_id>/<sentence_id>/
- throat_microphone.wav (paired throat signal)
- acoustic_microphone.wav (paired acoustic signal)
- features.json (metadata)
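
Given this layout, the paired files can be enumerated by walking the directory tree. The sketch below assumes the archives were extracted so that the <split> directory sits directly under a chosen root; the function name and the idea of skipping incomplete folders are illustrative choices, not part of the dataset's tooling.

```python
from pathlib import Path

# File names as documented for each utterance folder.
THROAT = "throat_microphone.wav"
ACOUSTIC = "acoustic_microphone.wav"
META = "features.json"


def iter_utterances(root, split):
    """Yield (speaker_id, sentence_id, utterance_dir) for each complete utterance.

    Folders that lack any of the three documented files are skipped.
    """
    split_dir = Path(root) / split
    for utt_dir in sorted(split_dir.glob("*/*")):
        if not utt_dir.is_dir():
            continue
        if all((utt_dir / name).is_file() for name in (THROAT, ACOUSTIC, META)):
            yield utt_dir.parent.name, utt_dir.name, utt_dir
```

Each yielded directory contains both recordings of the same utterance, so utt_dir / THROAT and utt_dir / ACOUSTIC form one training pair.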

 

Metadata fields (features.json):
- gender, speaker_id, sentence_id, duration
- text: original transcription
- normalized_text: normalized transcription (numbers spelled out in Korean; punctuation normalized)
- throat_microphone/acoustic_microphone: sampling_rate, num_samples, etc.
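
A minimal sketch of reading these fields follows. The exact JSON structure is not spelled out in this record, so the code assumes the listed names are top-level keys and that throat_microphone and acoustic_microphone are nested objects holding sampling_rate and num_samples. It reads defensively with .get so a different structure yields None rather than an error. The function names are hypothetical.

```python
import json
from pathlib import Path

TOP_LEVEL = ("speaker_id", "sentence_id", "gender", "duration", "text", "normalized_text")
MICS = ("throat_microphone", "acoustic_microphone")
MIC_FIELDS = ("sampling_rate", "num_samples")


def load_metadata(utt_dir):
    """Load features.json from one utterance folder (UTF-8, since the text is Korean)."""
    with open(Path(utt_dir) / "features.json", encoding="utf-8") as f:
        return json.load(f)


def summarize(meta):
    """Collect the documented fields; anything absent is reported as None.

    The nesting of the per-microphone fields is an assumption.
    """
    summary = {key: meta.get(key) for key in TOP_LEVEL}
    for mic in MICS:
        info = meta.get(mic)
        if not isinstance(info, dict):
            info = {}
        summary[mic] = {field: info.get(field) for field in MIC_FIELDS}
    return summary
```

Inspecting one real features.json file first is the safest way to confirm the actual key names and nesting before relying on this summary.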

 

Use cases:
- throat-microphone speech enhancement (recovering attenuated high-frequency components)
- multimodal speech processing and related tasks

 

The project homepage and an alternative distribution (in a different file format) are listed under Related works. The accompanying paper is available on arXiv (arXiv:2502.11478).

Files

Files (2.7 GB)

Checksum                               Size
md5:7ac8b2f4cd6db1757ba9f711acc6ee7a   443.5 MB
md5:7793d1272a4a0a3957a44014db283cbb   452.1 MB
md5:f703d7122b698f358575fa11fbd4c93f   479.0 MB
md5:e44b10a75c029d1e57efeee5ae7107d4   440.5 MB
md5:a027cc1d2cbb2444126aa5bed2843b84   449.4 MB
md5:f3963f416ca5a3ee0366e3c00e485efb   453.1 MB
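
The MD5 checksums above can be used to verify a download. The following is a minimal sketch with the Python standard library. Because this listing does not show which checksum belongs to which archive, the expected value is passed in by the caller, and the function names are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against an expected checksum, with or without the 'md5:' prefix."""
    expected = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use low for archives of several hundred megabytes.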

Additional details

Related works

Is described by
Other: https://taps.postech.ac.kr/ (URL)
Is documented by
Publication: arXiv:2502.11478 (arXiv)
Is variant form of
Dataset: https://huggingface.co/datasets/yskim3271/Throat_and_Acoustic_Pairing_Speech_Dataset (URL)