TAPS: Throat and Acoustic Paired Speech Dataset (Korean)
Description
TAPS (Throat and Acoustic Paired Speech) is a paired speech corpus for deep learning–based speech enhancement, providing synchronized recordings from an accelerometer-based throat microphone and an acoustic microphone.
Contents:
- 60 native Korean speakers, gender-balanced (50/50)
- Total: 6,000 utterances, ~15.3 hours
- Splits: train (4,000 utt, 40 speakers), dev (1,000 utt, 10 speakers), test (1,000 utt, 10 speakers)
- No speaker overlap across splits
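The no-overlap property can be checked locally once the archives are extracted. The sketch below assumes the split directories are named `train`, `dev`, and `test` and that each contains one subdirectory per speaker; the record only gives the layout as `<split>/<speaker_id>/`, so the actual directory names may differ. Both function names are hypothetical.

```python
from pathlib import Path


def speakers_per_split(root, splits=("train", "dev", "test")):
    """Map each split name to the set of speaker directory names under root/<split>/.

    The split names are an assumption; adjust them to the extracted layout.
    """
    root = Path(root)
    result = {}
    for split in splits:
        split_dir = root / split
        if split_dir.is_dir():
            result[split] = {p.name for p in split_dir.iterdir() if p.is_dir()}
        else:
            result[split] = set()
    return result


def overlapping_speakers(speakers):
    """Return the speaker IDs that appear in more than one split."""
    first_seen = {}
    overlap = set()
    for split, ids in speakers.items():
        for speaker_id in ids:
            if speaker_id in first_seen and first_seen[speaker_id] != split:
                overlap.add(speaker_id)
            first_seen.setdefault(speaker_id, split)
    return overlap
```

An empty set returned by `overlapping_speakers(speakers_per_split("path/to/TAPS"))` would be consistent with the stated split design.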
Files:
This Zenodo record contains ZIP archives for each split. The training split is provided as four ZIP files due to upload size constraints:
- TAPS_data_train_1.zip
- TAPS_data_train_2.zip
- TAPS_data_train_3.zip
- TAPS_data_train_4.zip
- TAPS_data_dev.zip
- TAPS_data_test.zip
To use the full training set, download and extract all four training ZIP files into the same target directory.
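A minimal sketch of that extraction step using Python's standard `zipfile` module. It assumes the four archives have already been downloaded into one local directory; `extract_archives` and its parameters are hypothetical names used for illustration.

```python
import zipfile
from pathlib import Path

# Archive names as listed in this record.
TRAIN_ARCHIVES = tuple(f"TAPS_data_train_{i}.zip" for i in range(1, 5))


def extract_archives(download_dir, target_dir, names=TRAIN_ARCHIVES):
    """Extract every named archive from download_dir into the same target_dir."""
    download_dir, target_dir = Path(download_dir), Path(target_dir)

    # Fail early if any part is missing, so a partial training set is not used by accident.
    missing = [n for n in names if not (download_dir / n).is_file()]
    if missing:
        raise FileNotFoundError(f"missing archives: {missing}")

    target_dir.mkdir(parents=True, exist_ok=True)
    for name in names:
        with zipfile.ZipFile(download_dir / name) as zf:
            zf.extractall(target_dir)
    return target_dir
```

Passing `names=("TAPS_data_dev.zip", "TAPS_data_test.zip")` would extract the other two splits in the same way.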
Inside each ZIP:
<split>/<speaker_id>/<sentence_id>/
- throat_microphone.wav (paired throat signal)
- acoustic_microphone.wav (paired acoustic signal)
- features.json (metadata)
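Given this layout, the paired recordings can be enumerated with a simple directory walk. The following is a sketch only: `iter_pairs` is a hypothetical helper, and it assumes the extracted tree matches the `<split>/<speaker_id>/<sentence_id>/` structure exactly as described above.

```python
from pathlib import Path


def iter_pairs(split_dir):
    """Yield (speaker_id, sentence_id, throat_path, acoustic_path) for each utterance.

    Utterance directories that lack the acoustic recording are skipped.
    """
    for throat in sorted(Path(split_dir).glob("*/*/throat_microphone.wav")):
        utt_dir = throat.parent
        acoustic = utt_dir / "acoustic_microphone.wav"
        if acoustic.is_file():
            yield utt_dir.parent.name, utt_dir.name, throat, acoustic
```

Each yielded pair of paths can then be handed to whichever audio loader the training pipeline uses.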
Metadata fields (features.json):
- gender, speaker_id, sentence_id, duration
- text: original transcription
- normalized_text: normalized transcription (numbers spelled out in Korean; punctuation normalized)
- throat_microphone/acoustic_microphone: sampling_rate, num_samples, etc.
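The metadata can be read with the standard `json` module. The sketch below assumes that `throat_microphone` and `acoustic_microphone` are nested objects holding `sampling_rate` and `num_samples`; the exact nesting is not spelled out in this record, so treat the key paths as an assumption. The helper names are hypothetical.

```python
import json
from pathlib import Path


def load_features(utt_dir):
    """Load features.json from one utterance directory."""
    with open(Path(utt_dir) / "features.json", encoding="utf-8") as f:
        return json.load(f)


def channel_seconds(features, channel):
    """Length of one channel in seconds, computed as num_samples / sampling_rate.

    Assumes features[channel] is a dict with those two keys.
    """
    info = features[channel]
    return info["num_samples"] / info["sampling_rate"]
```

Comparing `channel_seconds(features, "throat_microphone")` with `channel_seconds(features, "acoustic_microphone")` is one quick way to confirm that a pair has matching lengths.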
Use cases:
- throat-microphone speech enhancement (recovering attenuated high-frequency components)
- multimodal speech processing and related tasks
The project homepage and an alternative distribution in a different file format are listed under Related works. The accompanying paper is available on arXiv.
Files
Total size: 2.7 GB
| MD5 | Size |
|---|---|
| 7ac8b2f4cd6db1757ba9f711acc6ee7a | 443.5 MB |
| 7793d1272a4a0a3957a44014db283cbb | 452.1 MB |
| f703d7122b698f358575fa11fbd4c93f | 479.0 MB |
| e44b10a75c029d1e57efeee5ae7107d4 | 440.5 MB |
| a027cc1d2cbb2444126aa5bed2843b84 | 449.4 MB |
| f3963f416ca5a3ee0366e3c00e485efb | 453.1 MB |
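The MD5 checksums in this listing can be used to confirm that a download completed without corruption. The sketch below uses Python's standard `hashlib`; `md5_of_file` and `verify_md5` are hypothetical helper names, and the pairing of each checksum with its file name should be taken from the record's file listing rather than assumed.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Return True if the file's MD5 matches expected (with or without an 'md5:' prefix)."""
    expected = expected.strip().lower().removeprefix("md5:")  # Python 3.9+
    return md5_of_file(path) == expected
```

A `False` result for any archive suggests the download should be repeated before extraction.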
Additional details
Related works
- Is described by
  - Other: https://taps.postech.ac.kr/ (URL)
- Is documented by
  - Publication: arXiv:2502.11478 (arXiv)
- Is variant form of
  - Dataset: https://huggingface.co/datasets/yskim3271/Throat_and_Acoustic_Pairing_Speech_Dataset (URL)