Published June 14, 2024 | Version v2
Dataset Open

CUCO Database: A voice and speech corpus of patients who underwent upper airway surgery in pre- and post-operative states

Description

The data set comprises 3,800 speech audio files covering three types of upper respiratory tract surgery and one control set, with an average of 35.51 ± 5.91 audio recordings per patient. It gives the scientific community a resource for systematically investigating the objective effects of upper respiratory tract surgery on voice and speech.

This data set is a complete corpus comprising data from 107 Castilian Spanish speakers. It encompasses voice and speech recordings from both control speakers and patients who underwent upper airway surgical procedures, in pre- and post-operative stages. The surgeries in focus are tonsillectomy, functional endoscopic sinus surgery, and septoplasty, all performed by the same surgeon.

This corpus has been the basis of several previous studies evaluating changes in voice and voice quality due to surgery. Their results do not suggest significant changes in the most relevant acoustic parameters studied for the voice, which is consistent with the initial hypothesis. The analysis of the speech recordings remains open, however, with a special focus on nasalised segments, which are expected to change as a result of surgical intervention.

This data set also opens the way to studying the effect of upper airway surgery on the performance of speaker recognition and identification methods, and it can be used to test anti-spoofing methodologies and make them more robust.

If you use this database, please cite the following open-access paper, which explains the data acquisition in detail:

Hernández-García, E., Guerrero-López, A., Arias-Londoño, J. D., & Godino-Llorente, J. I. (2024). A voice and speech corpus of patients who underwent upper airway surgery in pre- and post-operative states. Scientific Data, 11(1), 746.

Files

cuco_db_v3.zip

Files (8.0 GB)

md5:66237eacabc442f417e877b84dbad92c
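
Because the archive is large, it may be worth checking the download against the MD5 checksum listed above before unpacking it. The following is a minimal sketch of one way to do that in Python, not an official tool of the database. The helper names (md5_of_file, archive_is_intact) and the assumption that the archive sits in the current directory as cuco_db_v3.zip are illustrative only.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this record for the archive.
EXPECTED_MD5 = "66237eacabc442f417e877b84dbad92c"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a multi-gigabyte archive never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Compare the file's digest with the expected one (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local download location; adjust to wherever the file was saved.
    archive = Path("cuco_db_v3.zip")
    if not archive.exists():
        print(f"{archive} not found")
    elif archive_is_intact(archive):
        print("checksum OK")
    else:
        print("checksum MISMATCH: the download may be incomplete or corrupted")
```

A mismatch usually points to an interrupted or corrupted transfer, in which case downloading the archive again is the simplest remedy.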

Additional details

Funding

Ministry of Economy, Industry and Competitiveness
DEPIA DPI2017-83405-R1
Ministry of Economy, Industry and Competitiveness
RADAR-PD PID2021-128469OB-I00
Comunidad de Madrid