Mobile Device Voice Recordings at King's College London (MDVR-KCL) from both early and advanced Parkinson's disease patients and healthy controls
- 1. Fraunhofer IAIS, Department NetMedia, Sankt Augustin, Germany
- 2. King's College London, London, United Kingdom
Description
Dataset description
The dataset description starts with the local conditions and other metadata, then continues with the recording procedure and the annotation methodology. Finally, a brief description of the dataset deployment and publication is given.
Meta Information
The dataset was recorded at King's College London (KCL) Hospital, Denmark Hill, Brixton, London SE5 9RS, between 26 and 29 September 2017. The voice recordings were made in a typical examination room with an area of about ten square meters and a typical reverberation time of approximately 500 ms. Because the voice recordings were performed in the realistic situation of making a phone call (i.e. the participant holds the phone to the preferred ear, so the microphone is in direct proximity to the mouth), one can assume that all recordings were made within the reverberation radius and can thus be considered “clean”.
Recording Procedure
We used a Motorola Moto G4 smartphone as the recording device. To perform the voice recordings on the device, we developed a “Toggle Recording App”, which uses the same functionality as the voice recording module of the i-PROGNOSIS smartphone application, but deployed as a standalone Android application. This means that the voice capturing service runs as a standalone background service on the recording device and triggers voice recordings via the on- and off-hook signals of the smartphone. Because we directly record the microphone signal, and not the GSM (“Global System for Mobile Communications”) compressed stream, we obtain high-quality recordings with a sample rate of 44.1 kHz and a bit depth of 16 bit (audio CD quality). The raw, uncompressed data is written directly to the external storage of the smartphone (SD card) in the well-known WAVE file format (.wav). We used the following workflow to perform a voice recording:
- Ask the participant to relax a bit and then to make a phone call to the test executor (off-hook signal triggered).
- Ask the participant to read out “The North Wind and the Sun”
- Depending on the constitution of the participant, additionally ask them to read out the “Tech. Engin. Computer applications in geography” snippet.
- Start a spontaneous dialog with the participant: the test executor asks random questions about places of interest, local traffic, or personal interests, if acceptable.
- The test executor ends the call with a farewell (on-hook signal triggered).
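Since the recordings are stored as plain 16-bit PCM WAVE files at 44.1 kHz, they can be read with standard tooling. The following minimal Python sketch (not part of the dataset; the file path is a placeholder) shows one way to load a recording and check its format:

```python
# Minimal loading sketch; "path/to/recording.wav" is a placeholder for any
# file from the archive. Assumes 16-bit PCM WAVE as described above.
import wave
import numpy as np

def load_recording(path):
    """Read a 16-bit PCM WAVE file and return its samples and sample rate."""
    with wave.open(path, "rb") as wav:
        assert wav.getsampwidth() == 2        # 16 bit per sample
        sample_rate = wav.getframerate()      # expected: 44100 Hz
        n_channels = wav.getnchannels()
        frames = wav.readframes(wav.getnframes())
    samples = np.frombuffer(frames, dtype=np.int16)
    if n_channels > 1:                        # interleaved -> (frames, channels)
        samples = samples.reshape(-1, n_channels)
    return samples, sample_rate

samples, sr = load_recording("path/to/recording.wav")
print(f"{samples.shape[0] / sr:.1f} s of audio at {sr} Hz")
```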
Annotation Scheme
For each HC and PD participant, we labeled the data with scores on the Hoehn & Yahr (H&Y) scale, as well as the UPDRS II part 5 and UPDRS III part 18 scales. The voice recordings are labeled according to the following scheme:
SI_HS_HYR_UPDRS II-5_UPDRS III-18
with
- SI as the subject identification in the form IDNN, with N in [0, 9]
- HS as the health status label (hc or pd accordingly)
- HYR as the expert-assessed H&Y scale rating
- UPDRS II-5 as the corresponding expert peer-reviewed score
- UPDRS III-18 as the corresponding expert-assessed score
For example, an audio recording with the file name “ID02_pd_1_2_1.wav” is a recording of the third participant (the first participant was anonymized as ID00), who has PD, a H&Y rating of 1, a UPDRS II-5 score of 2, and a UPDRS III-18 score of 1. It should be noted that all healthy controls were also evaluated on the introduced scales, because Parkinson's disease and voice degradation correlate but do not match exactly. This means that the data set includes one HC participant (ID31) with UPDRS II-5 and III-18 ratings of 1, and also includes PD patients with UPDRS II-5 and III-18 ratings of 0. It should be emphasized that this does not mean the data set contains ambiguous information, but rather that an expert was not able to hear voice degradation that would result in a UPDRS rating greater than zero. Machine learning approaches may nevertheless be able to classify such cases correctly, or at least learn that PD and voice degradation correlate but do not always coincide.
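To make the scheme concrete, the hypothetical helper below (not shipped with the dataset; the field names are our own choice) splits such a file name into its label fields:

```python
# Hypothetical parsing helper illustrating the SI_HS_HYR_UPDRS II-5_UPDRS III-18
# file-name scheme; it is not part of the dataset distribution.
from dataclasses import dataclass

@dataclass
class RecordingLabels:
    subject_id: str     # SI, e.g. "ID02"
    health_status: str  # HS, "hc" or "pd"
    hoehn_yahr: int     # expert-assessed H&Y scale rating
    updrs_ii_5: int     # UPDRS II part 5 score
    updrs_iii_18: int   # UPDRS III part 18 score

def parse_filename(name):
    """Parse a file name such as 'ID02_pd_1_2_1.wav' into its label fields."""
    stem = name.rsplit(".", 1)[0]
    si, hs, hyr, updrs_ii_5, updrs_iii_18 = stem.split("_")
    return RecordingLabels(si, hs, int(hyr), int(updrs_ii_5), int(updrs_iii_18))

print(parse_filename("ID02_pd_1_2_1.wav"))
# RecordingLabels(subject_id='ID02', health_status='pd', hoehn_yahr=1,
#                 updrs_ii_5=2, updrs_iii_18=1)
```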
Appendix
North Wind and the Sun (Orthographic Version):
“The North Wind and the Sun were disputing which was the stronger, when a traveler came along wrapped in a warm cloak. They agreed that the one who first succeeded in making the traveler take his cloak off should be considered stronger than the other. Then the North Wind blew as hard as he could, but the more he blew the more closely did the traveler fold his cloak around him; and at last the North Wind gave up the attempt. Then the Sun shone out warmly, and immediately the traveler took off his cloak. And so the North Wind was obliged to confess that the Sun was the stronger of the two.”
BNC – Tech. Engin. Computer applications in geography snippet:
“[...] This is because there is less scattering of blue light as the atmospheric path length and consequently the degree of scattering of the incoming radiation is reduced. For the same reason, the sun appears to be whiter and less orange-coloured as the observer's altitude increases; this is because a greater proportion of the sunlight comes directly to the observer's eye. Figure 5.7 is a schematic representation of the path of electromagnetic energy in the visible spectrum as it travels from the sun to the Earth and back again towards a sensor mounted on an orbiting satellite. The paths of waves representing energy prone to scattering (that is, the shorter wavelengths) as it travels from sun to Earth are shown. To the sensor it appears that all the energy has been reflected from point P on the ground whereas, in fact, it has not, because some has been scattered within the atmosphere and has never reached the ground at all. [...]”
Files

Name | Size | MD5
---|---|---
26_29_09_2017_KCL.zip | 606.1 MB | md5:98c51bdd2b092b93f8bb038dea4505fa
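After downloading the archive, its integrity can be checked against the MD5 checksum listed above. A minimal sketch, assuming the zip was saved under its original name in the working directory:

```python
# Checksum verification sketch; the local path is an assumption about where
# the archive was saved after download.
import hashlib

EXPECTED_MD5 = "98c51bdd2b092b93f8bb038dea4505fa"

def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

assert md5_of("26_29_09_2017_KCL.zip") == EXPECTED_MD5, "checksum mismatch"
```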