Monitoring Parkinson's Disease Progression Using Behavioural Inferences, Mobile Devices and Web Technologies

Parkinson's Disease (PD) affects patients' motor and non-motor functionality. Traditional assessment techniques are inaccurate because PD symptoms vary throughout the day and are evaluated in sporadic and subjective sessions. Although recent works have utilised wearable devices to try to overcome these issues, most are unsuitable for following patients regularly for a long time. In contrast, my approach aims to monitor PD continuously in a longitudinal, naturalistic, non-disruptive and non-intrusive way. It uses smartphones to log and transmit over the Internet social, environmental, and interaction data about patients and their surroundings. This data is complemented with other web data sources (i.e., geographical and weather data) and then processed to infer a set of metrics (a latent behavioural variable or LBV) of people's activities and habits. Then, the LBV's trends are measured and mapped to the progression of the disease. As a part of the pilot study to test the proposed methodology, I have collected ~290 million records from 2 patients, making this dataset 34.5x bigger and 4x richer than state-of-the-art sets. I used the collected data to identify six possible PD-related LBVs. This project aims to get a more accurate disease picture and to reduce the physical and psychological burden of traditional assessment methods. Ultimately, the work has the potential to save patients' time and improve the efficiency and effectiveness of health services.


PROBLEM
PD is a neurodegenerative disorder affecting around seven to ten million people worldwide [21]. It worsens patients' quality of life, especially in the elderly population. PD has motor symptoms such as rest tremor, bradykinesia (slow movement) or rigidity, and non-motor symptoms such as cognitive impairment, behavioural and psychiatric problems or sleep disorders, among others [12].
Traditionally, the severity of PD symptoms is quantified using clinical scales during patients' regular visits to health centres [9]. However, this approach is unsuitable for long-term, recurrent monitoring because it is subjective, expertise-dependent [26,34] and prone to recall [11,23,31] and cognitive bias [4]. Likewise, short assessment sessions provide an inaccurate picture of PD, as its symptoms vary throughout the day [24,25]. Moreover, it is infeasible to check a patient at a hospital all day, every day. Thus, sporadic assessment makes it difficult to tailor treatments to the patient's real condition [32,35].
Some of these issues are tackled using electronic devices to assess PD objectively and automatically. Nevertheless, outcomes depend on the device type, the way it is used and the chosen monitoring methodology. Although wearable devices are a popular choice, they are often attached to uncomfortable body locations, and patients need to follow scripted assessment routines. This makes it impractical to monitor PD for a long time outside the laboratory.
I believe this can be done differently. I hypothesise that PD severity can be assessed by quantifying the disease's effects on patients' daily life activities and habits. Furthermore, such human behaviour can be inferred from data gathered with a smartphone and from complementary sources. This way, I expect to reduce the physical and cognitive burden of traditional and other technology-supported techniques by taking advantage of the smartphone's ubiquity.

STATE OF THE ART
Technology-supported approaches to PD monitoring are objective, more concise and more precise than traditional methods. They can be ambient-based (sensors installed in a room), video-based (cameras recording patients) or wearable-based (people wearing the devices). The latter have received the most attention because they can monitor people outside delimited areas, register fine motor body movements and can be cheaper and easier to set up.
Most wearable-based approaches focus on motor symptoms such as tremor [23,35], bradykinesia [22,23,32,35], gait disturbances [3,17,20,35,36], voice alterations [6,34] or motor fluctuations [25]. Among these approaches, accelerometer data is widely used, sometimes complemented by gyroscope data, audio recordings or video recordings depending on the monitored feature. Although several of these projects have positive results quantifying PD symptoms, the monitoring methodology they use can be unsuitable for long-term and in-the-wild assessments. This is because participants need to wear numerous devices, to carry them in uncomfortable body locations and/or to perform evaluation tasks that interfere with their daily routines.
Recently, smartphones have been used to monitor PD. For example, location data collected over eight weeks has been used to calculate a metric ('lifespace') representing a person's movement patterns [16]. Other work has assessed hand tremor [5,13,15,19,30], speech [2,30], facial tremors [30], motor activities [1], gait disturbances [2,19,24], upper-limb bradykinesia [27] and posture, finger tapping and reaction times [2]. These projects use the inertial sensors, the touchscreen or the camera of the device. As with single-purpose devices, almost all identified smartphone-based projects monitor participants during a single day and/or while they perform short scripted tasks under (semi-)controlled conditions. The exception is [16], in which participants were followed during their regular activities; however, the authors used only one data source (GPS) and found a suggestive relationship with PD clinical scores. I believe this is an idea that can be further explored and improved.

PROPOSED APPROACH
My work will monitor PD in a longitudinal, naturalistic, non-intrusive and non-disruptive way. Additionally, I will follow a macro-scale approach, assessing trends in human activities and habits instead of measuring fine motor movements (micro-scale), which has been the focus of previous research. Furthermore, I propose to combine multiple data sources (multi-source) to simplify the quantification of complex behavioural features. Taken together, these six monitoring attributes (longitudinal, naturalistic, non-intrusive, non-disruptive, macro-scale and multi-source) are novel in the context of PD progression assessment.
I defined three research questions: a) Can complex human behaviour be inferred from smartphone-collected data? b) Can PD be monitored via such behavioural inferences? c) Can human behaviour be linked to PD severity?
My project aims to answer these questions by exploring PD progression assessment using Latent Behavioural Variables (LBVs) derived from heterogeneous data. Such data is collected using a smartphone and processed to infer behavioural metrics. I define an LBV as a set of metrics that quantify a particular human activity or habit. For example, if I consider typing on the device's keyboard as an activity, I can measure the patient's typing speed, halt frequency and the number of errors as determined by the use of the backspace key. These LBVs can be analysed over an extended period to identify trends and obscure outliers. Thus, changes in the trends can be mapped to the evolution of the disease. Due to the size and complexity of this problem, I will focus on a 'thin slice', identifying at least one LBV.
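As an illustration of the typing example, the following minimal sketch computes three metrics from a log of key events. The event format (timestamp in seconds plus a key label), the "BACKSPACE" label, the 2-second halt threshold and all names are assumptions made for this sketch; they are not taken from the logging app or from the study's actual processing pipeline.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TypingMetrics:
    keys_per_minute: float   # typing speed
    halts_per_minute: float  # pauses longer than the halt threshold
    backspace_ratio: float   # share of key events that are corrections


def typing_lbv(events: List[Tuple[float, str]],
               halt_threshold_s: float = 2.0) -> TypingMetrics:
    """Compute typing metrics from (timestamp_seconds, key) events sorted by time.

    The halt threshold is a hypothetical parameter chosen for illustration.
    """
    if len(events) < 2:
        raise ValueError("need at least two key events")
    duration_min = (events[-1][0] - events[0][0]) / 60.0
    if duration_min <= 0:
        raise ValueError("events must span a positive time interval")

    # Time gaps between consecutive key presses.
    gaps = [later[0] - earlier[0] for earlier, later in zip(events, events[1:])]
    halts = sum(1 for gap in gaps if gap > halt_threshold_s)
    backspaces = sum(1 for _, key in events if key == "BACKSPACE")

    return TypingMetrics(
        keys_per_minute=len(events) / duration_min,
        halts_per_minute=halts / duration_min,
        backspace_ratio=backspaces / len(events),
    )


# Made-up typing session, for illustration only.
session = [(0.0, "h"), (0.5, "e"), (1.0, "l"), (4.0, "l"),
           (4.5, "BACKSPACE"), (6.0, "o")]
metrics = typing_lbv(session)
print(metrics)
```

Computing the same metrics for each day or week would yield one time series per metric, which is the form needed for the trend analysis described above.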
The project has four objectives. First, to analyse the available smartphone capabilities to collect heterogeneous data. Second, to conduct a pilot study to define and refine a monitoring methodology according to the project aims. Third, to explore inference techniques for determining people's behaviour based on the collected data. Finally, to analyse the relation between the resulting inferences and PD severity.

METHODOLOGY
I propose a four-stage methodology for PD monitoring (Figure 1).
1. Data collection. The patient's social, environmental and interaction data is gathered from all the sensors and interfaces within a smartphone. This is complemented by ambient, spatial and other web data sources.
2. Data processing. Raw data is modified, filtered and ranked to reduce the complexity and dimensionality of the original dataset.
3. Data analysis. This stage has two tasks: LBV identification and 'Profile of Living' (PL) generation. In the first task, a human activity or habit is inferred from various data sources. Then, an LBV (a set of metrics) is computed based on these inferences. During the second task, an LBV's evolution is quantified using a PL.
The PL is a proposed metric created by dividing the LBV's metrics into two groups. Those obtained at the beginning of the monitoring period produce a personal baseline, while the rest are considered deviations over time. Returning to the typing example, a decrease in typing speed after a few months of being constant could be a sign of motor deterioration.
4. Evaluation. PL variations (shifts in behaviour) are mapped to changes in PD severity and are evaluated using clinical scores produced at regular intervals (every few months) by trained staff as a ground truth. These scores will come from the Movement Disorder Society-Unified Parkinson's Disease Rating Scale (MDS-UPDRS) [9], which assesses motor and non-motor symptoms, including activities of daily living. Currently, this scale is "accepted by the US and [the] European Union as a reliance on new drug approvals, studies on placebo response and trials of surgical interventions" [7] and is widely used as a gold standard in research projects outside clinical environments [18]. Thus, I will study the correlation between the magnitudes of the PL changes of each LBV and the disease's MDS-UPDRS scores and subscores. In addition, as a secondary measure, I will analyse the trends of the metrics of the different LBV(s) to see whether their changes are related to each other and to the theoretical progression of the PD symptoms they quantify according to the literature.
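The baseline-and-deviation idea behind the PL, and its comparison with clinical scores, could be sketched as follows. This is only one possible reading: expressing deviations in baseline standard deviations and using Spearman's rank correlation are assumptions of this sketch, as the text above does not fix either choice. All names and numbers are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence


def profile_of_living(values: Sequence[float], baseline_len: int) -> List[float]:
    """Deviation of each post-baseline value from the personal baseline,
    expressed in baseline standard deviations (an assumed scaling)."""
    if baseline_len < 2 or baseline_len >= len(values):
        raise ValueError("baseline needs >= 2 values and must leave follow-up values")
    baseline = values[:baseline_len]
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has no variability")
    return [(v - mu) / sigma for v in values[baseline_len:]]


def _ranks(xs: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with >= 2 values")
    rx, ry = _ranks(xs), _ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Made-up values: typing speed per period (first 4 periods form the baseline)
# and one hypothetical clinical score per follow-up period.
speed_per_period = [60.0, 62.0, 58.0, 60.0, 55.0, 50.0, 45.0]
deviations = profile_of_living(speed_per_period, baseline_len=4)
clinical_scores = [30.0, 34.0, 39.0]
rho = spearman(deviations, clinical_scores)
print(deviations, rho)
```

A rank-based correlation is used here because it does not assume a linear relation between behavioural change and clinical score. With only a few clinical assessments per patient, any such coefficient would need to be interpreted with caution.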
I expect two main contributions from this approach. The first is a methodology to investigate behavioural LBVs related to PD using time series data. The second is the identification of at least one LBV. This process includes the development of the 'Profile of Living' metric and of algorithms for behaviour inference and for multi-source time series analysis. Finally, if the LBV(s) are correlated with PD severity, this work will be a proof of concept of non-intrusive and non-disruptive PD monitoring based on passive mobile sensing.

RESULTS
As a first test of this methodology, I carried out a pilot PD monitoring study in which 29 types of data sources (the smartphone's interfaces and sensors, and web sources) were logged from two patients over 83 days using the Android app AWARE   1). This means that D1 is 34.5x bigger than D3 and, although it is 5x smaller than D2, the latter has only one source. Furthermore, when compared to other smartphone datasets (D4, D5, D6) in the context of behavioural inferencing (but not PD), D1 has 5.6x more R/P/Hr than the closest dataset. This approach to data collection will increase the potential for inferring complex PD-related behavioural habits. Next, I identified six LBVs, each composed of several metrics, after analysing the collected data, the symptoms of PD [12], the assessment tasks of the MDS-UPDRS [9] and everyday human activities or habits that might be influenced by PD symptoms according to what other works have measured using alternative approaches (e.g., [5,16]). The LBVs to consider in the future are typing patterns, phone usage patterns, episodes of going up/down stairs, the participant's indoor routine, motor activities (e.g., walking) and social patterns. Other LBVs might be identified based on the data collected in my next PD monitoring study.
The 'social patterns' LBV combines Bluetooth (BT), WiFi, location, spatial, calls and messages data. It has metrics such as surrounding BT devices (a potential indicator of human presence), communication patterns (a potential indicator of a mood change) and the routine of visits to frequented places (potentially related to people's mood and physical fitness, e.g., going to a park versus a casino), among others. I carried out a preliminary analysis of this LBV using three days of data from one participant. Figures 2a and 2b show the WiFi Access Points (APs) and the BT devices detected around the phone, respectively. In these two plots, there are two periods of ≈4.5 hrs and ≈2.5 hrs during which the participant was outside the home. In Figure 2c, spatial data was used to infer that the patient spent their time at a park, a supermarket, a community centre and a residential area. In this map, the participant's house is identified using the phone's last known position and three APs detected with a strong signal during the night. In the BT plot, a sporadic device was recorded while the person was at home, indicating social interaction with a visitor or relative. Finally, Figure 2d lists the calls made or received by the participant, three of them placed while they were at home. All these parameters can be weighted and put together to generate a social interaction score.
Following a similar process, other LBVs and their metrics can be analysed. The next step is to take into account data from a longer period and to study patients' routines, in order to then evaluate these inferences against the PD clinical scores.
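One simple way to weight and combine such parameters is a weighted average of indicators that have already been scaled to the range [0, 1]. The sketch below assumes this form; the indicator names, their scaling and the weights are hypothetical and are not values derived from the pilot data.

```python
from typing import Dict


def social_interaction_score(indicators: Dict[str, float],
                             weights: Dict[str, float]) -> float:
    """Weighted average of indicators assumed to be pre-scaled to [0, 1]."""
    missing = set(weights) - set(indicators)
    if missing:
        raise ValueError(f"missing indicators: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    for name in weights:
        if not 0.0 <= indicators[name] <= 1.0:
            raise ValueError(f"indicator {name!r} is outside [0, 1]")
    return sum(weights[n] * indicators[n] for n in weights) / total_weight


# Made-up daily indicators and weights, for illustration only.
day = {
    "bt_devices_nearby": 0.5,    # e.g. scaled count of non-resident BT devices
    "calls_and_messages": 0.25,  # e.g. scaled number of calls and messages
    "time_outside_home": 0.75,   # e.g. fraction of waking hours away from home APs
}
weights = {
    "bt_devices_nearby": 1.0,
    "calls_and_messages": 1.0,
    "time_outside_home": 2.0,
}
score = social_interaction_score(day, weights)
print(score)
```

How each raw data source is scaled, and how the weights are chosen, would need to be decided from the data and validated against the clinical scores.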

CONCLUSIONS AND FUTURE WORK
The pilot results provide good evidence that PD progression monitoring based on behavioural inferences extracted from data collected using mobile devices and web data sources is feasible. Due to the exploratory nature of this work, there are several LBVs and inferences that can be analysed, all of them promising leads that might be related to PD severity.
The pilot study will serve as a source of insights and best practices for data collection and analysis in the next monitoring study. This new study is awaiting the approval of the relevant ethics committees and will include five patients monitored for 12 months.
During and after the main study, I will analyse the collected data to identify behavioural LBVs as mentioned earlier. Because human behaviour is a complex phenomenon that can be segmented and interpreted in different ways, there are many opportunities to apply techniques that have had positive results in related contexts (e.g., text mining and fuzzy logic). This is a topic I would like to discuss during my participation in the WWW Ph.D. Symposium.
If this proof-of-concept PD monitoring methodology has a positive outcome, future work in the same line of research could have a significant impact on the quality of life of PD patients by saving them time, reducing the physical and psychological burden related to traditional and alternative assessment methods, and improving the precision of treatments and interventions. This would help to reduce clinicians' workload and improve the efficiency of health services.

Figure 1: Proposed methodology for PD monitoring.

Figure 2: Different mobile and web data sources combined to extract information on different aspects of human social behaviour during a 3-day period for the 'social patterns' Latent Behavioural Variable.
(a) WiFi access points scanned by the smartphone. It is possible to identify the periods when the patient was in and out of their home.
(b) Bluetooth devices scanned by the smartphone. It is possible to identify the periods when the patient was in and out of their home, and an isolated device which could indicate social interaction.
(c) A map with the patient's location. It is possible to identify their frequently visited places.
(d) Calls made and received by the patient. It is possible to identify the calls made from home.

Table 1: Smartphone-collected datasets for behaviour inferencing