A Novel Approach for Movement Evolution Tracking in Parkinson's Disease using Data Analysis and Fuzzy Logic

In this paper, a novel approach for the analysis of the movement evolution in patients with Parkinson's disease is presented. The system detects significant degradations in the motor-skills of the patients from their physiotherapy evaluations, in which seven items are measured: posture, balance, walking, postural changes, involuntary movements, movement coordination and rigidity. To assess their evolution, two modules are employed: a data analysis module, which uses a clustering algorithm to distribute patients according to their skills and analyses their evolution based on the last three evaluations, and a Decision Support Tool based on a Fuzzy-Logic system, which measures the state of the patient according to the results of the data analysis module and generates a report per patient, including his/her state for each item as well as recommendations on suitable exercises to be practiced. Thus, the system provides meaningful information to physiotherapists, to support them in the decision-making process.


INTRODUCTION
Parkinson's Disease (PD) is a chronic, progressive and complex neurodegenerative disease with a characteristic motor syndrome caused by a loss of dopaminergic neurons in the brain [10]. It is the most common neurodegenerative disorder after Alzheimer's disease and it affects all aspects of daily living [11].
As the disease advances, motor symptoms arise due to motor dysfunction, including loss of movement (akinesia), slowness of movement (bradykinesia) and postural instability [12].
Although there are many medical therapies and surgical interventions for PD, patients develop a progressive disability over time [17]. Hence, with the aim of improving or maintaining the quality of life of the patients as well as their autonomy, physiotherapy can be recommended [7]. The role of physiotherapy is to maximize functional ability and minimize secondary complications through movement rehabilitation. Although some trials have shown that physiotherapy has short-term benefits in PD, it is not always clear which exercises should be recommended to the patients for maximum effectiveness [17].
Furthermore, the subjectivity of the professional in the movement evaluation makes it harder to analyse and track the motor skills of the patients over time. To address these deficiencies and to support physiotherapists in the decision-making process of selecting specific exercises according to the patient's abilities, both data analysis and Decision Support Systems (DSS) are required.
Decision Support Systems can support clinicians and health professionals with patient-specific assessments or recommendations as an aid to clinical decision making and they are increasingly employed in health-care organizations [6].
In this paper, a data analysis module followed by a fuzzy-logic system [16], [18] is presented with the purpose of supporting physiotherapists in the decision-making process of selecting adequate exercises according to the patient's condition, as well as organizing therapy sessions based on the physical state of their patients. Thus, the proposed system offers physiotherapists a better understanding of the motor-skills degradation in PD and optimizes the distribution of the patients according to their motor-skills.
The rest of the paper is organized as follows: the related work section reviews the state of the art of ICT and Decision Support Tools for improving the quality of life of patients with PD. In the subsequent section, the system architecture and the main functionalities are presented as well as the relevant requirements, established by one of the physiotherapists at Asociación Parkinson de Madrid. The experimental results section presents the clustering methodology for the distribution of the patients, as well as the main breakthroughs provided by the system during the testing phase. In the last section, the final remarks as well as the future lines of work are included.

RELATED WORK
Existing research in this area mainly uses sensors to measure the movement disorders of the patients with Parkinson's disease and several DSS have been developed for the diagnosis of some of the motor symptoms.
In [13], the finger tremor is specifically studied using a tri-axial gyroscope to support physicians in the accurate diagnosis of Parkinson's disease and to help them discriminate it from other movement disorders.
In [10], evolutionary algorithms are presented to induce classifiers which are able to recognize the movement characteristics of Parkinson's disease patients. The algorithms were trained and validated using movement recordings, collected and labeled in a recent clinical study.
More complex systems, such as the one presented in [4], are developed to increase the autonomy of elderly people who suffer from cognitive and motor problems such as Alzheimer's or Parkinson's. In that system, a multi-sensor fusion scheme is capable of monitoring and storing data from multiple patients at the same time with the goal of improving their quality of life, as well as providing support to professionals in the subsequent decision-making process.
However, the above-mentioned works are mainly focused on specific motor symptoms such as tremor, or have other purposes, including early diagnosis or classification of PD. They are not focused on maximizing the benefits of the physiotherapy sessions in patients with PD through the use of Data Science.
Employing both Parkinson scales, such as the Hoehn & Yahr scale [5], and the patient's scores obtained in the physiotherapy evaluation, a data analysis module and a fuzzy-logic system are proposed with two main goals: to find patterns among patients based on their movement evaluation scores, thereby making it possible to form different physiotherapy groups, and to provide meaningful information regarding the state of each patient for each motor-skill through both reports and recommendations. Thus, the proposed system will allow physiotherapists to continuously monitor the evolution of their patients and to provide better therapies for those who tend to suffer from significant deteriorations in motor performance.
In figure 1, the system architecture is presented. First, the data is extracted from the Parkinson's database of the Asociación Parkinson de Madrid (APM). The database contains personal information about patients, such as their age, gender or level of studies, as well as the score on the Hoehn & Yahr scale and both the individual and the total scores on several physiotherapy evaluations. However, for the experiment presented in this paper, the physiotherapist decided to analyse the data of the last three available physiotherapy evaluations, including the total score as well as the individual score on each item (posture, balance, walking, postural changes, involuntary movements, movement coordination and rigidity), because the last three evaluations are the most relevant for detecting improvements or degradations in the motor-skills.
The data analysis module applies a clustering algorithm to find patterns among patients and analyses the movement evolution of each patient over the last three evaluations. Finally, the module stores the results in a summarization file in JSON format, which is provided to the Decision Support Tool (DST) module in order to feed the fuzzy-logic system.
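To make the hand-off between the two modules concrete, the following is a minimal sketch of what a per-patient summarization file could look like and how the DST might read it. The paper does not specify the JSON schema, so every field name and value here (patient_id, cluster, hoehn_yahr_stage, evolution_pct) is a hypothetical placeholder chosen for illustration only.

```python
import json

# Hypothetical summary for one patient. All field names and values are
# illustrative assumptions; the actual schema is not given in the paper.
summary = {
    "patient_id": "P001",
    "cluster": 2,
    "hoehn_yahr_stage": 2,
    # Improvement (+) or degradation (-) percentage per evaluated item.
    "evolution_pct": {
        "posture": 0.0,
        "balance": 5.0,
        "walking": -12.5,
        "postural_changes": 0.0,
        "involuntary_movements": 2.5,
        "movement_coordination": 0.0,
        "rigidity": 0.0,
    },
}

def degraded_items(summary_text):
    """Return the items whose percentage is negative (i.e. degraded)."""
    data = json.loads(summary_text)
    return [item for item, pct in data["evolution_pct"].items() if pct < 0]

summary_text = json.dumps(summary, indent=2)  # what the module would store
print(degraded_items(summary_text))
```

In this sketch, only the items returned by degraded_items would be passed on to the fuzzy-logic system.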

Figure 1: System architecture, with the components Physiotherapist, Parkinson's Patients, Decision Support Tool, Parkinson Database and Data Analysis.
Subsequently, the DST processes the information from the data analysis module employing a fuzzy-logic model built according to the physiotherapist's criteria, in order to simulate human reasoning and to avoid making hard or binary decisions. The DST generates the final output of the system, which contains a report for each patient with the following parts: (1) The personal information of the patient and the assignment to the best matching cluster. (2) The improvement or degradation percentages for the overall movement evolution and for each considered motor-skill, according to the last three evaluations. (3) The worry level, $W_l$, which is a numerical indicator within the interval $[0, 10]$ that measures the level of alarm for a given patient according to each item of the evaluation. (4) A report with recommendations concerning the exercises that should be practiced by the patient, according to the results of the data analysis module and the fuzzy-logic system.
All of these items are sent to physiotherapists and will allow them to optimize the therapies for the benefit of their patients. More details about the functionalities of the system will be provided in the following sections.

Data Analysis Module
To group the patients according to the scores obtained for the different items of the physiotherapy evaluation, the data analysis module employs unsupervised learning methods, namely clustering techniques. In our study we employed the K-means algorithm [9], which takes the number of clusters K as a parameter. The optimum number of clusters was obtained using different evaluation criteria, such as the Silhouette method [14], which studies the separation distance between the resulting clusters. Once the best number of clusters has been found, the clusters need to be validated by a professional in order to establish their semantic interpretation.
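The selection of K by the Silhouette method can be sketched as follows. This is a minimal, dependency-free illustration and not the implementation used in the study: the farthest-first initialisation, the function names and the two-dimensional toy points are assumptions made here for determinism and brevity (real inputs would be the seven item scores per patient).

```python
import math

def dist(p, q):
    """Euclidean distance between two score vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def init_centroids(points, k):
    """Deterministic farthest-first initialisation (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations: assign to nearest centroid, then recompute means."""
    centroids = init_centroids(points, k)
    labels = []
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

def silhouette(points, labels):
    """Mean silhouette coefficient over all points."""
    scores = []
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)  # singleton cluster or a single cluster overall
            continue
        a = sum(own) / len(own)                                 # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())   # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def best_k(points, k_values):
    """Return the K with the highest mean silhouette."""
    return max(k_values, key=lambda k: silhouette(points, kmeans(points, k)[0]))

# Toy data: three well-separated groups of four points each.
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 10), (11, 10), (10, 11), (11, 11),
          (20, 0), (21, 0), (20, 1), (21, 1)]
print(best_k(points, [2, 3, 4, 5]))
```

A new patient could then be assigned to the cluster whose centroid is nearest to his/her score vector, using the same dist function.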
Furthermore, the second goal of the data analysis module consists of analysing whether a patient improved or degraded in any of his/her motor-skills between two successive evaluations. This statistical analysis evaluates the dynamics of each movement skill (posture, balance, walking, postural changes, involuntary movements, movement coordination and rigidity) over time and computes the improvement/degradation percentage according to formula (1), where $i$ and $i-1$ are the indexes of two successive evaluations, $\mathrm{interval}_{skill}$ is the range of each movement skill and $t \in \{1, \ldots, 7\}$ is the index of the skill.
It is important to determine the type of change between successive evaluations, that is, whether the skills are improving or degrading, in order to assess the benefits of the physiotherapy and to correlate them with the prescribed medication. The degree of change is also important for analyzing the evolution of Parkinson's disease. For example, when the patient shows an improvement or a degradation in any of the analyzed skills, this information helps the physiotherapist to adjust the exercises accordingly. Furthermore, a large degradation should be analyzed carefully to discover the reasons behind it.
The period of time between two evaluations is also relevant, together with the moment in time when they are performed. For example, the improvement/degradation percentage could be weighted using a parameter based on the number of months between two evaluations, where the usual period between them is 6 months. Finally, the output of the data analysis module is fed to the Decision Support Tool for further reasoning and for formulating recommendations for each patient.
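The exact expression of the improvement/degradation percentage is not reproduced in this text, so the following sketch shows only one plausible form that is consistent with the quantities named above and with the conventions used elsewhere in the paper (a higher score means a worse performance, and a negative percentage means a degradation). The function names and the 6/months weighting are assumptions made for illustration.

```python
def evolution_percentage(prev_score, curr_score, skill_range):
    """Percentage change of one skill between evaluations i-1 and i.

    Assumed form: 100 * (score[i-1] - score[i]) / interval_skill.
    Positive values mean improvement, negative values mean degradation,
    because a higher score corresponds to a worse performance.
    """
    if skill_range <= 0:
        raise ValueError("skill_range must be positive")
    return 100.0 * (prev_score - curr_score) / skill_range

def weighted_evolution(prev_score, curr_score, skill_range, months, usual_months=6):
    """Hypothetical weighting: rescale the change to the usual 6-month period."""
    if months <= 0:
        raise ValueError("months must be positive")
    pct = evolution_percentage(prev_score, curr_score, skill_range)
    return pct * usual_months / months

# Walking score worsens from 10 to 15 on an item whose range is 20 points.
print(evolution_percentage(10, 15, 20))           # degradation of 25 %
print(weighted_evolution(10, 15, 20, months=12))  # same change over 12 months
```

With this weighting, the same score change counts for less when the evaluations are further apart than usual, and for more when they are closer together.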

Decision Support Tool
The Decision Support Tool (DST) consists of a fuzzy logic system which is launched if a significant degradation has been detected by the Data Analysis module. The fuzzy-logic system models the motor-skills of the patients based on the scores of the physiotherapy evaluation and the Hoehn & Yahr score, and generates as output an indicator of the state of a patient for each motor-skill.
The physiotherapy evaluation covers the most relevant motor skills considered in Parkinson's disease: posture, balance, walking, postural changes, involuntary movements, movement coordination and rigidity. The higher the score a patient gets, the worse his/her performance in that skill. In the following section, the modelling of the above items through the fuzzy-logic system is explained further.

Fuzzy Logic Implementation
A fuzzy logic system was first defined in [18] and consists of three modules: a Fuzzifier module, which converts a crisp set of input data into a fuzzy input set through the use of membership functions; an Inference module, which includes a set of rules to model the system; and finally a Defuzzifier module, which converts the fuzzy output set into crisp data, where the term crisp data refers to the classical data that can be processed by all machines [2].
In our experiment, four different membership functions were employed in the Fuzzifier module: a Triangular-Shaped function, presented in (2), a fuzzy Z-Shaped function depicted in (3), a fuzzy S-Shaped function defined by (4), and finally a Gaussian distribution.
The parameters a, b and c of these membership functions have been chosen according to the physiotherapist's criteria and are different for each motor-skill. Each of the seven items, as well as the Hoehn & Yahr scale, is modelled with triangular membership functions over five intervals. In figure 2, the triangular membership functions for the walking item are presented, along with the five intervals used to characterize the state of the patient: mild, mild-moderate, moderate, moderate-severe and severe.
Furthermore, the output of the fuzzy logic system, which is the above-mentioned worry level, $W_l$, has three different levels: low, medium and high, where the high level indicates a bad state of the patient and the low level the opposite. Moreover, $W_l$ is built as a composition of three functions in order to generate soft intervals. The shape of this membership function is depicted in figure 3, where the low interval is modelled by a Z-Shaped membership function, the medium interval by a Gaussian $G(\mu = 4, \sigma = 1)$ and the high interval by an S-Shaped membership function.
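The four membership function families can be sketched as follows. Since equations (2) to (4) are not reproduced in this text, the code uses the common textbook forms (a piecewise-linear triangle and the quadratic-spline Z and S shapes), which may differ in detail from the authors' definitions. The breakpoints used for the low and high worry-level sets are hypothetical; only the Gaussian parameters (mean 4, standard deviation 1) come from the paper.

```python
import math

def triangular(x, a, b, c):
    """Triangle rising from a to the peak at b and falling to c (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def z_shaped(x, a, b):
    """Smooth step from 1 (x <= a) down to 0 (x >= b), spline form (a < b)."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    mid = (a + b) / 2
    if x <= mid:
        return 1 - 2 * ((x - a) / (b - a)) ** 2
    return 2 * ((x - b) / (b - a)) ** 2

def s_shaped(x, a, b):
    """Mirror image of the Z shape: 0 for x <= a, 1 for x >= b."""
    return 1 - z_shaped(x, a, b)

def gaussian(x, mu, sigma):
    """Gaussian membership with its peak value 1 at x = mu."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def worry_memberships(w):
    """Membership of a worry level w in [0, 10] to the three output sets.

    The Z and S breakpoints (1, 3) and (5, 7) are hypothetical.
    """
    return {
        "low": z_shaped(w, 1, 3),
        "medium": gaussian(w, 4, 1),
        "high": s_shaped(w, 5, 7),
    }

print(worry_memberships(4.95))
```

Evaluating worry_memberships at a given worry level returns the degree to which that value belongs to the low, medium and high sets, which is what produces the soft intervals described above.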
To feed the fuzzy-logic system, the data analysis summarization file is required. As mentioned in section 3.2, the data analysis obtains a percentage for each item of the movement evaluation. These percentages are used as thresholds to trigger the fuzzy logic: if a percentage is negative, it indicates a significant degradation in that item and, therefore, the DST should analyse the situation. In this step the fuzzification takes place and, instead of working with numerical data, the system transforms them into a fuzzy input set according to the intervals of the membership functions. At this point, the data is no longer represented by a number (e.g. 0, 10 or 25) but by a state (e.g. mild, mild-moderate, moderate, moderate-severe or severe). For instance, according to figure 2, if the original input for the walking item is 20, the fuzzifier transforms it into the moderate-severe state. Therefore, the numerical properties of the data have been transformed into an interpretation that is useful for a human-based analysis.
Subsequently, an inference step builds up the model based on several if-else rules which relate the different items of the evaluation to the stage of the disease according to the Hoehn & Yahr scale. The rules have been validated by the physiotherapist of the APM team with the aim of integrating human reasoning into an algorithm. Thanks to that, the system provides such knowledge to professionals so that they can make the proper decisions to improve the condition of the affected patients. For instance, if the Hoehn & Yahr score for a given patient corresponds to the first stage of PD (he/she has been diagnosed recently) and the system detects a significant degradation in the walking item (since the percentage provided by the data analysis module for walking is negative), then the worry level should be higher than if that patient were at an advanced stage of PD, due to the inherent degeneration of this disease. Thus, it is more worrisome to suffer from degradations when the stage of the disease is not very advanced, and the model takes that fact into consideration in order to obtain the alarm level.
Finally, in the defuzzification step, for each motor-skill, the fuzzy-logic system returns the mentioned worry level indicator, which is a numerical value that can be analyzed by professionals. Moreover, the DST generates a report with all the information mentioned in section 3.1, including the personal information, the associated cluster, the movement evolution analysis, and all the worry level indicators with their associated recommendations. Therefore, the output of the whole system combines, on the one hand, the information provided by the data analysis module (the clustering outcomes and the evolution analysis) and, on the other hand, the processed data provided by the DST (worry level indicators and recommendations).
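The chain of fuzzification, rule inference and defuzzification can be illustrated with a deliberately small Mamdani-style sketch for a single item. It is not the validated APM rule base: the two rules, the membership breakpoints for the disease stage and for the high worry set, and the centroid defuzzifier are all assumptions chosen to reproduce the behaviour described above (no alarm without degradation, and a higher worry level when a degradation occurs at an early stage).

```python
import math

def s_shaped(x, a, b):
    """Smooth step from 0 (x <= a) to 1 (x >= b), quadratic-spline form."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    mid = (a + b) / 2
    if x <= mid:
        return 2 * ((x - a) / (b - a)) ** 2
    return 1 - 2 * ((x - b) / (b - a)) ** 2

def gaussian(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def worry_level(change_pct, hy_stage):
    """Worry level in [0, 10] for one item (hypothetical two-rule model).

    change_pct: improvement (+) or degradation (-) percentage of the item.
    hy_stage:   Hoehn & Yahr stage, assumed here to lie in [1, 5].
    """
    if change_pct >= 0:
        return 0.0  # expected evolution: no alarm and no recommendation

    # Fuzzification of the disease stage (hypothetical breakpoints).
    advanced = s_shaped(hy_stage, 1, 5)
    early = 1 - advanced

    # Rule 1: IF degradation AND stage is early    THEN worry is high.
    # Rule 2: IF degradation AND stage is advanced THEN worry is medium.
    grid = [i / 10 for i in range(101)]  # candidate worry levels 0.0 .. 10.0
    aggregated = [
        max(min(early, s_shaped(w, 5, 9)),       # clipped "high" set
            min(advanced, gaussian(w, 4, 1)))    # clipped "medium" set
        for w in grid
    ]

    # Centroid defuzzification of the aggregated output set.
    total = sum(aggregated)
    if total == 0:
        return 0.0
    return sum(w * m for w, m in zip(grid, aggregated)) / total

print(worry_level(-12.5, hy_stage=1))  # early stage: higher worry
print(worry_level(-12.5, hy_stage=5))  # advanced stage: lower worry
```

A fuller model would also take the fuzzified state of the item itself (mild to severe) and the magnitude of the degradation as rule inputs; they are omitted here to keep the sketch short.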

EXPERIMENTAL RESULTS
In this section, two different goals are evaluated. The first one is to find patterns among patients according to their motor performance and to create different clusters that support physiotherapists in the decision-making process of selecting groups of patients, hence improving the quality of the therapies. The second goal consists of monitoring the movement evolution of patients with PD through the analysis of the last physiotherapy evaluations and generating both indicators and recommendations as a means of preventing further motor deterioration in a short period of time.
The functionality of the whole system has been tested employing the database provided by APM and described in section 3.1. After applying a cleaning process to remove missing data, the final database contained 695 patients. In the first data analysis, as mentioned in section 3.2, a K-Means algorithm was employed for clustering the population and four groups were obtained. In figure 5, the tendency of each cluster regarding the movement evolution is depicted. In that figure, one can observe that the population from cluster 1 tends to improve according to their results in the evaluations, whereas the population from both clusters 2 and 3 tends to suffer from significant degradations. On the other hand, cluster 4 is made up of a more distributed population, although half of that population tends to either remain constant or even improve their motor-skills over time.
To support these hypotheses, confidence intervals were computed to observe significant differences among the total scores of the last three evaluations for the four clusters. Thus, 95% Confidence Intervals (CI) for the mean of the total score of each evaluation were built and are shown in table 1. One can observe in table 1 that the population from cluster 4 remains constant across the three evaluations, as expected, whereas the patients who belong to clusters 2 and 3 are the ones who tend to suffer from a higher motor deterioration, obtaining worse results over time (the higher the score, the bigger the degradation). Patients in cluster 3 show a higher degradation than patients in cluster 2 with respect to the last evaluation. Moreover, the patients from cluster 1 seem to have a degradation between the first two evaluations, followed by an improvement in the last one. These results correlate with the Hoehn & Yahr scale, in that most of the patients in cluster 4 are at a less advanced stage of the disease than patients in cluster 2 or 3.
Thanks to that, the physiotherapist can design specific sessions for each group, in order to maximize the benefits for the patients. For instance, more challenging exercises could be done in the therapies with patients from cluster 1 or 4, whereas mild exercises could be employed in the sessions for the population from clusters 2 and 3. Hence, the system achieves its first purpose: to support physiotherapists in the decision-making process of designing specific therapy groups, according to the motor-skills of the patients. Furthermore, if a new patient needs therapy, after evaluating his/her motor-skills, it would be possible to assign him/her to one of the formed clusters.
Figure 4: Worry level indicator for a particular patient according to the analysis of the system for the walking item. In this case, the indicator is around 5, so the system will recommend to the physiotherapist that this patient do more exercises related to that item.
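As an illustration of how 95% confidence intervals for the mean total score, such as those reported in table 1, can be obtained, the following sketch uses the normal approximation (mean plus or minus 1.96 standard errors). The paper does not state which interval construction was used, so this choice, the function names and the toy scores are assumptions.

```python
import math
import statistics

def mean_ci95(scores):
    """95% confidence interval for the mean, using the normal approximation."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are required")
    mean = statistics.fmean(scores)
    sem = statistics.stdev(scores) / math.sqrt(n)  # standard error of the mean
    half_width = 1.96 * sem
    return mean - half_width, mean + half_width

def intervals_overlap(ci_a, ci_b):
    """True if two (low, high) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Toy total scores of one cluster at two evaluations (illustrative values only).
first = [10, 12, 14, 16, 18]
last = [20, 22, 24, 26, 28]
print(mean_ci95(first), mean_ci95(last))
print(intervals_overlap(mean_ci95(first), mean_ci95(last)))
```

Non-overlapping intervals between two evaluations of the same cluster suggest a meaningful change in the mean total score, while overlapping intervals are compatible with a constant tendency.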
After running the clustering algorithm, the data analysis was launched to detect significant deteriorations in the motor-skills of all the available patients. A summarization file is stored for each of them with the information of the performed analysis. Subsequently, the DST takes and processes the data from that file and feeds the fuzzy-logic system in order to obtain the indicators of the state of the patient for each item according to $W_l$. If a patient has the expected evolution for one item according to his/her stage of the disease, $W_l$ will be zero for that item and, therefore, no recommendation will be sent for that motor-skill. Otherwise, the fuzzy-logic model will obtain $W_l > 0$, as shown in figure 4, where the worry level, $W_l$, associated with the walking skill of one patient is depicted. In that case, since $W_l = 4.95$, the system will also send a recommendation to the physiotherapist in order to inform him/her about this issue and to remind him/her to assign proper exercises to that patient. Thanks to the proposed approach, professionals can keep accurate control over their patients, which allows them to make better and timely decisions to prevent considerable motor deterioration in PD patients.
Figure 5: Movement evolution tendency regarding the clusters, where Constant (red bar) means that the difference among the total scores of the physiotherapy evaluations is not meaningful, whereas Improvement (blue bar) and Deterioration (green bar) indicate an improvement or a degradation, respectively, over the evaluations.

CONCLUSION AND FUTURE WORK
In this paper, a novel approach for supporting physiotherapists in the analysis and control of their patients according to their motor-skills was presented. By the use of a data analysis module, we were able to form well-distinguished clusters of patients according to the scores for different motor-skills. Thanks to that, physiotherapists can employ such clusters to separate the patients into different sessions and, therefore, to improve the quality of the therapy. Besides, if new patients are available, they can be assigned to one of the clusters according to the criteria of the system. Moreover, using a Fuzzy-Logic System, it is possible to integrate the physiotherapist's perspective into a decision support tool in order to achieve a reasoning similar to the one provided by a professional. In this way, professionals are supported in making proper and timely decisions, which will allow them to personalize the therapy for each patient according to his/her motor performance and evolution.
In this approach, some personal information of the patients, such as the level of studies or the profession, was not employed in the analysis because of the large amount of missing data. In the future, including personal information in the clustering procedure may provide the system with relevant information and, thereby, improve the quality of the clusters. As a result, the groups could be better adapted to patients with similar characteristics and the deterioration caused by the disease could be, at least, slower. Besides, with both more patients and more evaluations, more powerful algorithms and other unsupervised machine learning techniques, such as Fuzzy C-Means [3] or Connectivity based clustering [8], could be employed to improve the reliability of the system. Furthermore, using sensors and cameras, as proposed in the ICT system from [1], for the automatic analysis of motor-skills performance, an increased accuracy in the physiotherapy evaluations could be obtained by adding more objectivity to the selection of the scores for each item of these evaluations.