Predicting Heart Failure Patient Events by Exploiting Saliva and Breath Biomarkers Information

The aim of this work is to present a machine-learning-based method for the prediction of adverse events (mortality and relapses) in patients with heart failure (HF) by exploiting, for the first time, measurements of breath and saliva biomarkers (Tumor Necrosis Factor Alpha, Cortisol and Acetone). Data from 27 patients are used in the study, and adverse events are predicted with an accuracy of 77% using the Rotation Forest algorithm. Since biomarkers are expected to become measurable at home in the near future, together with other physiological data, the accurate prediction of adverse events on the basis of home-based measurements can revolutionize HF management.


I. INTRODUCTION
Heart failure (HF) is a chronic, life-threatening condition characterized by high rates of mortality and rehospitalization. The European Society of Cardiology reports that 26 million people worldwide suffer from HF and that 74% of them present at least one comorbidity [1]. HF accounts for 1-3% of all hospital admissions, and re-admissions are frequent: almost 24% of hospitalized patients are re-hospitalized within 30 days of discharge and 46% within 60 days.
Across the world, 2-17% of patients admitted to hospital with HF die while in hospital and 17-25% die within one year of admission [2]. The cost of HF management is driven by hospitalizations, corresponding to 1-2% of total healthcare expenditure.
The ability to accurately predict these adverse events enables effective risk stratification of patients and supports clinical decision making. This prognostic information can guide clinical experts in adapting patient management and in selecting the most appropriate treatment plan. In turn, this is expected to improve the quality of care provided to patients and to result in better health outcomes. To this end, different factors have been studied for their ability to predict HF morbidity and mortality, destabilizations and re-hospitalizations. In addition, several studies have examined multiple factors simultaneously using statistical methods (e.g. multi-variable Cox regression models). Such studies resulted in scores that are acknowledged in clinical practice: (i) for the estimation of mortality risk, the Heart Failure Survival Score [3], the Get With the Guidelines score [4], the Seattle Heart Failure Model [5] and EFFECT [6], (ii) for rehospitalizations [7], and (iii) for morbidity [8].
Moreover, progress in analytical chemistry and biosensor development allows some biomarkers to be detected in saliva and breath [43][44][45][46]. Uric acid, Tumor Necrosis Factor Alpha (TNF-a), a-amylase, lactate, cortisol and 8-iso-prostaglandin F2a are among the most important saliva biomarkers, while acetone (2-propanone) and isoprene (2-methyl-1,3-butadiene) are indicative examples of breath biomarkers that play a key role in patient diagnosis and prognosis.
The goal of this study is to introduce such biomarkers into the adverse event prediction process. Obtaining saliva and breath biomarkers is non-invasive and, in a future setting, can be performed at home [47], making it a significant tool for HF patient management. In our study, we employ these breath and saliva biomarkers in a machine learning approach that combines heterogeneous patient data (i.e. sociodemographic, clinical, sensor data and biomarkers) for the prediction of adverse events.

A. Dataset
The proposed method is evaluated using a dataset of 27 patients collected by the clinical center of the Università di Pisa (UNIPI), Italy, within the framework of the HEARTEN project [47]. The criteria for patient selection are reported in Table I. The features recorded for each patient can be grouped into the following categories (Table II).

Adherence
Experts' estimation of patient adherence in terms of medication, activity and nutrition, and the prediction of the patient's medication adherence risk, which is extracted by the Adherence risk module of the HEARTEN project.

Score
Five scores are computed: the European Heart Failure Self-care Behavior Scale, a 12-item scale for evaluating HF self-care [48], [49]; the Heart Failure Knowledge score, which relates to HF knowledge in general, knowledge of HF treatment, and symptom recognition and occurrence [50]; Get With the Guidelines, for estimating in-hospital mortality [51]; the Seattle Heart Failure Model, for predicting the 1-, 2- and 3-year survival of HF patients [5]; and Minnesota Living with Heart Failure, for providing feedback on the physical and emotional status of the HF patient [52].

Sensor data
Time and frequency domain Heart Rate Variability features extracted from the electrocardiogram (ECG), as well as respiration rate, weight and activity related data.
Based on clinical studies on biomarker behavior and influence, performed during the HEARTEN project, the following biomarkers are selected as the most prominent marker compounds for monitoring HF conditions: (i) acetone in breath, and (ii) cortisol and TNF-a in saliva. More specifically, acetone mirrors metabolism as well as metabolic stress, and acetone concentrations are elevated in HF patients compared to healthy subjects. A very significant increase in salivary cortisol levels is observed in particular cases where a patient's condition suddenly worsened during hospitalization. In the HEARTEN studies, cortisol decreased by a factor of about 2 after therapy adjustment. As a consequence, cortisol was considered a good candidate for monitoring HF patients. Additionally, chronic HF
In total, 263 features are recorded for each patient (3 features corresponding to biomarkers, 151 features extracted from sensor data and 109 features corresponding to the other categories).
These features are recorded every second day, from the start of the patient's hospitalization until discharge. Thus, 141 instances are collected from all patients. The number of hospitalization days is not the same for all patients; on average, each stay lasts approximately 4 days. The dataset given as input to the proposed method is created under the assumption that the discharge instance of each patient is event free ("no event" class), while the instance at which the event presented and/or the first hospitalization took place is an event ("event" class). This results in a set of 54 instances, 25 corresponding to an event and 29 to no event.
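The labelling assumption above can be illustrated with a minimal sketch. It takes the simplest reading, in which the first recording of each patient is labelled "event" and the discharge recording "no event". The function name `build_instances`, the data layout and the toy values are hypothetical and not taken from the study. Note that this simplified rule would produce a balanced 27/27 split, whereas the paper reports 25/29, so the actual assignment evidently differed for some instances.

```python
from collections import Counter

def build_instances(recordings):
    """Turn per-patient recordings into labelled instances.

    recordings: dict mapping a patient id to a list of feature dicts in
    chronological order (first hospital recording ... discharge).
    Returns a list of (patient_id, features, label) tuples.
    """
    instances = []
    for patient_id, visits in recordings.items():
        if len(visits) < 2:
            continue  # need distinct first and discharge recordings
        # Simplifying assumption: first recording -> "event"
        instances.append((patient_id, visits[0], "event"))
        # Discharge recording -> "no event" (as stated in the text)
        instances.append((patient_id, visits[-1], "no event"))
    return instances

# Toy data with made-up values (not from the study)
toy = {
    "P01": [{"acetone": 2.1}, {"acetone": 1.4}, {"acetone": 0.9}],
    "P02": [{"acetone": 1.8}, {"acetone": 1.0}],
}
dataset = build_instances(toy)
label_counts = Counter(label for _, _, label in dataset)
print(len(dataset), dict(label_counts))
```

With two instances per patient, 27 patients give the 54 instances reported in the text.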

B. The proposed method
The proposed method consists of three steps: (i) preprocessing, (ii) feature selection, (iii) classification. A schematic representation of the proposed method is shown in Fig. 1 and a detailed description of each stage is provided below.
In the first step, missing values are addressed. Features with more than 60% of missing values are removed, since imputation of missing values cannot be performed due to the nature of the data.
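The missing-value filter can be sketched as follows. This is a minimal illustration, assuming that each instance is a dictionary and that a missing value is encoded as `None`; the function name `drop_sparse_features` and the toy data are hypothetical.

```python
def drop_sparse_features(rows, max_missing=0.60):
    """Return the names of features whose fraction of missing values
    does not exceed max_missing.

    rows: list of dicts mapping feature name -> value; None (or an
    absent key) marks a missing value.
    """
    features = sorted({name for row in rows for name in row})
    kept = []
    for name in features:
        missing = sum(1 for row in rows if row.get(name) is None)
        if missing / len(rows) <= max_missing:
            kept.append(name)
    return kept

# Toy data with made-up values: "a" is missing in 4 of 5 rows (80%),
# "b" in 1 of 5 rows (20%), "c" is always present.
rows = [
    {"a": None, "b": 1.0, "c": 3.0},
    {"a": None, "b": None, "c": 2.0},
    {"a": None, "b": 0.5, "c": 1.0},
    {"a": None, "b": 0.7, "c": 4.0},
    {"a": 1.2, "b": 0.9, "c": 5.0},
]
kept = drop_sparse_features(rows)
print(kept)
```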
Furthermore, features whose value distribution is dominated by a single value (more than 80% of the instances) are not retained. In the second step, the features that can act as discriminators between the two expected situations (presence of an event or not) are selected following a wrapper approach [53], in combination with the classifiers employed in step 3. Two different approaches are tested. The first one (Fig. 1i) takes all the features as input, while in the second (Fig. 1ii) the method is applied separately to features extracted from sensor data and to features corresponding to categories (i)-(vii). Finally, in the third step, nine classifiers are employed and tested [54].
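To make the idea of a wrapper approach concrete, the sketch below scores candidate feature subsets by the accuracy of a classifier trained on them. Several choices here are assumptions for illustration only: the text does not specify the search strategy, so a greedy forward search is used; a simple nearest-centroid classifier stands in for the nine classifiers of the study; and leave-one-out evaluation is chosen for convenience on small data. All function names and the toy data are hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Assign x to the class whose mean feature vector is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [row for row, y in zip(train_X, train_y) if y == label]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_accuracy(X, y, subset):
    """Leave-one-out accuracy using only the columns listed in subset."""
    correct = 0
    for i in range(len(X)):
        train_X = [[row[j] for j in subset]
                   for k, row in enumerate(X) if k != i]
        train_y = [label for k, label in enumerate(y) if k != i]
        test_x = [X[i][j] for j in subset]
        if nearest_centroid_predict(train_X, train_y, test_x) == y[i]:
            correct += 1
    return correct / len(X)

def forward_wrapper_selection(X, y):
    """Greedy forward search: repeatedly add the feature that yields the
    highest accuracy; stop when no candidate improves on the current best."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        score, feature = max(
            (loo_accuracy(X, y, selected + [f]), f) for f in remaining
        )
        if score <= best_score:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score

# Toy data with made-up values: column 0 separates the classes,
# column 1 is uninformative.
X = [[0.1, 5.0], [0.2, 1.0], [0.3, 4.0],
     [0.9, 4.5], [1.0, 1.5], [1.1, 5.5]]
y = [0, 0, 0, 1, 1, 1]
selected, score = forward_wrapper_selection(X, y)
print(selected, score)
```

Because the subset is scored by the classifier itself, the selected features depend on the classifier used, which is why the text pairs feature selection with the classifiers of step 3.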

III. RESULTS
The proposed method is evaluated on a dataset of 54 instances, while the number of features varies depending on the outcome of the feature selection step. The obtained results in terms of accuracy (Acc), sensitivity (Sens), specificity (Spec) and area under curve (AUC) are presented in Table III. Both approaches produce rather similar results (accuracy 76% and 77%, respectively), with 12 features finally selected in the first approach and 23 features in the second approach. Rotation Forests (ROT) seem to be the best performing classification algorithm (see Table IV).
In order to evaluate the contribution of biomarkers to the prediction of adverse events, the following experiments are conducted: (i) all available features are given as input, (ii) only features from sensors are employed, (iii) only features from biosensors are used as predictors, and (iv) features from sensors and biosensors are combined. The results without feature selection are presented in Table V.
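For reference, the four reported measures can be computed as in the minimal sketch below, which uses their standard definitions and treats "event" as the positive class. The function names and the toy labels and scores are hypothetical and are not results from the study. AUC is computed here as the fraction of (event, no event) pairs in which the event instance receives the higher score, with ties counted as one half.

```python
def binary_metrics(y_true, y_pred, positive="event"):
    """Accuracy, sensitivity and specificity from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }

def auc(y_true, scores, positive="event"):
    """Area under the ROC curve via pairwise ranking of the scores."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example with made-up labels and scores
y_true = ["event", "event", "event", "no event", "no event"]
y_pred = ["event", "event", "no event", "no event", "event"]
scores = [0.9, 0.8, 0.4, 0.3, 0.6]  # predicted probability of "event"

metrics = binary_metrics(y_true, y_pred)
area = auc(y_true, scores)
print(metrics, area)
```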

TABLE V. Results without feature selection
Case | Model | Acc
(i) All features without feature selection | ROT | 57%
(ii) Features only from sensors without feature selection | RF | 60%
(iii) Features only from biosensors without feature selection | CART | 66%
(iv) Features from sensors and biosensors without feature selection | ROT | 59%

The results presented in Table VI are extracted from the same experiments as those presented in Table V, but this time following feature selection.
It should be noted that in case (iii) where only biomarkers are utilized, the feature selection approach is not applied due to the already small number of biomarkers used (i.e. three).
As shown in Tables V and VI, the proposed approach yields the best results. The positive effect of feature selection is also clear (the best accuracy improves from 66% to 77%).
A direct comparison of the proposed method with those reported in the literature (Table VII) cannot be performed, since the studies reported in the literature: (i) predict the presence or absence of one specific adverse event only (destabilizations, re-hospitalizations or mortality), and not the presence or absence of an HF adverse event in general, as the proposed method does, and (ii) do not utilize biomarkers. Focusing on specific adverse events has advantages for clinical practice, but it requires a much larger dataset. In this sense, this can be considered a limitation of the proposed method. It will be addressed in the future through the data that will be collected during the pilot phase of the HEARTEN project. The utilization of breath and saliva biomarkers is the innovative feature of the proposed method.

IV. CONCLUSIONS
An automated method for the prediction of adverse events related to HF, utilizing information from saliva and breath biomarkers, is presented.
Different experiments are conducted in order to evaluate the proposed method and to estimate the contribution of biomarkers to the prediction problem. The results confirm the predictive ability of biomarkers, whether they are employed as the only input (Acc: 66%) or in combination with other categories of features (Acc: 77%). Among the biomarkers, TNF-a presents the largest correlation with the prediction of an adverse event. However, the small number of instances does not allow "safe" conclusions to be drawn. The collection of biomarker measurements from a larger number of patients will lead to a more in-depth evaluation.
Since biomarkers are expected to become measurable at home in the near future, together with biosensor data, the accurate prediction of adverse events on the basis of home-based measurements will revolutionize HF patient management. Such an approach can become the core of a chronic care model, allowing for early action by both patients and physicians.
ACKNOWLEDGMENT
This work is supported by the HEARTEN project that has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement No 643694.