An automatic screening approach for obstructive sleep apnea from photoplethysmograph using machine learning techniques

ABSTRACT


INTRODUCTION
Sleep disorders has now become a very common health condition affecting about 2 to 4% of the adult population and have effect on several aspects of life. Among the sixty different sleep disorders identified by the International Classification of Sleep Disorders [1], obstructive sleep apnea (OSA) is one of the most common one, characterized by episodes of complete, intermittent or partial obstruction and repetitive collapse of the upper airway during sleep. Research reports on the prevalence of the syndrome among the adult population estimates that it is much higher i.e. about 50% in patients with cardiovascular and cerebrovascular diseases [2]. Obstructive sleep apnea is currently diagnosed using polysomnography (PSG) and is considered as the gold standard. PSG is usually performed in sleep labs as an overnight sleep study.

BACKGROUND
It is important to understand the physiological factors that support the theoretical estimation of OSA characteristics from the PPG signal. The following session describes the relation between these physiological factors and the signal processing techniques used to extract the features relevant to changes in PPG due to OSA. The hemodynamics and the cardiovascular activity of an OSA patient sway between ventilatory and apneic periods, due to recurrent apneas. The consequences of repetitive surges curb the normal heart rate and blood pressure and thus contribute to the adverse effects on the cardiovascular activity. Hypoxia-a condition where oxygen supply to the tissues are reduced, magnified negative intra-thoracic pressure prompted due to obstructed pharynx and sleep arousals are the main pathophysiological features of OSA. The variability in heart rate may differ among individuals and it depends on the hypoxia severity and intrinsic hypoxic chemo sensitivity.

Pathophysiology of OSA syndrome
Obesity, thickened pharyngeal walls, reduced muscle tone of naso-pharynx during sleep; hypertension and other pathologies contribute to OSA [11]. It also affects the general hemodynamics and the state of the autonomous system and it is related to the individual's demographic and morphologic parameters [12]. Thus, measurement of parameters like age, sex, weight, and height and body mass index remains a pre-requisite.

OSA and blood viscosity
Blood viscosity is the internal resistance offered by the blood against shear forces and is determined by the viscosity of plasma, hematocrit and the biomechanical properties of red blood cells [13]. Changes in the rheological properties of blood and plasma leads to an increase in blood clotting and perhaps remains to be a vital factor in triggering cardiovascular complications due to OSA [14]. There is strong evidence that the blood viscosity and the plasma viscosity are abnormally high in OSA patients [15]. The fluctuations in the blood pressure adversely affect the response of blood vessel wall and thereby modify the vessel compliance [16]. The effect of blood viscosity and blood vessel compliance are reflected on the shape of PPG waveform.

OSA and heart rate variability
The association between severity of OSA and heart rate variability has been studied extensively by doing time domain and frequency domain analysis of heart rate variability (HRV), sleep arousals, oxygen desaturation and other sleep parameters. The frequency domain indices tend to yield a better result when compared to time domain indices [17]. Henceforth, the variability in the heart rate due to OSA is reflected in the power spectrum of the heart rate.

OSA and baroreflexes
This relates the heart rate and the blood pressure. The fact underlying is that, an increase in blood pressure activates the carotid sinus and aortic arch baroreceptors which inhibits the sympathetic outflow and reduces the heart rate. The opposite effect is experienced due to a decrease in blood pressure [18]. Solaro et al. [19] describe the spectral properties of heart rate variability and its relation to blood pressure levels. Fazan Jr et al. [20] have investigated the relation between heart rate variability and human baroreflexes.

OSA and breathing rate
It has been made clear through several studies that the respiratory frequency has effects on blood pressure levels [21]. Signal processing techniques to extract the low frequency components of PPG signal help in measuring the respiration rate. Due to the well-known fact of high non-linearity and non-stationary properties of biosignals, empirical mode decomposition (EMD) algorithm is best suited and is capable of accurately estimating the respiratory rate from PPG signal. This provides an alternative to using a separate sensor module for monitoring respiration rate.

METHODOLOGY
A detailed account of the materials and methodologies used in the various modules of the proposed system are described in this session. In contrast to the above discussed methods the approach proposed in this paper measures the difference in the transit time between the successive PPG waveform and also measures the repercussions of physiological alterations inflicted on the shape of the PPG waveform and on the heart rate.
The proposed system is organized into three modules as depicted in the flow diagram shown in Figure 1. The PPG waveform obtained from a simple fingertip PPG sensor moves through a clean signal detector module that eliminates corrupted unclean signals which set foot in due to artifacts and finger movements. This module ensures that the system does not require any re-calibration with change in individuals or change over time. Uncorrupted and consecutively recorded 4000 samples are processed using moving average algorithm to further smoothen the signal. The signal processing and feature extraction module processes the signal to extract several related features from the clean PPG waveform. Finally, a classifier module makes a decision between the normal and abnormal values of the variables with the help of the extracted features fed to it.  The training and testing of the system was carried out with 285 subjects encompassing individuals from varied age groups, healthy, individuals with hypertension and cardiovascular disease, and OSA patients. The input PPG signal recorded with the help of a fingertip pulse oximeter device (German sensor) and the reference apnea signals obtained from the apnea database from Physionet were used for training the system. The database includes 169 male and 116 female subjects, aged between 20 to 85 years. The database widely incorporates young healthy individuals, diabetic patients, blood pressure patients and sleep apnea patients with a focus to correlate the symptoms common to sleep apnea, blood sugar and blood pressure. Real time digitalized PPG signals recorded using a universal serial bus (USB) finger-tip pulse oximeter sensor, are obtained as a sequence, having a sampling rate of 1000 samples per second. The PPG signal is given as input to the clean signal detection module which identifies a clean and uncorrupted signal X(T) incorporated with several segments/frames having a frame length of 4000 consecutively recorded samples. Time duration of two minute was selected so as to accommodate sufficient samples for estimation of heart rate and respiratory rate from the PPG signal. The output of the clean signal detector is fed to the signal processing module, whose output vector XF containing a set of features are fed to the machine learning module.

Clean signal detection module
The ultimate aim of the clean signal detector is to eliminate the corrupted signal length as shown in Figure 2 and to select the frame that has clean signal as depicted in Figure 3. The PPG signal recorded at the fingertip is usually vulnerable to spurious peaks, noise generated due to movements, signal distortion due to initial transient irregularities, and signal saturation. The output of the clean signal detector is a vector consisting of 'n' number of frames with each frame holding 4000 clean data samples obtained from the PPG signal. Time required to acquire one frame of 4000 consecutive samples is 4s, which is capable of trapping several heart beats. Reducing the frame length below 4s might grasp very few heart beats resulting in an unreliable data. In order to discriminate between clean and corrupted signal, a set of morphological features are extracted from the time-domain PPG waveform, its first and second derivative.

Set of morphological features
The morphological features of interest extracted for feature vector were pulse height/systolic peak, diastolic peak, augmentation index, pulse interval, pulse width, pulse area, peak to peak interval, notch time, systolic peak time and diastolic peak time as shown in Figure 4. Pulse width i.e., the time between onset and end of pulse wave clearly correlates with the vascular resistance which is due to prolonged increase in blood pressure. Table 1 defines the set of morphological parameters extracted. Pulse area is measured as the total area under the PPG curve, which is also used as an indicator of total peripheral resistance. The augmentation index quantifies the contribution made by the wave reflection on the systolic arterial pressure [22]. In addition to the above-mentioned morphological features which evaluate the cardiovascular functions, the maximum amplitude of the first derivative and second derivative of the PPG waveform is also incorporated in our study. The velocity of blood flow in the finger is represented by the first derivative and its second derivative speaks much about the arterial stiffness which is an evident symptom of OSA. The pulse transit time measurement (PTT) in OSA patients depicts that there is an increase in PTT due to arterial stiffness. The energy profile, TK x(t) and spectral power distribution of PPG signal, Spen of each frame were estimated using Teager-Kaiser energy operator and spectral entropy operator respectively [23], [24].  Maximum amplitude of the signal y (diastolic peak) Amplitude of the diastolic peak y/x (augmentation index) Amplitude of diastolic peak/amplitude of the signal tpi (pulse interval) Time taken for one complete PPG period tpp (peak to peak interval) Time elapsed between two successive peaks t1 (systolic peak time) Time at which systolic peak occur t3 (diastolic peak time) Time at which diastolic peak occur Nt (Notch time) Time at which notch occur SD of amplitude Standard deviation of the amplitudes 1 st derivative_max amp Maximum amplitude of the first derivative 2 nd derivative_max amp Maximum amplitude of the second derivative

Classifier algorithm
The feature vector X n (f) is thus constructed by clustering all the morphological features and energy levels of PPG signal contained in each frame of 4000 samples.
X n (f) = [ x, y, y/x, tpi, tpi, t1, t3, Nt, SD, 1 st derive max_amp, 2 nd derive max_amp, TK x(t), Spen] T The classifier takes X n (f) as its input and distinguishes between good signal and bad signal based on noise and signal loss. A threshold-based classifier makes a classification decision based on a value of linear combination of all the above stated features. The thresholds initially were resolved by inspection from a set of examples and trained using a set of hand-labeled database. The algorithm is designed in such a way to eliminate the incoming signals for the first 15 seconds so as to eliminate the spurious clicks and motion artifacts in the signal due to finger movements. The subjects were advised to be in a relaxed state at the time of recording the real time PPG signals using USB based finger-tip pulse oximeter, with a sampling rate of TELKOMNIKA Telecommun Comput El Control  An automatic screening approach for obstructive sleep apnea from … (Smily Jeya Jothi. E) 1265 1000 samples/sec. i.e., each data acquisition provides 1000 samples. Data samples obtained from 4 consecutive acquisitions recorded for 4s, are stored in an array which forms one frame/segment having 4000 samples altogether. The classifier works on the basis of certain ordained rules and functions written by hand after analyzing various signals and makes a decision between good and bad frame/segment. Once a clean signal of a minute or more consecutive samples were obtained, the acquisition stops and the signal proceeds to the signal processing module for feature extraction. The clinical parameters of the study group are shown in Table 2.

Signal processing module
This section summarizes the different signal processing algorithms used for computing features related to OSA from PPG signal. Signal duration of more than a minute comprising of at least 15 frames of consecutively recorded good signal is given as input to the signal processing module. The output of this module is a vector comprising of an aggregate of all the features and fed as input to the machine learning module. The length of the signal and the number of samples selected are sufficient enough to compute heart rate, respiration rate and their variability respectively.

Estimation of respiratory rate
Estimation of respiratory rate from PPG signal is an alternative method to using a separate sensor-amplifier unit. Several signal processing techniques which work by extracting the respiration trend embedded in the PPG waveform have been put forth [25], [26]. It is a well-known fact that the bio signals are highly non-linear and non-stationary, nevertheless, test results reveal that the empirical mode decomposition (EMD) algorithm is best suited for non-linear signals and is capable of accurately estimating the respiratory rate from PPG signal. As our aim is to develop a system which automatically screens for OSA patients, the measurements are associated to the individual's hemodynamics and cardiovascular activity. Hence, efforts to accurately measure heart rate and respiratory rate from PPG signal have been carried out as they are the parameters closely associated with OSA and has been published in our previous work [27].

Heart rate variability and oxygen saturation
Several studies proved that there is a strong evidential relation between OSA and heart rate variability which is recorded as the variation in the time interval between heartbeats. In this work, the variability in peak-to-peak interval of PPG signal has been computed using slope detection algorithm. The peaks were detected from the zero crossings of the signal, rather than measuring the signal peaks. Initially a band pass filter was used to attenuate the dichrotic notch effect and to remove the mean. This method was found to be accurate in measuring the time interval between the alternate zero crossings and the difference in the time interval gives the measure of heart rate variability. The mean value of the peripheral oxygen saturation range was directly given by the USB based finger-tip pulse oximeter over the entire recording duration. The feature vector was finally constructed by clustering all the above mentioned features, including physiological parameters, morphological parameter of the PPG waveform, Teager-Kaiser energy level, and entropy of the signal, heart rate, respiration rate, heart rate variability and oxygen saturation. The statistical features such as mean, M and standard deviation, SD were also calculated in order to capture the relationship between heart rate variability and OSA. The output of the signal processing module is fed as input to the classifier module, which is trained with all the features of the output vector as given below:

Classifier module
The machine learning module is supposed to infer a physiological function that relates the features extracted from the PPG signal and the desired target [28]. Literature clearly portrays that most of the work involved in the detection of OSA has used neural network-based classifier [29]. It is also understood that deep learning neural networks have always outperformed the shallow neural networks and convolutional neural network is the widely used classifier in the recent years [30], [31]. In one of the approaches to detect respiratory arrests in OSA patients using PPG signal, classifiers like k-nearest neighbours classification, radial basis function neural network, probabilistic neural network, multilayer feedforward neural network and ensemble classification have been compared by the authors and their results depict a testing accuracy rate of 97.07% with multilayer feedforward neural network [4]. In our work three different machine learning algorithms were tested on the basis of flexibility, accuracy; smart enough to deal with noisy signals, non-requirement of repeated calibration and stability in terms of its performance. Moreover, the data type involved in our work is in the tabular form and it has facilitated easy implementation of the work using these algorithms. The structure of each classifier and method of training involved are described in this section.

Univariate regression (UR)
Univaraite regression also known as linear regression is a supervised machine learning algorithm used for data analysis. The relationship between a scalar dependent variable and one or more than one explanatory variable can be modeled by simple linear regression and multiple linear regression approaches respectively. Linear predictor functions, also called as linear models are used in modeling the data and to estimate the unknown model parameters from a set of data points. It is a method used to fit the best straight line between a set of data points, and the obtained straight line is used as a model to predict the value of variable from an input variable . The straight line represents the best estimate of the y value for every input of . General form of linear regression classifier is given in (1).
Where is the input value, is the slope of the best fit line and is the point where = 0 and intersects the -axis. The slope of the line can be calculated using (2).
Where ̅ and ̅ are the mean of independent variable and dependent variable respectively and and are the values of independent variable and dependent variable respectively. This attempt was made to model the correlation between the set of 24 features, both demographic and morphological features, extracted in the previous modules, which are taken to be the independent variables and the apnea-hypopnea index (AHI) as the dependent variable. From several researches it is made clear that the OSA severity can be defined based on AHI index and is recommended by the American Academy of sleep medicine. It varies from mild (5≤AHI≤15 events/hour), moderate (15≤AHI≤30 e/h) to severe (AHI≥30 e/h). Independent of the associated symptoms, OSA can be diagnosed in patients with a frequency of obstructive respiratory disturbances greater than 15 e/h. Linear regression serves to interpret the functional relationship between AHI and each of the features individually, and then predict the future value of the target variable, i.e., AHI in our case, based on the relationship interpreted. The algorithm for linear regression was implemented in matlab and the best fit straight-line equations relating various features with AHI were obtained from the linear regression plot. The relation between each of the features as mentioned in Table 1, with AHI was interpreted using linear regression model. Perhaps it was labour-intensive to predict the y value for each individual feature contained in the feature vector XF, from the signal processing module. The analysis of the observations revealed that few of the morphological features viz., systolic peak, diastolic peak, augmentation index, tpi, tpp extracted from the PPG signal had a close relationship with the target variable and had good prediction accuracy. The equations relating systolic peak (x) with AHI and augmentation index (y/x) with AHI obtained from the linear regression plot is given in (3)  The linear regression classifier was selected because of its simplicity in programming and ease of predicting numerical values. It was able to predict the severity of OSA over a wide range and the predicted values were close to the reference values with a highest error percentage of ±7 as shown in Table 3. The major disadvantage with such a linear predictor is that the inability to confirm the diagnosis of OSA in a patient on the basis of a single feature parameter. If required to include all the features, it would be a labourintensive and time-consuming process.

Multivariate regression (MR)
In order to overcome this snag of predicting AHI index for each individual feature, we attempted to implement the statistical technique: multiple linear regression/multivariate regression, which use several independent/predictor variables. It creates a linear relationship in the form of a straight line that best approximates all the individual data points and also helps to determine the potential variables that can be important predictors for a given dependent variable, AHI in our case. The general model of multivariate regression for n variables is given by (5).
where b1 to bn are the regression coefficients representing the value at which the dependent variable changes when the independent variable changes. The equation relating systolic peak (x), peak to peak interval (tpp) and augmentation index (y/x) with AHI obtained from the multiple linear regression plot is given in 6.
= (1.1471) * + (20.7108) * + (13.1487) * + 1.098 (6) Combinations of morphological, spectral and statistical features were chosen on a trial-and-error basis and all possible combinations were practically implemented. As displayed in Table 4, respiration rate and heart rate combined with few morphological features of PPG depicted the least error percentage. Yet, the predicted apnea-hypopnea index for OSA patients showed lesser accuracy over a wide range. Apart from selecting the right combination of the features which was a labor-intensive process, the computational time required is less compared to other machine learning algorithms.

Support vector machines (SVM)
A deep learning algorithm which performs supervised learning on a set of labeled training data is used [7], [26]. Based on the inference obtained from the training set, the algorithm constructs a set of hyper planes in a high dimensional space where the non-linear n-dimensional input vector is mapped into a K-dimensional feature space via a kernel trick. SVM uses a kernel function to transform the input data into a higher dimension space and then classify the data into two classes by constructing a separating hyper plane in the transformed space. The data vectors nearest to the hyper plane in the transformed space are the support vectors. Each input data vector is mapped into a higher dimension feature space K via a non-linear mapping Φ(x). In (7) define the equation of the hyper plane separating the data into two different classes.
where W = [w0, w1, w2,….., wK] is the weight vector. Linear kernel and second-order polynomial kernel functions were experimented with different values of the regularization parameter C (C=0.1, 1, 5, 8) as shown in Table 5. The value of parameter C has been identified on the basis of lowest misclassification rate on the testing dataset and the best suited configuration for the polynomial kernel functions were established based on independent trials using the testing data. The algorithm was implemented using SVM toolbox for Matlab. As respiration rate and oxygen saturation features are closely associated with OSA, classification was done for these two features separately and a classification using all the morphological features combined was also executed. Table 5 shows the classifier results of respiratory rate, SpO2 and the combined feature set. The polynomial kernel depicted higher accuracy, sensitivity and specificity compared to the linear kernel classification. SpO2 had the highest sensitivity and accuracy while respiratory rate had the highest specificity. The performance of the classifier for separate feature had shown better results than the combined feature set. In general, the accuracy increased with increase in C value. The highest performance (Acc: 96.5%, Sen: 98.0%, Spec: 99.3%) was achieved with the second-order polynomial with C=10 for SpO2. The classification of apneic and normal signals based on one single feature cannot be incorporated into automatic screening algorithms.

Random forest (RF)
RF is a supervised machine learning technique used for classification and regression. It operates by creating decision trees based on the feature parameters during the training process and acquire the prediction from each of the parameter [28]. The precision of the result increases with increase in the number of trees and also avoids over fitting of the model. The algorithm works in two stages, one is the construction of the forest which is completely a random process and the other is to make a prediction from the classifier formed in the first stage. Initially the algorithm randomly selects "m" features from a total of "n" features, where m<<n. Using the best split point, nodes and daughter nodes are formed among the randomly selected m features. Table 6 depicts the performance of random forest algorithm for morphological, statistical and combined features respectively using 20 decision trees. The random forest is constructed by repeating the above steps until the required number of nodes and trees are formed. The random forest algorithm predicts the output by analyzing the test features and the rules of each of the decision trees and stores the predicted output. After computing the votes for each of the predicted target, the algorithm identifies the target with the highest voting as the final predicted output. In the TELKOMNIKA Telecommun Comput El Control proposed work, the input feature set includes a total of 24 features and 4000 data samples of PPG signal. The number of decision trees chosen initially was 10 and later 20. The performance of the algorithm was examined for morphological features separately, statistical and spectral features separately and finally for all the combined features. The highest performance (Acc: 98.0%, Sen: 98.6%, Spec: 99.3%) was portrayed for a random forest of 20 decision tress, trained with the entire feature vector. The output of the classifier is an aggregation of the outputs of all the decision trees of the forest, which reduces the variance and bias. The execution time of the algorithm was very high compared to other regression algorithms, as the classifier first performs selection of random samples from the given feature vector XF, followed by construction of a decision tree for every sample. Predicted results are obtained from each decision tree and the computation required here are merely comparisons of one feature at each node of the tree. By introducing complete randomness in the selection of the sample feature at every node of the decision tree, and as the classifier technique makes comparison between only one feature at each node, the system is more powerful in finding the correlations of the inputs without the need for input scaling. The performance of the algorithm was excellent with a low error rate. The algorithm was configured based on the number of features compared at each node of the tree and the number of decision trees and implemented through Matlab.

RESULTS AND ANALYSIS
The experimental work involved first of identifying the best classifier algorithm for the good signal detection module. Secondly, the selection of features that could maximize the performance of the machine learning algorithms was an important criterion. Though age, weight, body mass index were included in the feature vector, they were non-predictive parameters and were used only to improve the accuracy of the system. The overall efficiency of the classifier algorithms is defined based on its ability to distinguish the normal and the abnormal signals correctly. The sensitivity of the classifier is defined as the percentage of apneic signals correctly classified and specificity of the classifier is the percentage of normal signals correctly classified.
Univariate and multivariate regression serve to interpret the functional relationship between AHI and each of the features individually and in groups respectively, and then predict the future value of the target variable. The former algorithm was selected for its simplicity and its predicting accuracy, with the highest error percentage of ±7. The predicted outputs for RR, tpp, tpi, HRV were very close to the target. The algorithm depicted consistency of the predicted outputs over a wide range of signals. The only disadvantage was that, to include all the morphological, statistical and spectral features was a labor-intensive process. This was overcome by using the multivariate regression as it served to interpret the functional relationship between AHI and several of the features added together, and then predict the future value of the target variable. The algorithm showed much non-linearity between normal and apneic signals and portrayed over-fitting for several normal PPG signals. Support vector machine worked relatively well compared to the above technique as in Table 7. Oxygen saturation features provided a clear margin of separation between the normal and the apneic signals. The highest performance (Acc: 96.5%, Sen: 98.0%, Spec: .3%) was achieved with the second-order polynomial with C=10 for SpO2. The second order polynomial kernel depicted higher accuracy, sensitivity and specificity compared to the linear kernel classifier. SpO2 had the highest sensitivity and accuracy while respiratory rate had the highest specificity. The performance of the classifier for separate feature had shown better results than the combined feature set. In general, the accuracy increased with increase in C value.
The reliability and stability of the machine learning algorithms were examined by cross-validating their performance on the same database for multiple times. The entire dataset was shuffled randomly and was split into 10 groups and a 10 cross-fold validation was performed. The performance of the random forest algorithm examined all the combined features depicted excellent accuracy as seen in Table 8. The highest performance (Acc: 98.0%, Sen: 98.6%, Spec: 99.3%) was portrayed for a random forest of 20 decision tress, trained with the entire feature vector. Increase in the number of decision trees beyond 20 showed no enhancement in its performance. The performance was better (Acc: 96.8%, Sen: 95.9%, Spec: 95.9%) with 10 decision trees for the morphological features only and the computation time taken was also very less, but again a decision cannot be made based on only few parameters.
The computation time required was more for the combined feature vector model, but the prediction accuracy was also the highest. Out of sample testing and cross validation methodology was used for tuning the machine learning algorithms to improve their performance. As shown in Tables 7 and 8, the random forest algorithm gave better results compared to other techniques, with the perfect set of features used and with best structure devised. Random forest depicted very less variability indicating better stability of the method. The other machine learning algorithms showed much variability when used on the testing database again.

CONCLUSION
This experimental study shows that it is possible to screen for OSA using PPG signals by taking in to account all the physiological variables that could be extracted from the PPG signal. The result of the system is very promising with better accuracy. The system does not require any calibration with change in subjects or over time. The system is also reliable even under extreme physiological conditions or hypoventilation and the response is linear with the parameters to be estimated. The system uses a single sensor and hence would provide a simple and robust screening of OSA, overcoming the discomfort of PSG which is an overnight sleep study. The complexity issue of the proposed approach counts on the computational resources required to implement the proposed algorithms, as it employs a standard PC programmed with a graphical programming tool, LabVIEW and Matlab. The computational time required for the system is widened in case of low PPG signal level, mainly due to hand movements to which the pulse oximeter is attached. We propose to further work on improving the accuracy of the system by training the algorithm with a greater range of OSA database, so that the system could be beneficial for extreme cases of OSA. Additionally, the system will be incorporated with state-of-the-art techniques to measure the physiological parameters which are exclusively related to OSA.