README - Info file for ABOSA: automatic blood oxygen saturation analysis software (version 1.2.1)

Please note that the original publication describing the validation and structure of ABOSA is for version 1.1.
Small changes (read the corresponding file) have been made since.

Basic Information:

This software scores desaturations and the following recoveries (resaturations) from the oxygen saturation signal,
and calculates numerous parameters from these scorings.
Oxygen saturation signals are fetched from EDF files or from files in SleepLab Format (SLF).
The software analyses all files in the input folder path.
The software outputs a single excel file which contains all calculated parameter values for all analyzed recordings.
In addition, the software outputs an excel file for every recording which contains more detailed info for each scored event.
Moreover, a .txt file is constructed for each recording which contains additional information.

The software is programmed with MATLAB (version 2022b). However, the software does not require MATLAB to be used.
Before first use, MATLAB Runtime will be installed automatically if the computer does not have one installed already.
This installation can take a few minutes and requires ~3 GB of disk space.

------------------------------------------------------------------------------------------------------------
How to use ABOSA software:

1) Select input folder:
   For EDFs:
      Select the folder which contains ALL EDF files you want to analyze.
      NOTE! The EDF files must be in a single folder, not in subfolders.

   For SLFs:
      Select the 'Series' folder which contains all the files to be analyzed in their own folders.
      SLF structure is described in: https://arxiv.org/pdf/2402.06702

   Also, select the correct input filetype (EDF or SLF) from the button group.

2) Select output folder:
   Select the folder for the outputs.
   In this folder, 3 subfolders are created:
      i) for the parameter value excel file + FileInfo.txt file
      ii) for event data for each recording
      iii) for additional information for each recording.
   Folders are named based on the time when the analysis started.

3) OPTIONAL!: You can input hypnograms for each recording.
   For EDFs:
      Select a single folder containing all the hypnograms to be used for all the EDFs.
      Valid filetypes for hypnograms are .csv, .txt, .xlsx, and .xls.
      Each Hypnogram MUST BE named with the same name as the corresponding EDF file (but with a different filetype, see example below).
      Each Hypnogram must be a column vector, in which sleep stages are named as follows (letter case does not matter):
         Wake = Wake, W, or 0
         N1 = N1, NREM1, or 1
         N2 = N2, NREM2, or 2
         N3 = N3, NREM3, or 3
         REM = REM, R, or 4
      All other sleep stages in the hypnogram are not considered and are counted as "other".

      The length of each Hypnogram MUST match the length of the inputted signal: each 30 seconds of signal needs 1 sleep stage.
      This means that, for example, if the signal length is 24550 s, which corresponds to 818.333 sleep stages (30 s for each stage),
      the inputted Hypnogram MUST consist of 818, 819, or 820 sleep stages.
      If the hypnogram has 820 epochs, the last epoch is simply ignored.

      ABOSA assumes that the first sleep stage starts from the beginning of the recording.
      Thus, if the scoring of the sleep stages does not start at the beginning of the recording,
      you might need to pad the hypnograms to the correct length by adding, for example, "unknown" sleep stages
      at the beginning and/or end (or in the middle) of the hypnogram.

      Example Hypnograms named "Example_Patient123.txt" and "Example_Patient4800222PSG.csv" have been provided.
      Thus, the corresponding EDF filenames for these Hypnograms are "Example_Patient123.edf" and "Example_Patient4800222PSG.edf".
      Note, the "Example_Patient4800222PSG.csv" Hypnogram has been padded with "Unknown" values (at the beginning, in the middle, and at the end)
      to correspond to the length of the oxygen saturation signal (corresponding signals are not provided).

   For SLFs:
      Tick the box and select the hypnogram from the pop-up window. Hypnogram names are searched from the first SLF file to be analyzed.
      Note that all the hypnograms must be named exactly the same as the selected one.
      For each file to be analyzed, the hypnogram must be located in the 'Subject' folder as described in the SLF documentation.
      The Subject folder can contain multiple different hypnograms, for example, a manually scored one and an automatically scored one,
      but you need to select which one is used.
      Hypnograms must be in .json format.
      See the SLF documentation and/or the provided example hypnogram for how the sleep stages must be annotated in the .json file.
      An example hypnogram named "slf_hypnogram_manual.json" is provided.

      In SLF hypnograms, adjacent identical sleep stages can be marked as a single annotation, so that the duration is
      the number of adjacent epochs multiplied by 30.
      So, for example, a wake period lasting 210 seconds (without any other sleep stage scored during this time period)
      could be marked as one wake annotation lasting 210 seconds.
      However, the duration of an annotation MUST ALWAYS BE divisible by 30, e.g. an annotation with a duration of 190 seconds is not accepted.
      In addition, gaps in the hypnogram are not accepted.

      Furthermore, with SLF hypnograms, the hypnogram scoring can start later than the signal
      (signal start time is searched from annotations.json).
      In such cases, for the ABOSA-performed analysis, the oxygen saturation signal is truncated from the beginning
      so that the starting times of the signal and the hypnogram match.
      THEREFORE!!
      The desaturation/recovery event start times, end times, and epoch numbers saved in the EventData.xlsx file are
      FROM THE BEGINNING OF THE TRUNCATED SIGNAL, NOT FROM THE BEGINNING OF THE ORIGINAL SIGNAL as normally.
      Thus, it is IMPORTANT to know that if you need to match the ABOSA-scored events to the original signal later,
      you need to adjust the start/end times of the events accordingly.
      The amount of truncation is saved in the extrainfo.txt file, which allows you to do this.
      The hypnogram cannot start earlier than the signal; this will cause an error.

   For both EDFs and SLFs:
      Note, if using data with old sleep stage scorings, i.e. sleep stages are: Wake, N1, N2, N3, N4, and REM,
      the user needs to modify the Hypnograms accordingly.
      (The N4 stage is not considered by the software; it is counted as "other" if no modifications are made by the user.)
      (Or, if N4 is marked with '4', it is mistakenly handled as REM, and if REM is then marked with '5', it is mistakenly counted as 'other'.)

4) OPTIONAL!: You can input a single file containing the analysis start and end times for all recordings.
   Valid file type is .csv
   This file must have 3 "columns":
      i) Filename, must match the EDF filename or SLF Subject name!
      ii) Analysis start time
      iii) Analysis end time
   Times must be inputted in the HH:mm:ss format, e.g. 23:33:30
   Thus, a .csv file for 3 patients should look like (example):
      4800005.edf,22:15:30,05:19:30
      4800123.edf,00:24:30,06:54:30
      4801111.edf,22:37:30,08:14:30
   or
      subject_10001,22:15:30,05:19:30
      subject_10002,00:24:30,06:54:30
      subject_10003,22:37:30,08:14:30
   An example Analysis Start/Stop file named "Example_AnalysisStartStop_3patients.csv" has been provided.
   NOTE! Recordings over 24h cannot be analyzed!
   If you have trouble getting the .csv file into the correct format, one solution is to take the example csv file
   and modify it directly for your needs.

5) Minimum duration of the desaturation events (in seconds):
   Shorter events are not scored.
   Possible minimum durations are: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, and 60 seconds.
   Note! Minimum duration for the recoveries cannot be changed by the user.
   Maximum desaturation duration is fixed to be 180 seconds.
   Maximum recovery duration is fixed to be the minimum of 2 minutes or 2 times the duration of the corresponding desaturation.

6) Minimum transient drop of the desaturation events (in %):
   Shallower events are not scored.
   Possible minimum transient drops are: 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, and 20 %.
   Note! Minimum depth of the recoveries cannot be changed by the user; it is fixed to be 2%.
   As no desaturations/recoveries are scored in the parts of the signal with values < 50% or >100%,
   the theoretical maximum transient drop is 50%.

7) Sleep time definition:
   The user can select how the total sleep time (TST) is calculated (artefact removal also affects this, as described below).
      "Whole recording" means that TST = length of the inputted signal.
      "Analysis start to stop" means that TST is calculated from analysis start to analysis stop.
      "Sleep onset to offset" means that TST is calculated from the first NREM/REM epoch to the last NREM/REM epoch
         (all wake epochs in between are included in the TST).
      "Sleep only" means that only sleep stages scored as NREM/REM are counted in the TST.
   The TST definition also affects which scored events are counted.
   For example, if "Sleep only" is selected, only desaturations (and corresponding recoveries) which start in NREM/REM are counted
   (the ones starting in Wake are excluded).

   NOTE! The "Analysis start to stop" definition REQUIRES a valid row in the Analysis start/stop file.
   With SLF files, the "Analysis start to stop" definition also requires a valid 'attributes.json' located in the data array folder.
   The attributes.json file MUST have a valid 'start_ts' field, e.g. "1985-01-01T21:59:59".
   The "Sleep onset to offset" and "Sleep only" definitions REQUIRE valid Hypnograms.
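
The difference between the "Sleep onset to offset" and "Sleep only" definitions in item 7 can be illustrated with a small Python sketch. This is not ABOSA's own code: the function names are hypothetical, the hypnogram is assumed to be already normalized to the stage names Wake/N1/N2/N3/REM, and artefact removal is ignored.

```python
EPOCH_SECONDS = 30                       # one sleep stage per 30 s of signal
SLEEP_STAGES = {"N1", "N2", "N3", "REM"}  # NREM/REM stages counted as sleep


def tst_sleep_only(stages):
    """TST in seconds when only NREM/REM epochs are counted."""
    return sum(1 for s in stages if s in SLEEP_STAGES) * EPOCH_SECONDS


def tst_onset_to_offset(stages):
    """TST in seconds from the first to the last NREM/REM epoch.

    Wake (and any other) epochs between sleep onset and offset are included.
    Returns 0 if the hypnogram contains no NREM/REM epoch.
    """
    sleep_idx = [i for i, s in enumerate(stages) if s in SLEEP_STAGES]
    if not sleep_idx:
        return 0
    return (sleep_idx[-1] - sleep_idx[0] + 1) * EPOCH_SECONDS


# Made-up 7-epoch hypnogram for illustration only
hypnogram = ["Wake", "N1", "N2", "Wake", "N2", "REM", "Wake"]
print(tst_sleep_only(hypnogram))       # 120 (4 sleep epochs)
print(tst_onset_to_offset(hypnogram))  # 150 (epochs 2 to 6, wake in between included)
```

With "Whole recording", the same hypnogram would correspond to 7 x 30 = 210 seconds.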
8) Artefact removal:
   Whether artefact parts of the oxygen saturation signal are excluded from the TST.
      "No" = artefact parts are NOT excluded.
      "Yes" = artefact parts are excluded.
   Nevertheless, no events are scored in artefacts. Thus, "artefact removal" only potentially affects TST.
   Note that detected artefacts with a duration less than or equal to 5 seconds are interpolated.
   Thus, events can be scored in these parts. These interpolated parts are also included in the TST.
   However, if >20% of the duration of the event is interpolated, and the duration of the interpolated part is >5 seconds,
   no event is scored.
   Also note that events can still be scored fairly close to the artefacts.
   Thus, if a signal has a lot of artefacts, it might be worth checking whether such a recording and its scorings
   are something you want to analyze.
   The duration of artefacts can be checked from the extrainfo.txt file.

9) Plotting status:
   By selecting "Yes", the user can plot the oxygen saturation signal and scored events. This will pop up an external window.
   Note that the figure is overwritten after the next recording is analyzed, and if multiple recordings are analyzed,
   only the last one's figure is plotted.
   Thus, if the user wants to examine a certain recording, it is recommended to run the analysis only for a single recording
   (by placing the EDF file in its own folder or having an SLF Series folder containing only 1 subject).
   Even if "Yes" is selected, the figures are not saved anywhere.
   If you are analyzing multiple files, it is NOT RECOMMENDED to plot signals and events, as this might slow down the analyses
   and it is not possible to zoom in/out of the plots while the software is running.

--------------------------------------------------------------------------------
Buttons:

RUN:
   Click "RUN" to start the analyses.
   After clicking "RUN", the software asks the user to select the correct label for the oxygen saturation signal.
   The user must select one Primary label from the predefined list.
   This list is searched from the first file to be analyzed.
   Additionally, the user can add secondary labels in case some of the files to be analyzed do not have the primary label
   (or you suspect they do not).
   These labels can be added by writing them in the text box and pressing the 'add' button, or by selecting them from
   the 'common labels' list, which consists of some common labels used for oxygen saturation signals.
   NOTE that the order of the labels matters! The primary label has the highest priority, after which the labels are checked
   in the order they appear on the secondary list.
   So, for example, if a patient has oxygen saturation measured twice with labels 'SpO2' and 'SAO2',
   and both of these labels are inputted, the signal whose label has the higher priority is used and the other is completely ignored.
   Also note that the capitalization of the labels does matter! 'SpO2' is not the same as 'SPO2'!
   Also, it is the user's responsibility to know that the inputted signal is a real oxygen saturation signal.
   Sometimes signals can be labeled non-intuitively. For example, a signal with the label 'Saturation' could only indicate
   whether the signal quality has been determined to be OK by marking each data point as 0 or 1, and obviously,
   inputting such a signal as an oxygen saturation signal into ABOSA does not make sense.
   If none of the inputted labels are found in some of the files, such files are not analyzed.
   You can check which file was used to search labels for the predefined primary list from the FileNotes.txt file
   (located in the ParameterValues folder).

STOP:
   Terminates the software. Parameter values calculated so far are saved in the excel file.
   NOTE! The software will finish the analysis of the recording that is in progress before terminating.
   Thus, it can take some time before the "Code was terminated" text appears in the text box after the STOP button is pushed.

Reset:
   Resets the settings.
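
The label priority rule described under RUN can be sketched in Python as follows. This is only an illustration of the rule as stated above, not ABOSA's implementation; the function name and the example label lists are hypothetical.

```python
def pick_spo2_label(file_labels, primary, secondary):
    """Return the highest-priority label that exists in the file, or None.

    The primary label is checked first, then the secondary labels in the
    order they were added. Matching is case-sensitive, so 'SpO2' and 'SPO2'
    are treated as different labels.
    """
    for label in [primary, *secondary]:
        if label in file_labels:
            return label
    return None  # none of the inputted labels found: the file is not analyzed


# Made-up signal label lists for illustration only
print(pick_spo2_label(["EEG", "SAO2", "SpO2"], "SpO2", ["SAO2"]))  # SpO2
print(pick_spo2_label(["EEG", "SAO2"], "SpO2", ["SAO2"]))          # SAO2
print(pick_spo2_label(["EEG", "SPO2"], "SpO2", ["SAO2"]))          # None
```

In the first call both labels are present, so the signal labeled 'SAO2' is ignored because 'SpO2' has the higher priority.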
--------------------------------------------------------------------------------
OUTPUTS:

ParameterValues excel file:
   This is a single excel file in which all parameter values calculated from the EDF files are combined.
   The first column is the filename.
   The second column is the total sleep time (which depends on the TST definition and artefact removal status).

   n_desat and n_reco are the number of scored desaturation and recovery events, respectively
   (note that a recovery is always associated with a desaturation, but a desaturation can exist without a recovery)
   ODI = Oxygen desaturation index
   DesSev = Desaturation Severity (see Kulkas et al. 2013, Novel parameters for evaluating severity of sleep disordered breathing
            and for supporting diagnosis of sleep apnea-hypopnea syndrome. Journal of Medical Engineering and Technology,
            https://doi.org/10.3109/03091902.2012.754509)
   DesSev100 = Desaturation Severity calculated from 100% oxygen saturation baseline
   DesDur = Desaturation Duration
   avg_des_dur = average duration of the desaturation events
   avg_des_area = average area of the desaturation events
   avg_des_area100 = average area of the desaturation events calculated from 100% baseline
   avg_des_slope = average slope (fall rate) of the desaturation events
   avg_des_depth = average depth of the desaturation events
   avg_des_max = average oxygen saturation value from which the desaturations start
   avg_des_nadir = average nadir oxygen saturation value during the desaturation events
   med_ ... are the same parameters, but medians instead of averages.

   RI = Recovery index (same as ODI but for recoveries)
   RecSev = Recovery Severity
   RecSev100 = Recovery Severity from 100% baseline
   RecDur = Recovery Duration
   avg_ ... same as for desaturations
   med_ ... same, but medians instead of averages.

   (NOTE! Only desaturation-recovery event pairs that contain both events are used to calculate the ratio parameters (i.e.
   desaturations that are not followed by recovery are ignored))
   avg_duration_ratio = average ratio of desaturation and recovery event durations
   avg_depth_ratio = average ratio of desaturation and recovery event depths
   avg_area_ratio = average ratio of desaturation and recovery event areas
   avg_area100_ratio = average ratio of desaturation and recovery event areas calculated from 100% baseline
   avg_slope_ratio = average ratio of desaturation and recovery event slopes
   med_XXXX_ratio = same as avg_XXXX_ratio, but using median instead of average.

   TotalSev_integrated = Total Severity of desaturation and recovery events. Total event area (desaturation+recovery) is calculated
                         by integrating from the start of desaturation to the end of recovery.
   TotalSev_block = Total Severity of desaturation and recovery areas by summing individual desaturation and recovery areas.
                    See EventData -> total_area_block below (rougher estimate than integration).
   TotalSev100 = Total Severity of desaturation and recovery events calculated from 100% baseline.
   TotalDur = Total Duration of desaturation and recovery events (i.e. DesDur + RecDur)

   t100 = Percentage of TST during which oxygen saturation values are <100%.
   t98 = Percentage of TST during which oxygen saturation values are <98%.
   t95, t92, t90, etc. similarly

   (Note that the next 6 parameters are significantly affected if "artefact removal" is "No";
   some of the parameter values can be nonsense if the data includes a lot of artefacts)
   avg_spO2 = average oxygen saturation value during TST
   med_spO2 = median oxygen saturation value during TST
   max_spO2 = maximum oxygen saturation value during TST
   nadir_spO2 = nadir oxygen saturation value during TST
   variance_spO2 = variance of the oxygen saturation signal during TST.
   total_area_below100 = area between 100% baseline and oxygen saturation signal during TST

FileNotes.txt file:
   Contains information on which file the primary oxygen saturation label was fetched from and lists all the inputted secondary labels.
   In addition, contains information on whether/why the analyses of certain files failed.

- - - - -

EventData:
   Each analyzed file has its own excel file which contains detailed info for all scored desaturation and recovery events.
   NOTE! Recoveries are scored based on desaturations. Thus, no recovery can exist without a desaturation.
   But a desaturation can exist without a recovery.

   epoch = epoch number in which the desaturation event starts, from the BEGINNING OF THE RECORDING (epoch = 30 seconds of the signal)
   Event_SleepStage = sleep stage in which the desaturation starts (NaN if Hypnogram is not provided)
   desat_start_ind = start index of the desaturation from the BEGINNING OF THE RECORDING (i.e. not from the beginning of TST!)
   desat_end_ind = end index of the desaturation from the BEGINNING OF THE RECORDING (i.e. not from the beginning of TST!)

   IMPORTANT NOTE for 'event_sleepstage', 'desat_start_ind', and 'desat_end_ind' (and corresponding recovery inds)!!!
   In case of SLF hypnograms that start later than the signal, these indexes are from the BEGINNING OF THE TRUNCATED SIGNAL
   (= from the beginning of the hypnogram, not from the beginning of the original signal as otherwise)!
   The amount of potential truncation is saved in the extrainfo.txt file.

   desat_dur = duration of the desaturation (in seconds)
   desat_max = oxygen saturation value from which the desaturation event starts
   desat_nadir = oxygen saturation value at which the desaturation event ends
   desat_depth = depth of the desaturation (from max to nadir)
   desat_area = desaturation area (i.e. area between the baseline oxygen saturation value, i.e. desat_max, and the oxygen saturation signal)
   desat_area100 = desaturation area from 100% baseline (i.e.
   area between the 100% baseline and the oxygen saturation signal)
   desat_slope = slope (fall rate) of the desaturation (unit is %/s), calculated based on the desat_max and desat_nadir.
   reco_ ... same as for desaturations

   duration_ratio = ratio of desaturation and recovery event durations (desat_dur/reco_dur) (NaN if desaturation is not followed by recovery)
   depth_ratio = ratio of desaturation and recovery event depths (desat_depth/reco_depth) (NaN if desaturation is not followed by recovery)
   area_ratio = ratio of desaturation and recovery event areas (desat_area/reco_area) (NaN if desaturation is not followed by recovery)
   area100_ratio = ratio of desaturation and recovery event areas calculated from 100% baseline (desat_area100/reco_area100)
                   (NaN if desaturation is not followed by recovery)
   slope_ratio = ratio of desaturation and recovery event slopes (desat_slope/reco_slope) (NaN if desaturation is not followed by recovery)

   total_dur = sum of desaturation and recovery durations (if no scored recovery, this is the same as desaturation duration)
   total_area_integrated = area under the desaturation and recovery calculated by integrating from the start of desaturation
                           to the end of recovery
   total_area100 = area under the desaturation and recovery (from 100% baseline)
   total_area_block = direct sum of desat_area and reco_area. Thus, this is a rough estimate of the total area, as the oxygen
                      saturation values from which the desaturation starts and at which the recovery ends are not necessarily the same.
                      (However, if the desaturation start value and recovery end value are exactly the same,
                      this is the same as total_area_integrated)
   total_mark = marked as 1 if desaturation is followed by recovery, marked as 0 if no recovery is scored.
   interpolated_artefact_desat = marked as 1 if desaturation includes parts of linearly interpolated artefacts
   interpolated_artefact_desat_percentage = percentage of desaturation duration which is interpolated
   interpolated_artefact_desat_duration = duration of the interpolated part during the desaturation (in seconds)
   interpolated_artefact_reco = marked as 1 if recovery includes parts of linearly interpolated artefacts
   interpolated_artefact_reco_percentage = percentage of recovery duration which is interpolated
   interpolated_artefact_reco_duration = duration of the interpolated part during the recovery (in seconds)

- - - - -

ExtraInfo:
   Additional information

   min_dur = minimum desaturation duration criterion (selected by the user)
   min_depth = minimum transient drop criterion (selected by the user)
   plotting_status = selected plotting status (selected by the user)
   tst_decreased_due_artefacts = how much total sleep time is decreased because of the artefacts (in hours).
                                 This is 0 if "artefact removal" is "No". Interpolated artefacts are not included in this.
   interpolated_artefact_duration_seconds = duration of the interpolated artefacts (in seconds)
   tst_artefact_definition = artefact removal definition (selected by the user)
   tst_definition = total sleep time definition (selected by the user)
   Wake_tst = percentage of TST spent in wake (NaN if Hypnogram is not provided, 0 if TST definition is "Sleep only")
   N1_tst = percentage of TST spent in N1 sleep stage
   N2_tst = percentage of TST spent in N2 sleep stage
   N3_tst = percentage of TST spent in N3 sleep stage
   REM_tst = percentage of TST spent in REM sleep stage
   Other_tst = percentage of TST spent in any other sleep stage
   fs = sampling frequency of the oxygen saturation signal
   UsedSpO2Label = label that was used to search the oxygen saturation signal
   Number_Of_Truncated_Indexes = number of indexes that were truncated from the beginning of the signal in case
                                 the SLF hypnogram started later than the signal.
   Interpolated_artefact_indexes = interpolated artefact indexes from the BEGINNING of the recording [start_index, end_index]
   Artefact_indexes = artefact indexes. Artefact indexes are marked as [start_index, end_index].

   (additionally, if hypnograms are inputted)
   hypnogram_input = "yes" if hypnogram was inputted, "no" if not
   hypnogram_input_valid = "yes" if the inputted hypnogram was valid, "no" if invalid.

   Additionally, extrainfo can include some error messages.

--------------------------------------------------------------------------------
-Written by Tuomas Karhu (tuomas.karhu@uef.fi), 8.10.2024
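
Example: mapping events back to the original signal after SLF truncation.
The Number_Of_Truncated_Indexes and fs values listed under ExtraInfo can be used to shift event indexes from the truncated signal back to the original signal. The following Python sketch only illustrates the arithmetic; the function names and the numeric values are made up, and it assumes that the indexes in EventData and the truncation count are expressed in samples of the same oxygen saturation signal.

```python
def to_original_index(truncated_index, n_truncated):
    """Shift an index counted from the truncated signal to the original signal."""
    return truncated_index + n_truncated


def truncation_offset_seconds(n_truncated, fs):
    """Duration (in seconds) that was cut from the beginning of the signal."""
    return n_truncated / fs


# Made-up values for illustration only
fs = 4.0             # 'fs' from extrainfo.txt (Hz)
n_truncated = 2400   # 'Number_Of_Truncated_Indexes' from extrainfo.txt
desat_start_ind = 1000
desat_end_ind = 1100

print(to_original_index(desat_start_ind, n_truncated))  # 3400
print(to_original_index(desat_end_ind, n_truncated))    # 3500
print(truncation_offset_seconds(n_truncated, fs))       # 600.0
```

If no truncation took place, the offset is zero and the indexes are unchanged.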