The data cannot be shared here because PPMI’s Data Usage Agreements prevent users from re-publishing data. All data used in this study, as well as a data dictionary, are free and publicly available on the PPMI website after completing an online application and signing the Data User Agreement and the publications policies.

This code is in R, version 4.3.2, and requires the caret, tidyverse, readr, pROC, ggplot2, ggthemes, and gridExtra packages.
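
As a convenience, the following sketch checks which of the packages listed above are not yet installed. It uses only base R; the variable names `required` and `not_installed` are illustrative and do not appear in the repository code.

```r
# Packages named in this README
required <- c("caret", "tidyverse", "readr", "pROC",
              "ggplot2", "ggthemes", "gridExtra")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
is_installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
not_installed <- required[!is_installed]

if (length(not_installed) > 0) {
  message("Missing packages: ", paste(not_installed, collapse = ", "))
  # Uncomment to install them from CRAN:
  # install.packages(not_installed)
} else {
  message("All required packages are installed.")
}
```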

Folder and File Names

1bm refers to the singular models that use 1 biomarker.
2bm refers to the combined models that use 2 biomarkers.
3bm refers to the combined models that use 3 biomarkers.
ROC refers to the code used to produce the ROC curves.
stats refers to the code used to run statistical analyses.

In the 1bm, 2bm, and 3bm folders:

The first part of the file name refers to the biomarker(s): 
D = DaT-SPECT
S = Alpha-Synuclein
A = Beta-Amyloid
T = Total-Tau
P = Phosphorylated-Tau-181
N = Neurofilament Light

The second part of the file name refers to the purpose of the model:
NCvsMCI = Model aimed to detect mild cognitive impairment (MCI), i.e. to distinguish MCI from normal cognition (NC).
HCvsPD = Model aimed to detect Parkinson's disease (PD), i.e. to distinguish PD from healthy controls (HC).

The third part of the file name refers to what subset of the data the model is implemented in:
HC = Only healthy controls
PD = Only patients with Parkinson's disease
NC = Only patients with normal cognition
MCI = Only patients with mild cognitive impairment
all = The entire cohort

An example file name is DT_NCvsMCI_HC. This model would aim to detect mild cognitive impairment in healthy controls using DaT-SPECT and total-tau.
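
The naming convention can be decoded programmatically. The sketch below is a minimal illustration in base R and is not part of the repository; the function name `parse_model_name` is hypothetical, and it assumes the three parts are separated by underscores and that any file extension is `.R`.

```r
# Biomarker letters as defined in this README
biomarker_codes <- c(
  D = "DaT-SPECT",
  S = "Alpha-Synuclein",
  A = "Beta-Amyloid",
  T = "Total-Tau",
  P = "Phosphorylated-Tau-181",
  N = "Neurofilament Light"
)

# Hypothetical helper: split a model file name into its three parts
parse_model_name <- function(file_name) {
  # Drop any directory and an optional .R extension (assumed extension)
  stem <- sub("\\.[Rr]$", "", basename(file_name))
  parts <- strsplit(stem, "_", fixed = TRUE)[[1]]
  if (length(parts) != 3) {
    stop("Expected three parts separated by '_': ", file_name)
  }
  # Each letter of the first part is one biomarker
  letters_used <- strsplit(parts[1], "")[[1]]
  unknown <- setdiff(letters_used, names(biomarker_codes))
  if (length(unknown) > 0) {
    stop("Unknown biomarker letter(s): ", paste(unknown, collapse = ", "))
  }
  list(
    biomarkers = unname(biomarker_codes[letters_used]),
    task = parts[2],
    subset = parts[3]
  )
}

info <- parse_model_name("DT_NCvsMCI_HC")
print(info$biomarkers)  # the biomarkers for the letters D and T
print(info$task)
print(info$subset)
```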

Column Names

The code refers to the following column names in the data. You may need to replace them to match your own data:
SEX (Sex)
MCA1 (MoCA Score First Instance)
MCA2 (MoCA Score Second Instance)
EDUC (Years of Education)
RACE (Race)
FAMPD (Family History of PD)
HANDEDNESS (Handedness)
UPDRS (Unified Parkinson's Disease Rating Scale)
NHY (Hoehn and Yahr Score)
AGE (Age)
DisDur (Disease Duration)
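
Instead of editing the column names inside each script, one option is to rename the columns of your own data to the names listed above before running the code. The sketch below uses base R only; the helper `rename_to_expected` and the example column names (`gender`, `age_years`, `moca_visit1`) are hypothetical and stand in for whatever your data set uses.

```r
# Hypothetical helper: rename columns so they match the names used in the code.
# column_map: names are YOUR column names, values are the names listed above.
rename_to_expected <- function(data, column_map) {
  absent <- setdiff(names(column_map), names(data))
  if (length(absent) > 0) {
    stop("Columns not found in data: ", paste(absent, collapse = ", "))
  }
  idx <- match(names(column_map), names(data))
  names(data)[idx] <- unname(column_map)
  data
}

# Toy data with made-up values, for illustration only
my_data <- data.frame(
  gender = c(0, 1),
  age_years = c(61, 70),
  moca_visit1 = c(27, 24)
)

renamed <- rename_to_expected(
  my_data,
  c(gender = "SEX", age_years = "AGE", moca_visit1 = "MCA1")
)
print(names(renamed))
```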

Additional Information

When running the Random Forest models in the Singular Models (1bm) folder, make sure to update the parameters at each step.

Several code files, such as the ROC code, require the output of other files. To run them, first run the required code and save its output as an R workspace file. Then load that R workspace file as directed in the code you want to run.
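
The save-then-load pattern looks roughly like the sketch below. The file name `model_output.RData` and the object `predicted_prob` are placeholders, not names used in the repository; use the file name that the code you want to run expects.

```r
# Placeholder path; here it is written to a temporary directory
workspace_file <- file.path(tempdir(), "model_output.RData")

# --- At the end of the script that produces the output ---
predicted_prob <- c(0.2, 0.8, 0.6)  # placeholder for real model output
save(predicted_prob, file = workspace_file)
# To save every object in the session instead, use:
# save.image(file = workspace_file)

# --- At the start of the script that needs the output ---
rm(predicted_prob)               # simulate a fresh session
loaded <- load(workspace_file)   # restores the saved objects by name
print(loaded)                    # names of the objects that were restored
```

`load()` restores objects under their original names, so the object names in the producing script must match what the consuming script expects.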
