Total number of input variables: 5
Total number of rows in data set: 290
Number of rows in test set: 58
Size of dataset in memory: 0 MB
Best Model: Random Forest with 200 trees
Standard metrics for evaluating model performance are shown below.
## Confusion Matrix and Statistics
##
##           Reference
## Prediction  N  Y
##          N  0  5
##          Y 47  4
##
## Accuracy : 0.0714
## 95% CI : (0.0198, 0.1729)
## No Information Rate : 0.8393
## P-Value [Acc > NIR] : 1
##
## Kappa : -0.1925
## Mcnemar's Test P-Value : 1.303e-08
##
## Sensitivity : 0.00000
## Specificity : 0.44444
## Pos Pred Value : 0.00000
## Neg Pred Value : 0.07843
## Prevalence : 0.83929
## Detection Rate : 0.00000
## Detection Prevalence : 0.08929
## Balanced Accuracy : 0.22222
##
## 'Positive' Class : N
##
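Most of the statistics in this block follow directly from the four cell counts of the confusion matrix, so they can be checked by hand. The sketch below is one way to do that in plain Python; it is not the code that produced this report (the report itself was generated in R), and the function name `confusion_stats` and the variable names are illustrative only. It assumes, as the report states, that `N` is the positive class, with rows as predictions and columns as reference labels.

```python
# Cell counts copied from the confusion matrix in this report.
# Rows are predictions, columns are reference labels; 'N' is the positive class.
tp = 0   # predicted N, reference N
fp = 5   # predicted N, reference Y
fn = 47  # predicted Y, reference N
tn = 4   # predicted Y, reference Y


def confusion_stats(tp, fp, fn, tn):
    """Recompute the summary statistics from a 2x2 confusion matrix.

    Illustrative helper (hypothetical name). It does not guard against
    zero denominators, which do not occur for the counts above.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # share of reference positives that were found
    specificity = tn / (tn + fp)   # share of reference negatives that were found
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value

    # Agreement expected by chance, used for Cohen's kappa.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2

    return {
        "accuracy": accuracy,
        "no_information_rate": max(tp + fn, fp + tn) / total,
        "kappa": (accuracy - expected) / (1 - expected),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "pos_pred_value": ppv,
        "neg_pred_value": npv,
        "prevalence": (tp + fn) / total,
        "detection_rate": tp / total,
        "detection_prevalence": (tp + fp) / total,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


stats = confusion_stats(tp, fp, fn, tn)
for name, value in stats.items():
    print(f"{name:>22} : {value:.5f}")
```

Every value computed here is a proportion between 0 and 1 except kappa, which can be negative when agreement is worse than chance, as it is for this model. The confidence interval and the McNemar p-value are not reproduced because they require statistical distributions beyond this arithmetic.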
The graph below shows the relative importance of the predictor variables in the best model, sorted from highest to lowest. Variables with an importance score of zero won’t be included in the final model.
Tue Jul 19 13:53:36 2016
Powered by Data Science @ Health Catalyst
R version 3.2.4 Revised (2016-03-16 r70336)
Platform: x86_64-w64-mingw32
R Packages Required: HCRTools (caret, doParallel, e1071, pROC, R6, rmarkdown, ranger, ROCR, RODBC)
Models compared: Logistic Regression (glm-logit), Random Forest (rf)