Question to be answered: FILL THIS OUT!

Model Summary

Total number of input variables: 5

Total number of rows in data set: 290

Number of rows in test set: 58

Size of dataset in memory: 0 Mb

Best Model: Random Forest with 200 trees


Algorithm Performance

Standard metrics for evaluating model performance are shown below.

## Confusion Matrix and Statistics
## 
##           Reference
## Prediction  N  Y
##          N  0  5
##          Y 47  4
##                                           
##                Accuracy : 0.0714          
##                  95% CI : (0.0198, 0.1729)
##     No Information Rate : 0.8393          
##     P-Value [Acc > NIR] : 1               
##                                           
##                   Kappa : -0.1925         
##  Mcnemar's Test P-Value : 1.303e-08       
##                                           
##             Sensitivity :  0.00000        
##             Specificity :  0.44444        
##          Pos Pred Value :  0.00000        
##          Neg Pred Value :  0.07843        
##              Prevalence :  0.83929        
##          Detection Rate :  0.00000        
##    Detection Prevalence :  0.08929        
##       Balanced Accuracy :  0.22222        
##                                           
##        'Positive' Class : N               
## 
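
As a cross-check, the statistics in this output can be recomputed from the four cell counts of the confusion matrix, which sum to 56 scored rows. The sketch below is a minimal illustration in plain Python, not the code that produced the report; the variable names are assumptions, and 'N' is treated as the positive class as stated above. Each rate is a proportion, so every value should fall between 0 and 1 (Kappa may be negative).

```python
# Cell counts copied from the confusion matrix above
# (rows = prediction, columns = reference; 'N' is the positive class).
tp = 0   # predicted N, reference N
fp = 5   # predicted N, reference Y
fn = 47  # predicted Y, reference N
tn = 4   # predicted Y, reference Y

total = tp + fp + fn + tn


def ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


metrics = {
    "accuracy": ratio(tp + tn, total),
    # Share of the most common reference class.
    "no_information_rate": max(tp + fn, fp + tn) / total,
    "sensitivity": ratio(tp, tp + fn),
    "specificity": ratio(tn, tn + fp),
    "pos_pred_value": ratio(tp, tp + fp),
    "neg_pred_value": ratio(tn, tn + fn),
    "prevalence": ratio(tp + fn, total),
    "detection_rate": ratio(tp, total),
    "detection_prevalence": ratio(tp + fp, total),
}
metrics["balanced_accuracy"] = (
    metrics["sensitivity"] + metrics["specificity"]
) / 2

# Cohen's kappa: observed agreement corrected for chance agreement.
expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
metrics["kappa"] = (metrics["accuracy"] - expected) / (1 - expected)

for name, value in metrics.items():
    print(f"{name:>22}: {value:.5f}")
```

With these counts the model predicts the wrong class for 52 of the 56 rows, which is why accuracy sits far below the no-information rate and Kappa is negative.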

Variable Importance

The graph below shows the relative importance of the predictor variables in the best model, sorted from highest to lowest. Variables with an importance score of zero won’t be included in the final model.
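
The ranking rule described above (sort by importance, drop zero scores) can be sketched as follows. This is an illustration only: the variable names and scores are hypothetical placeholders, not values from this report, and the code is plain Python rather than the tool's own implementation.

```python
# Hypothetical importance scores for five input variables.
# Names and values are placeholders, not taken from this report.
importance = {
    "var_a": 12.4,
    "var_b": 0.0,
    "var_c": 3.1,
    "var_d": 7.8,
    "var_e": 0.0,
}

# Sort from highest to lowest importance.
ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)

# Keep only variables with a non-zero importance score.
retained = [(name, score) for name, score in ranked if score > 0]

for name, score in retained:
    print(f"{name}: {score}")
```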

Tue Jul 19 13:53:36 2016
POWERED BY DATA SCIENCE @ HEALTH CATALYST


Machine Details

R version 3.2.4 Revised (2016-03-16 r70336)

Platform: x86_64-w64-mingw32

R Packages Required: HCRTools (caret, doParallel, e1071, pROC, R6, rmarkdown, ranger, ROCR, RODBC)


Model Comparison

Models compared: Logistic Regression (glm-logit), Random Forest (rf)