Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Calculating AUC on Random Forest Model in R

I'm trying to calculate the AUC for two models, Random Forest and Naive Bayes, but both give the same error: "$ operator is invalid for atomic vectors". Do you have any ideas?

Background: the target variable "Diagnosis" is non-numeric, with values B and M.

Here is the sample code for the RF model:

library(caret)  # provides trainControl(), train() and twoClassSummary()
fitControl <- trainControl(method = "cv", number = 5, preProcOptions = list(thresh = 0.4),
                           classProbs = TRUE, summaryFunction = twoClassSummary)
wdbc_model_rf <- train(Diagnosis ~ ., data = train_wdbc, method = "ranger", metric = "ROC",
                       preProcess = c('center', 'scale'), trControl = fitControl)

Question from: https://stackoverflow.com/questions/65837149/calculating-auc-on-random-forest-model-in-r


1 Answer

Below is an example of R code that works. Please note: your interest in ROC implies there are only two classes.

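# Return the predicted probability of the second class for a fitted classifier.
# Param$method selects the predict() call that matches the model type.
Predict <- function(class_obj, newdata, Param) {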

if(Param$method == 'RF') {
    Predicted_Probs         <- predict(class_obj, newdata = newdata, type = 'prob')
} else if(Param$method == 'GBM') {
    Predicted_Probs         <- predict(class_obj, newdata = newdata, type = 'response', n.trees = Param$n.trees)[,,1]
} else if(Param$method == 'SVM') {
    Predicted_Probs         <- predict(class_obj, newdata = newdata, type = 'probabilities')
} else if(Param$method == 'logit') {
    Predicted_Probs         <- predict(class_obj, newdata = newdata, type = 'response')
    Predicted_Probs         <- cbind(1 - Predicted_Probs, Predicted_Probs)
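} else {
    # Fail early: otherwise Predicted_Probs would be undefined below.
    stop('Predict(): unknown classification method.')
}

# Column 2 holds the probability of the second class.
Predicted_Probs[,2]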

}
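
Regarding the error in the question: R raises it whenever `$` is applied to an atomic vector, for example a factor of predicted labels, rather than to a list or data frame. The failing call is not shown in the question, so treating this as the cause is an assumption. With caret, `predict()` on a trained model returns class labels by default and a data frame of class probabilities with `type = 'prob'`. A minimal base-R sketch with made-up values (`pred_labels` and `pred_probs` are hypothetical names):

```r
# Hypothetical predictions for three cases (made-up values).
pred_labels <- factor(c("B", "M", "B"), levels = c("B", "M"))   # class labels only
pred_probs  <- data.frame(B = c(0.9, 0.2, 0.6), M = c(0.1, 0.8, 0.4))

# `$` on an atomic vector (a factor is one) raises the error from the question.
err <- tryCatch(pred_labels$M, error = function(e) conditionMessage(e))
print(err)

# On a data frame of class probabilities, `$` and `[, 2]` select the same column.
p_M <- pred_probs$M
print(identical(p_M, pred_probs[, 2]))
```

This is why the function above asks for probabilities and indexes the result with `[,2]` rather than `$`.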


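# Area under the ROC curve, computed on a grid of probability thresholds.
# Truth: two-valued vector or factor; the first value in sort order is the negative class.
# Predicted_Probs: predicted probability of the positive (second) class.
AUC <- function(Truth, Predicted_Probs) {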

###########################################################################################################
# SETTINGS

d_Prob              <- 0.01

###########################################################################################################
# CALCULATIONS

Prob_Grid               <- seq(1, 0, -d_Prob)
NP                  <- length(Prob_Grid)
True_Positive_Rate      <- c()
False_Positive_Rate     <- c()

for(Prob_Threshold in Prob_Grid) {
    # Prepend a dummy 0 and 1 so both factor levels always exist, then drop them again.
    Forecast            <- as.factor( c(0, 1, 1 * (Predicted_Probs >= Prob_Threshold)) )
    levels(Forecast)    <- c('0', '1')
    Forecast            <- Forecast[-c(1,2)]
    # 2x2 confusion table: rows = Truth (negative class first), columns = Forecast.
    Table               <- xtabs(~Truth + Forecast)
    False_Positive_Rate <- c(False_Positive_Rate, Table[1,2] / (Table[1,1] + Table[1,2]))
    True_Positive_Rate  <- c(True_Positive_Rate, Table[2,2] / (Table[2,1] + Table[2,2]))
}

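AUC                 <- 0

# Integrate TPR over FPR with the trapezoidal rule; using only the right endpoint
# True_Positive_Rate[i] would overestimate the area on a coarse grid.
for(i in 2:NP) {
    AUC                 <- AUC + (True_Positive_Rate[i] + True_Positive_Rate[i-1]) / 2 * (False_Positive_Rate[i] - False_Positive_Rate[i-1])
}

AUC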

}

Please note: the code is quite generic and can be applied to many methods, like support vector machines, gradient boosting, random forests, etc. Hopefully, it is straightforward to modify the code to your needs.
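
As a sanity check for a hand-rolled AUC, the same quantity can be computed from ranks: the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below assumes a 0/1 truth vector. `auc_rank` is a hypothetical helper name and the data are made up.

```r
# Rank-based AUC (Mann-Whitney form).
# truth: 0/1 vector (1 = positive class); score: predicted probability of class 1.
auc_rank <- function(truth, score) {
  r     <- rank(score)              # average ranks, so ties count one half
  n_pos <- sum(truth == 1)
  n_neg <- sum(truth == 0)
  # Sum of positive ranks minus its minimum possible value, scaled to [0, 1].
  (sum(r[truth == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy data: 3 of the 4 positive/negative pairs are ordered correctly.
truth <- c(0, 0, 1, 1)
score <- c(0.1, 0.4, 0.35, 0.8)
auc_value <- auc_rank(truth, score)
print(auc_value)   # 0.75
```

A grid-based estimate should approach this value as the threshold step shrinks, so a large gap between the two usually points to swapped classes or a mislabelled probability column.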

