計算R中的AUC？

給定一個分數向量和一個實際類標籤向量，如何計算R語言或簡單英語中二進制分類器的單數AUC度量標準？計算R中的AUC？

第9頁的"AUC: a Better Measure..."似乎需要知道類標籤，這裏是an example in MATLAB，我不明白

R(Actual == 1))

因爲R（不要與R語言混淆）是指一個向量，但用作函數？

來源

2011-02-04 Andrew

對於其他人誰不知道，顯然AUC是我用過的「區域在[受試者工作特徵（http://en.wikipedia.org/wiki/Receiver_operating_characteristic）曲線」 – Justin 2011-02-04 21:30:40

正如其他人所提到的，您可以使用ROCR軟件包計算AUC。使用ROCR軟件包，您還可以繪製ROC曲線，升力曲線和其他模型選擇度量。

您可以直接使用AUC等於真陽性得分大於真陰性的概率的事實，直接計算AUC而不使用任何包。

例如，如果pos.scores是含有分數的正例的矢量，並且neg.scores是包含負例子則AUC是由近似的矢量：

> mean(sample(pos.scores,1000,replace=T) > sample(neg.scores,1000,replace=T)) 
[1] 0.7261

將給AUC的近似。您也可以估算AUC的方差通過引導：

> aucs = replicate(1000,mean(sample(pos.scores,1000,replace=T) > sample(neg.scores,1000,replace=T)))

來源

2011-02-05 20:31:26 erik

對於我的測試數據集，除了我不需要安裝pROC外，您的複製值與@ jonw's（是0.8504，您的0.850591）非常相似。謝謝 – Andrew 2011-02-07 16:44:01

@Andrew @eric這是一個可怕的答案。您不*估計AUC的方差 - 您只估計重採樣過程的方差。要說服自己，試着改變`樣本`中的樣本數量......將它除以10，將方差乘以10.將其乘以10，並將方差除以10.這當然不是期望的計算行爲AUC的變化。 – Calimo 2014-02-14 12:25:21

The ROCR package將計算AUC其他統計中：

auc.tmp <- performance(pred,"auc"); auc <- as.numeric([email protected])

來源

2011-02-04 21:45:17 semaj

ROCR繪製性能，但我不知道它是如何計算「單數AUC指標」（來自原始問題）。 – Andrew 2011-02-07 16:17:32

`auc.tmp < - performance（pred，「auc」）; auc < - as.numeric（[email protected]）` – Itamar 2011-12-14 09:19:08

與包pROC可以使用函數auc()像是從幫助頁面下面的例子：

> data(aSAH) 
> 
> # Syntax (response, predictor): 
> auc(aSAH$outcome, aSAH$s100b) 
Area under the curve: 0.7314

link to pROC

來源

2011-02-04 21:51:00

我通常使用DiagnosisMed軟件包中的功能ROC。我喜歡它生成的圖形。 AUC與它的置信區間一起被返回，並且在圖上也被提及。

ROC(classLabels,scores,Full=TRUE)

來源

2011-02-05 08:50:34

截至2016年7月20日這個鏈接https://cran.r-project.org/web/packages/DiagnosisMed/index.html說`Package' DiagnosisMed「已從CRAN存儲庫中刪除。」 – arun 2016-07-20 20:58:27

我也很抱歉。 – 2016-07-21 06:28:05

除了Erik的響應線，你也應該能夠直接從pos.scores和NEG比較所有可能的對數值計算的ROC。得分：

score.pairs <- merge(pos.scores, neg.scores) 
names(score.pairs) <- c("pos.score", "neg.score") 
sum(score.pairs$pos.score > score.pairs$neg.score)/nrow(score.pairs)

肯定比樣品的方法效率較低或PROC :: AUC，但比以前更穩定，需要比後者少安裝。

相關：當我嘗試這個時，它給出了與pROC的值類似的結果，但不完全相同（關閉0.02左右）;結果更接近樣本方法，N很高。如果有人有想法，爲什麼我可能會感興趣。

來源

2013-01-15 14:10:55

無需任何額外的軟件包：

true_Y = c(1,1,1,1,2,1,2,1,2,2) 
probs = c(1,0.999,0.999,0.973,0.568,0.421,0.382,0.377,0.146,0.11) 

getROC_AUC = function(probs, true_Y){ 
    probsSort = sort(probs, decreasing = TRUE, index.return = TRUE) 
    val = unlist(probsSort$x) 
    idx = unlist(probsSort$ix) 

    roc_y = true_Y[idx]; 
    stack_x = cumsum(roc_y == 2)/sum(roc_y == 2) 
    stack_y = cumsum(roc_y == 1)/sum(roc_y == 1)  

    auc = sum((stack_x[2:length(roc_y)]-stack_x[1:length(roc_y)-1])*stack_y[2:length(roc_y)]) 
    return(list(stack_x=stack_x, stack_y=stack_y, auc=auc)) 
} 

aList = getROC_AUC(probs, true_Y) 

stack_x = unlist(aList$stack_x) 
stack_y = unlist(aList$stack_y) 
auc = unlist(aList$auc) 

plot(stack_x, stack_y, type = "l", col = "blue", xlab = "False Positive Rate", ylab = "True Positive Rate", main = "ROC") 
axis(1, seq(0.0,1.0,0.1)) 
axis(2, seq(0.0,1.0,0.1)) 
abline(h=seq(0.0,1.0,0.1), v=seq(0.0,1.0,0.1), col="gray", lty=3) 
legend(0.7, 0.3, sprintf("%3.3f",auc), lty=c(1,1), lwd=c(2.5,2.5), col="blue", title = "AUC")

enter image description here

來源

2013-09-28 21:01:36 AGS

從ISL 9.6.3 ROC Curves代碼相結合，與@J一起。 Won。對這個問題和其他幾個地方的回答，下面繪製了ROC曲線，並在曲線右下角打印了AUC。

probs以下是二元分類的預測概率的數值向量，test$label包含測試數據的真實標籤。

require(ROCR) 
require(pROC) 

rocplot <- function(pred, truth, ...) { 
    predob = prediction(pred, truth) 
    perf = performance(predob, "tpr", "fpr") 
    plot(perf, ...) 
    area <- auc(truth, pred) 
    area <- format(round(area, 4), nsmall = 4) 
    text(x=0.8, y=0.1, labels = paste("AUC =", area)) 

    # the reference x=y line 
    segments(x0=0, y0=0, x1=1, y1=1, col="gray", lty=2) 
} 

rocplot(probs, test$label, col="blue")

這給了這樣一個情節：

來源

2016-07-20 21:37:13 arun

我找到了一些解決方案在這裏的是緩慢的和/或混亂（和他們中的一些不正確處理的關係），所以我在我的R包mltools中編寫了我自己的基於data.table的函數auc_roc()。

library(data.table) 
library(mltools) 

preds <- c(.1, .3, .3, .9) 
actuals <- c(0, 0, 1, 1) 

auc_roc(preds, actuals) # 0.875 

auc_roc(preds, actuals, returnDT=TRUE) 
    Pred CountFalse CountTrue CumulativeFPR CumulativeTPR AdditionalArea CumulativeArea 
1: 0.9   0   1   0.0   0.5   0.000   0.000 
2: 0.3   1   1   0.5   1.0   0.375   0.375 
3: 0.1   1   0   1.0   1.0   0.500   0.875

來源

2016-09-19 02:13:56 Ben

當前最高票答案是不正確的，因爲它無視關係。當正面和負面分數相等時，AUC應該是0.5。下面是更正的例子。

computeAUC <- function(pos.scores, neg.scores, n_sample=100000) { 
    # Args: 
    # pos.scores: scores of positive observations 
    # neg.scores: scores of negative observations 
    # n_samples : number of samples to approximate AUC 

    pos.sample <- sample(pos.scores, n_sample, replace=T) 
    neg.sample <- sample(neg.scores, n_sample, replace=T) 
    mean(1.0*(pos.sample > neg.sample) + 0.5*(pos.sample==neg.sample)) 
}

來源

2017-01-04 07:45:04

計算R中的AUC？

回答

相關問題