The Receiver Operating Characteristic (ROC) Curve
EPP 245/298 Statistical Analysis of Laboratory Data
November 10, 2004 EPP 245 Statistical Analysis of Laboratory Data
Binary Classification
• Suppose we have two groups for which each case is a member of one or the other, and that we know the correct classification (“truth”).
• Suppose we have a prediction method that produces a single numerical value, and that small values of that number suggest membership in group 1 and large values suggest membership in group 2.
• If we pick a cutpoint t, we can assign any case with a predicted value ≤ t to group 1 and the others to group 2.
• For that value of t, we can compute the number correctly assigned to group 2 and the number incorrectly assigned to group 2 (true positives and false positives).
• For t small enough, all will be assigned to group 2 and for t large enough all will be assigned to group 1.
• The ROC curve is a plot of the number of true positives against the number of false positives as the cutpoint t varies.
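The counting step for a single cutpoint can be sketched in a few lines of R. This is only an illustration: the eight cases, the cutpoint value, and the variable names (`cutpt`, `assigned2`) are made up for the example and are not from the course data.

```r
# Hypothetical toy data: truth 0 = group 1, truth 1 = group 2
truth <- c(0, 0, 0, 1, 0, 1, 1, 1)
pred  <- c(8.9, 9.6, 10.2, 10.8, 11.1, 11.7, 12.3, 13.0)

cutpt <- 11                # cases with pred <= cutpt go to group 1
assigned2 <- pred > cutpt  # the others go to group 2

tp <- sum(assigned2 & truth == 1)  # correctly assigned to group 2
fp <- sum(assigned2 & truth == 0)  # incorrectly assigned to group 2
c(tp = tp, fp = fp)
```

Four cases lie above the cutpoint; three of them truly belong to group 2 and one does not, so this cutpoint contributes the point (fp = 1, tp = 3) to the curve.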
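    # Simulate 50 cases per group: group 1 (truth = 0) ~ N(10, 1), group 2 (truth = 1) ~ N(12, 1)
    datagen <- function(){
      truth <- rep(0:1,each=50)
      pred <- c(rnorm(50,10,1),rnorm(50,12,1))
      return(data.frame(truth=truth,pred=pred))
    }
    # Plot the density of pred in each group, with candidate cutpoints at 10, 11 and 12.
    # Uses truth and pred from the attached data frame; assumes the truth = 0 cases come first.
    plot1 <- function(){
      nz <- sum(truth==0)
      n <- length(truth)
      plot(density(pred[1:nz]),lwd=2,xlim=c(6,18), main="Generating an ROC Curve")
      lines(density(pred[(nz+1):n]),col=2,lwd=2)
      abline(v=10,col=4,lwd=2)
      abline(v=11,col=4,lwd=2)
      abline(v=12,col=4,lwd=2)
    }
    -----------------------------------------
    > source("rocsim.r")
    > roc.data <- datagen()
    > attach(roc.data)
    > plot1()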
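As a rough guide to what the three marked cutpoints imply, the expected true and false positive counts can be computed directly from the two normal distributions used in `datagen`. This is a sketch under the simulation's own assumptions (50 cases per group, means 10 and 12, standard deviation 1); the names `cutpts`, `exp.tp`, and `exp.fp` are introduced here for illustration only.

```r
cutpts <- c(10, 11, 12)  # the three cutpoints marked in the density plot
n.per.group <- 50

# P(pred > cutpoint) in each group, times the group size
exp.tp <- n.per.group * pnorm(cutpts, mean = 12, sd = 1, lower.tail = FALSE)
exp.fp <- n.per.group * pnorm(cutpts, mean = 10, sd = 1, lower.tail = FALSE)

round(data.frame(cutpt = cutpts, exp.tp = exp.tp, exp.fp = exp.fp), 1)
```

Raising the cutpoint from 10 to 12 cuts the expected false positives from 25 to about 1, at the cost of dropping the expected true positives from about 49 to 25. Any one simulated data set will scatter around these values.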
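    # Plot true positive counts against false positive counts over all cutpoints.
    # truth: 0/1 group labels; pred: predicted values; maxx: upper limit of the x-axis.
    roc.curve <- function(truth,pred,maxx){
      ntp <- sum(truth==1)
      ntn <- sum(truth==0)
      n <- length(truth)
      preds <- sort(unique(pred))
      npred <- length(preds)
      tp <- vector("numeric",npred+1)
      fp <- tp
      # cutpoint below every prediction: all cases are assigned to group 2
      fp[1] <- ntn
      tp[1] <- ntp
      for (i in 1:npred) {
        cutpt <- preds[i]
        # pred <= cutpt goes to group 1, so only pred > cutpt counts toward group 2
        tp[i+1] <- sum((pred > cutpt)&(truth==1))
        fp[i+1] <- sum((pred > cutpt)&(truth==0))
      }
      plot(fp,tp, type="l",lwd=2,xlim=c(0,maxx))
      title("ROC Curve")
    }
    ----------------------------------------
    > roc.curve(truth,pred,50)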
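To inspect the points of the curve rather than only plot them, the same kind of counts can be returned as a data frame. The sketch below is not part of `rocsim.r`: the function name `roc.points` and the six-case data set are hypothetical, and it assumes the rule stated earlier, that a case with predicted value ≤ t goes to group 1 and the others go to group 2.

```r
# Hypothetical helper: (fp, tp) counts at every distinct cutpoint
roc.points <- function(truth, pred) {
  # -Inf is the cutpoint that assigns every case to group 2
  cutpts <- c(-Inf, sort(unique(pred)))
  tp <- sapply(cutpts, function(ct) sum(pred > ct & truth == 1))
  fp <- sapply(cutpts, function(ct) sum(pred > ct & truth == 0))
  data.frame(cutpt = cutpts, tp = tp, fp = fp)
}

# Made-up example with three cases per group
truth <- c(0, 0, 1, 0, 1, 1)
pred  <- c(9.1, 9.8, 10.4, 11.0, 11.6, 12.5)
pts <- roc.points(truth, pred)
pts
```

The first row is (fp = 3, tp = 3), where everything is assigned to group 2, and the last row is (fp = 0, tp = 0), where everything is assigned to group 1. Both counts can only stay the same or decrease as the cutpoint rises.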
    datagen2 <- function(){
      truth <- rep(0:1,c(990,10))
      pred <- c(rnorm(990,10,1),rnorm(10,12,1))
      return(data.frame(truth=truth,pred=pred))
    }
    --------------------------------------
    > detach(roc.data)
    > roc.data2 <- datagen2()
    > attach(roc.data2)
    > roc.curve(truth,pred,40)
ROC Curve for Rare Outcome
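One way to see why the rare-outcome case looks different is to compute the expected counts under the `datagen2` model (990 cases from N(10, 1) with truth = 0, 10 cases from N(12, 1) with truth = 1). This is a sketch only; the cutpoints 11, 12, and 13 and the names `exp.tp` and `exp.fp` are chosen here for illustration.

```r
cutpts <- c(11, 12, 13)  # illustrative cutpoints

# Expected counts assigned to group 2 (pred > cutpoint) from each true group
exp.tp <- 10  * pnorm(cutpts, mean = 12, sd = 1, lower.tail = FALSE)
exp.fp <- 990 * pnorm(cutpts, mean = 10, sd = 1, lower.tail = FALSE)

round(data.frame(cutpt = cutpts, exp.tp = exp.tp, exp.fp = exp.fp), 1)
```

At a cutpoint of 12, about 5 of the 10 true group 2 cases are expected to be found, alongside roughly 22.5 false positives. The distributions are the same as in the balanced simulation, but because group 1 is 99 times larger, the false positives outnumber the true positives at most useful cutpoints.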