How does a ROC curve work?

Overview

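  • ROC (Receiver Operating Characteristic) is a probability curve: it is drawn from the model's predicted probabilities as the decision threshold varies.

  • AUC (Area Under the Curve) measures separability, that is, how well the model distinguishes the positive class from the negative class.

  • AUC-ROC is a performance measurement for classification problems across various threshold settings.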

  • Range: [0, 1]

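    • Best: 1 (the classes are separated perfectly)

    • Worst: 0.5 (no better than random guessing)

    • Inverse: 0 (the model ranks every negative above every positive, so flipping its predictions gives a perfect classifier)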

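A ROC curve plots the true positive rate (TPR) on the y-axis versus the false positive rate (FPR) on the x-axis, with one point per threshold. The TPR is the recall (sensitivity), TP / (TP + FN). The FPR is the probability of a false alarm, FP / (FP + TN), which equals 1 - specificity.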
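As a minimal sketch of how one point on the curve is obtained, the snippet below counts the confusion-matrix cells at a single threshold. The function name `tpr_fpr` and the toy labels and scores are assumptions made for illustration, not part of any library.

```python
def tpr_fpr(y_true, scores, threshold):
    """Return (TPR, FPR) when scores >= threshold are predicted positive."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if label == 1 and predicted_positive:
            tp += 1  # true positive
        elif label == 1:
            fn += 1  # false negative (type 2 error)
        elif predicted_positive:
            fp += 1  # false positive (type 1 error)
        else:
            tn += 1  # true negative
    tpr = tp / (tp + fn)  # recall / sensitivity
    fpr = fp / (fp + tn)  # false alarm rate = 1 - specificity
    return tpr, fpr


# Toy data (illustrative only): 1 = positive class, 0 = negative class.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

print(tpr_fpr(y_true, scores, 0.5))  # (0.5, 0.0)
print(tpr_fpr(y_true, scores, 0.3))  # (1.0, 0.5)
```

Each threshold gives one (FPR, TPR) pair; plotting the pairs for all thresholds traces the ROC curve.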

Understanding the probability curves

When the predicted-probability distributions of the positive and negative classes overlap, any threshold produces type 1 errors (false positives) and type 2 errors (false negatives). Moving the threshold trades one kind of error for the other. A threshold of 0.5 is the usual default and implicitly gives equal weight to the sensitivity and the specificity of the model.

When we decrease the threshold, more samples are predicted positive, which increases the recall and decreases the specificity. Similarly, when we increase the threshold, more samples are predicted negative, which gives higher specificity and lower recall.
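The trade-off can be checked with a small sweep over thresholds. This is a sketch on made-up data; `recall_specificity` is a hypothetical helper written for this example.

```python
def recall_specificity(y_true, scores, threshold):
    """Recall and specificity when scores >= threshold are predicted positive."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    recall = sum(s >= threshold for s in pos) / len(pos)
    specificity = sum(s < threshold for s in neg) / len(neg)
    return recall, specificity


# Toy data (illustrative only): the two score distributions overlap.
y_true = [0, 0, 0, 1, 1, 1]
scores = [0.2, 0.4, 0.6, 0.5, 0.7, 0.9]

for t in (0.3, 0.55, 0.8):
    recall, specificity = recall_specificity(y_true, scores, t)
    print(f"threshold={t:.2f}  recall={recall:.2f}  specificity={specificity:.2f}")
```

On this data the recall falls and the specificity rises as the threshold goes from 0.3 to 0.8.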

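In the ideal situation, the distribution curves of the positive and negative classes do not overlap at all, so some threshold separates them perfectly and the AUC is 1. In the worst case the two distributions are identical: the model cannot tell the classes apart and the AUC is 0.5.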

Finally, we can quantify a model’s ROC curve by calculating the total Area Under the Curve (AUC).
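One common way to compute the area is the trapezoidal rule over the ROC points. The sketch below assumes binary labels (1 = positive) and uses two hypothetical helpers, `roc_points` and `auc`; it favors readability over speed.

```python
def roc_points(y_true, scores):
    """ROC points (FPR, TPR), from the strictest threshold to the loosest."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= t)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data (illustrative only).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(roc_points(y_true, scores)))  # 0.75

# Perfectly separated scores give 1.0; perfectly reversed scores give 0.0.
print(auc(roc_points([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])))  # 1.0
print(auc(roc_points([1, 1, 0, 0], [0.1, 0.2, 0.8, 0.9])))  # 0.0
```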

To make this clear:

  • Smaller values on the x-axis of the plot indicate fewer false positives and more true negatives.

  • Larger values on the y-axis of the plot indicate more true positives and fewer false negatives.

Dealing with multiclass models

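Using the One-vs-All (also called One-vs-Rest) methodology, we can plot N AUC-ROC curves for the given N classes: one curve per class, each treating that class as positive and all the other classes as negative.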
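A minimal sketch of the One-vs-All idea follows. It assumes the model outputs one score per class for each sample. The names `binary_auc` and `one_vs_rest_auc` and the data are hypothetical. Here the AUC is computed from its ranking interpretation: the probability that a random positive scores higher than a random negative, with ties counted as one half.

```python
def binary_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(labels, score_rows, classes):
    """One AUC per class: that class is positive, all other classes negative."""
    result = {}
    for k, cls in enumerate(classes):
        binary = [1 if label == cls else 0 for label in labels]
        cls_scores = [row[k] for row in score_rows]  # column k = score for cls
        result[cls] = binary_auc(binary, cls_scores)
    return result


# Toy data (illustrative only): each row holds the scores for cat, dog, bird.
classes = ["cat", "dog", "bird"]
labels = ["cat", "dog", "bird", "cat", "dog", "bird"]
score_rows = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.2, 0.6],
    [0.4, 0.5, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
]

print(one_vs_rest_auc(labels, score_rows, classes))
# {'cat': 1.0, 'dog': 0.9375, 'bird': 1.0}
```

The N per-class values can be reported separately or averaged into a single score, depending on what the evaluation needs.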
