The false positive rate (FPR) is the fraction of actual negatives that the classifier wrongly flags as positive:

$$\mathrm{FPR} = \frac{FP}{FP + TN}$$
The denominator is the total number of actual negatives — those the classifier correctly classified (TN) plus those it incorrectly flagged (FP).
FPR is the complement of specificity: $\mathrm{FPR} = 1 - \text{specificity}$. The two carry the same information; FPR is the more common form when discussing classifier behaviour at different thresholds.
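As a minimal sketch of the definition, the snippet below counts the four confusion-matrix cells from raw labels and derives FPR and specificity from them. The labels and the helper name `confusion_counts` are made up for illustration; 1 is assumed to be the positive class.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels where 1 is the positive class."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return tn, fp, fn, tp


# Illustrative data: 6 actual negatives followed by 4 actual positives.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

tn, fp, fn, tp = confusion_counts(y_true, y_pred)

fpr = fp / (fp + tn)          # share of actual negatives flagged as positive
specificity = tn / (tn + fp)  # share of actual negatives correctly cleared

print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"FPR = {fpr:.3f}, specificity = {specificity:.3f}")
```

Both quantities share the same denominator (the actual negatives), which is why they always sum to 1.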
FPR shows up most prominently as the x-axis of the ROC curve. The ROC curve plots FPR (x) against TPR (recall, y) as the classification threshold is varied. The curve runs from $(0, 0)$, at a threshold above every score (nothing predicted positive), to $(1, 1)$, at a threshold below every score (everything predicted positive). A perfect classifier hugs the top-left corner: high TPR with FPR near zero.
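The two endpoints can be checked by hand. The sketch below assumes the common convention that a sample is predicted positive when its score is greater than or equal to the threshold; the helper `roc_point` and the four-sample dataset are illustrative only.

```python
import math


def roc_point(y_true, y_scores, threshold):
    """Return (FPR, TPR) when predicting positive for score >= threshold."""
    fp = sum(1 for t, s in zip(y_true, y_scores) if t == 0 and s >= threshold)
    tp = sum(1 for t, s in zip(y_true, y_scores) if t == 1 and s >= threshold)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    return fp / negatives, tp / positives


y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]

start = roc_point(y_true, y_scores, math.inf)    # nothing predicted positive
middle = roc_point(y_true, y_scores, 0.5)        # only the 0.8 score is flagged
end = roc_point(y_true, y_scores, -math.inf)     # everything predicted positive

print("start :", start)
print("middle:", middle)
print("end   :", end)
```

Every other threshold lands somewhere between the two corners, and the set of those points is the ROC curve.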
Why we plot FPR rather than specificity
The choice of FPR vs. specificity on the ROC axis is conventional. FPR has the advantage that both axes (FPR and TPR) move in the same direction: as the threshold lowers, each one rises or stays flat, so the curve only ever moves up and to the right. Plotting specificity on the x-axis would invert the direction (specificity falls as the threshold lowers), which is less natural for tracing a curve.
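A small sweep makes the direction argument concrete. This is a sketch on made-up scores, with a hypothetical helper `rates`; it lowers the threshold step by step and records FPR, TPR, and specificity at each step.

```python
def rates(y_true, y_scores, threshold):
    """Return (FPR, TPR) when predicting positive for score >= threshold."""
    fp = sum(1 for t, s in zip(y_true, y_scores) if t == 0 and s >= threshold)
    tp = sum(1 for t, s in zip(y_true, y_scores) if t == 1 and s >= threshold)
    return fp / y_true.count(0), tp / y_true.count(1)


# Illustrative data: 4 negatives and 4 positives with overlapping scores.
y_true = [0, 1, 0, 1, 1, 0, 1, 0]
y_scores = [0.05, 0.9, 0.3, 0.6, 0.45, 0.55, 0.8, 0.2]

# Walk the threshold from the highest score down to the lowest.
thresholds = sorted(set(y_scores), reverse=True)
points = [rates(y_true, y_scores, th) for th in thresholds]

fprs = [fpr for fpr, _ in points]
tprs = [tpr for _, tpr in points]
specificities = [1 - fpr for fpr in fprs]

for th, fpr, tpr, spec in zip(thresholds, fprs, tprs, specificities):
    print(f"threshold={th:.2f}  FPR={fpr:.2f}  TPR={tpr:.2f}  specificity={spec:.2f}")
```

In the printed table the FPR and TPR columns never decrease from one row to the next, while the specificity column never increases.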
Tradeoff at thresholds
Sweeping the threshold trades FPR against TPR:
- Lower threshold (eager to predict positive): more positives caught (TPR up) at the cost of more false alarms (FPR up).
- Higher threshold (conservative): fewer false alarms (FPR down) at the cost of missing some positives (TPR down).
Different applications pick different operating points on this curve. A spam filter operates at low FPR (don’t classify legitimate email as spam). A cancer screen operates at high TPR (don’t miss any cancers) even at higher FPR (more follow-up tests are acceptable).
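One way to turn such a requirement into a threshold is to sweep all candidate thresholds and filter by the constraint that matters. The sketch below is only one possible selection rule, on made-up data; the helper names `sweep`, `best_tpr_under_fpr`, and `lowest_fpr_with_tpr` are hypothetical.

```python
def sweep(y_true, y_scores):
    """Return a list of (threshold, fpr, tpr), highest threshold first."""
    negatives = y_true.count(0)
    positives = y_true.count(1)
    rows = []
    for th in sorted(set(y_scores), reverse=True):
        fp = sum(1 for t, s in zip(y_true, y_scores) if t == 0 and s >= th)
        tp = sum(1 for t, s in zip(y_true, y_scores) if t == 1 and s >= th)
        rows.append((th, fp / negatives, tp / positives))
    return rows


def best_tpr_under_fpr(rows, max_fpr):
    """Spam-filter style: highest TPR among points with FPR <= max_fpr."""
    allowed = [r for r in rows if r[1] <= max_fpr]
    return max(allowed, key=lambda r: r[2]) if allowed else None


def lowest_fpr_with_tpr(rows, min_tpr):
    """Screening style: lowest FPR among points with TPR >= min_tpr."""
    allowed = [r for r in rows if r[2] >= min_tpr]
    return min(allowed, key=lambda r: r[1]) if allowed else None


# Illustrative data: 4 negatives and 4 positives with overlapping scores.
y_true = [0, 1, 0, 1, 1, 0, 1, 0]
y_scores = [0.05, 0.9, 0.3, 0.6, 0.45, 0.55, 0.8, 0.2]

rows = sweep(y_true, y_scores)
spam_point = best_tpr_under_fpr(rows, max_fpr=0.0)     # tolerate no false alarms
screen_point = lowest_fpr_with_tpr(rows, min_tpr=1.0)  # tolerate no misses

print("spam-filter operating point (threshold, FPR, TPR):", spam_point)
print("screening operating point   (threshold, FPR, TPR):", screen_point)
```

On this toy data the two rules pick different thresholds: the strict-FPR rule gives up a quarter of the positives, and the strict-TPR rule accepts a quarter of the negatives as false alarms.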
In scikit-learn
FPR comes out of roc_curve along with TPR and thresholds:
from sklearn.metrics import roc_curve

fpr, tpr, thresholds = roc_curve(y_test, y_scores)

The arrays are paired: fpr[i] and tpr[i] are the values at thresholds[i]. Plotting tpr against fpr gives the ROC curve.
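The pairing can be verified on a tiny example. The sketch below uses made-up labels and scores, and assumes NumPy and scikit-learn are installed; it recomputes FPR by hand at each returned threshold. Index 0 is skipped because `roc_curve` prepends a synthetic threshold that stands for "nothing predicted positive", and its exact value has differed between scikit-learn versions.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Illustrative data: two negatives, two positives.
y_test = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])

fpr, tpr, thresholds = roc_curve(y_test, y_scores)

manual_fprs = []
for i in range(1, len(thresholds)):
    # A sample is predicted positive when its score reaches the threshold.
    y_pred = y_scores >= thresholds[i]
    fp = np.sum(y_pred & (y_test == 0))
    tn = np.sum(~y_pred & (y_test == 0))
    manual_fprs.append(fp / (fp + tn))
    print(f"threshold={thresholds[i]:.2f}  fpr={fpr[i]:.2f}  tpr={tpr[i]:.2f}")
```

By default `roc_curve` is called with `drop_intermediate=True`, which may omit thresholds that do not change the shape of the curve, so on larger datasets the returned arrays can be shorter than the number of distinct scores.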