Metrics

In regression


In classification

The confusion matrix

A confusion matrix is a performance measurement for machine learning classification where the output can be two or more classes. Its cells for the binary case are listed below, followed by a small counting sketch.

  • True positive

    • Predicted 1 ⇒ Actual 1

  • False positive

    • Predicted 1 ⇒ Actual 0

    • This is a Type I error

  • False negative

    • Predicted 0 ⇒ Actual 1

    • This is a Type II error

  • True negative

    • Predicted 0 ⇒ Actual 0
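As a minimal sketch (plain Python, assuming binary 0/1 labels; the helper name `confusion_counts` is illustrative, not from the source), the four cells can be counted directly from paired ground-truth labels and predictions:

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # Type I error
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # Type II error
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

# Toy example with 6 samples
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(confusion_counts(y_true, y_pred))  # (2, 1, 1, 2)
```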

Cost matrix

A cost matrix (error matrix) is useful when specific classification errors are more severe than others. The classification algorithm then tries to avoid errors with a high cost weight; the trade-off of avoiding 'expensive' classification errors is an increased number of 'cheap' ones.
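As a hedged illustration (the cost values below are made up, not from the source), the total cost of a classifier weights each (actual, predicted) pair by its entry in the cost matrix, so an 'expensive' false negative contributes more than a 'cheap' false positive:

```python
# Hypothetical cost matrix: keys are (actual, predicted) pairs.
# Correct predictions cost 0; a false negative is assumed 5x as costly as a false positive.
COST = {
    ("pos", "pos"): 0.0, ("pos", "neg"): 5.0,   # false negative: expensive
    ("neg", "pos"): 1.0, ("neg", "neg"): 0.0,   # false positive: cheap
}

def total_cost(y_true, y_pred):
    """Sum the cost of every (actual, predicted) pair."""
    return sum(COST[(t, p)] for t, p in zip(y_true, y_pred))

y_true = ["pos", "neg", "pos", "neg"]
y_pred = ["pos", "pos", "neg", "neg"]
print(total_cost(y_true, y_pred))  # 1.0 (one FP) + 5.0 (one FN) = 6.0
```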

Accuracy

From all samples, it measures the rate of correct classifications.

\text{Accuracy} = \frac{\text{TP} + \text{TN}}{\text{Total}}
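A one-line sketch of the same ratio, using the counts from the hypothetical example above:

```python
def accuracy(tp, fp, fn, tn):
    """Correct predictions (TP + TN) over all predictions."""
    return (tp + tn) / (tp + fp + fn + tn)

print(accuracy(tp=2, fp=1, fn=1, tn=2))  # 4/6 ≈ 0.667
```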

Precision

From all the predicted positives, it measures the rate that is actually positive.

\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}

Precision is the metric to prioritize when the goal is minimizing false positives.

Recall

From all the actual positives, it measures the rate classified as positive. It is also called Sensitivity or True Positive Rate (TPR).

\text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}

Recall is the metric to prioritize when the goal is minimizing false negatives.
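A minimal sketch of both ratios from the four counts (function names and example counts are illustrative, not from the source):

```python
def precision(tp, fp):
    """Of everything predicted positive, the fraction that is actually positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Of everything actually positive, the fraction that was found (TPR / sensitivity)."""
    return tp / (tp + fn) if (tp + fn) else 0.0

print(precision(tp=2, fp=1))  # 2/3 ≈ 0.667
print(recall(tp=2, fn=1))     # 2/3 ≈ 0.667
```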

Specificity

From all the actual negative values, it measures the rate classified as negative.

\text{Specificity} = \frac{\text{TN}}{\text{TN} + \text{FP}}

False Positive Rate (FPR)

From all the actual negative values, it measures the rate misclassified as positive. It is the complement of specificity.

\text{FPR} = 1 - \text{Specificity} = \frac{\text{FP}}{\text{TN} + \text{FP}}
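The same pair of formulas as a sketch (again with illustrative counts), showing that FPR is just the complement of specificity over the actual negatives:

```python
def specificity(tn, fp):
    """Of everything actually negative, the fraction correctly predicted negative."""
    return tn / (tn + fp) if (tn + fp) else 0.0

def fpr(tn, fp):
    """Of everything actually negative, the fraction wrongly predicted positive (1 - specificity)."""
    return fp / (tn + fp) if (tn + fp) else 0.0

print(specificity(tn=2, fp=1))  # 2/3 ≈ 0.667
print(fpr(tn=2, fp=1))          # 1/3 ≈ 0.333
```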

F-Score

The F1-score might be a better measure to use when we need a balance between Precision and Recall and there is an uneven class distribution (a large number of actual negatives).

\text{F1} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
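A short sketch of the harmonic mean, fed with the precision and recall values from the hypothetical example above:

```python
def f1_score(precision_value, recall_value):
    """Harmonic mean of precision and recall."""
    if precision_value + recall_value == 0:
        return 0.0
    return 2 * precision_value * recall_value / (precision_value + recall_value)

print(f1_score(2 / 3, 2 / 3))  # 0.667: when precision equals recall, F1 equals that same value
```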
