Metrics


Sources:

  • Understanding confusion matrix (Sarang Narkhede)
  • Accuracy, Precision, Recall or F1? (Koo Ping Shung)
  • Cost Matrix (IBM Knowledge Center)

In regression

TODO

In classification

The confusion matrix

A confusion matrix is a performance measurement for machine learning classification where the output can be two or more classes. For a binary problem there are four cases:

  • True positive

    • Predicted 1 ⇒ Actual 1

  • False positive

    • Predicted 1 ⇒ Actual 0

    • This is a Type-1 Error

  • False negative

    • Predicted 0 ⇒ Actual 1

    • This is a Type-2 Error

  • True negative

    • Predicted 0 ⇒ Actual 0
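
These four counts can be read directly from scikit-learn's confusion_matrix, whose rows are actual classes and columns are predicted classes. A minimal sketch with made-up binary labels:

from sklearn.metrics import confusion_matrix

y_true = [0, 1, 0, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

# For binary labels, ravel() yields the counts in the order TN, FP, FN, TP
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)  # 2 1 1 2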

Cost matrix

A cost matrix (error matrix) is also useful when specific classification errors are more severe than others. The classifier then tries to avoid classification errors with a high error weight; the trade-off of avoiding 'expensive' classification errors is an increased number of 'cheap' ones.
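
A minimal sketch of applying one by hand, weighting the confusion matrix entry-wise; the 10x false-negative penalty below is an assumed value for illustration:

import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [0, 1, 0, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

# cost[actual][predicted]: correct predictions cost 0; a FN is
# assumed to be 10x as expensive as a FP (illustrative weights)
cost = np.array([[0, 1],
                 [10, 0]])

cm = confusion_matrix(y_true, y_pred)  # rows: actual, columns: predicted
total_cost = (cm * cost).sum()
print(total_cost)  # 1 FP * 1 + 1 FN * 10 = 11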

Accuracy

Out of all samples, it measures the rate of correct classifications.

$$\text{Accuracy} = \frac{\text{TP} + \text{TN}}{\text{Total}}$$

from sklearn.metrics import accuracy_score

y_pred = [0, 2, 1, 3]
y_true = [0, 1, 2, 3]
accuracy_score(y_true, y_pred)  # 0.5

Precision

Out of all predicted positives, it measures the rate that are actually positive.

It is a good measure when the cost of a FP is high. For instance, email spam detection: a legitimate email wrongly flagged as spam is expensive.

$$\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}$$

It may be computed using scikit-learn:

from sklearn.metrics import precision_score

precision_score(y_true, y_pred, average='weighted')

Recall

Out of all actual positives, it measures the rate classified as positive. It is also called Sensitivity or True Positive Rate (TPR).

It is a good metric to select our best model when there is a high cost associated with a FN. For instance, fraud detection or sick-patient detection, where missing a positive case is costly.

$$\text{Recall} = \text{TPR} = \text{Sensitivity} = \frac{\text{TP}}{\text{TP} + \text{FN}}$$

It may be computed using scikit-learn:

from sklearn.metrics import recall_score

recall_score(y_true, y_pred, average='weighted')

Minimizing False Positives

Specificity

Out of all actual negatives, it measures the rate classified as negative.

$$\text{Specificity} = \frac{\text{TN}}{\text{TN} + \text{FP}}$$

False Positive Rate (FPR)

Out of all actual negatives, it measures the rate misclassified as positive. It is the negated probability of specificity.

$$\text{FPR} = 1 - \text{Specificity} = \frac{\text{FP}}{\text{TN} + \text{FP}}$$
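
scikit-learn does not ship a dedicated specificity scorer, so a common sketch is to derive both values from the confusion-matrix counts (binary labels assumed):

from sklearn.metrics import confusion_matrix

y_true = [0, 1, 0, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
specificity = tn / (tn + fp)  # actual negatives classified as negative
fpr = fp / (tn + fp)          # equivalently: 1 - specificity
print(specificity, fpr)       # 0.666... 0.333...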

F-Score

$$\text{F1} = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$

F1-score might be a better measure to use if we need to seek a balance between Precision and Recall AND there is an uneven class distribution (large number of actual negatives).

It may be computed using scikit-learn:

from sklearn.metrics import f1_score

f1_score(y_true, y_pred, average='weighted')
