The iron ML notebook
  • The iron data science notebook
  • ML & Data Science
    • Frequent Questions
      • Discriminative vs Generative models
      • Supervised vs Unsupervised learning
      • Batch vs Online Learning
      • Instance-based vs Model-based Learning
      • Bias-Variance Tradeoff
      • Probability vs Likelihood
      • Covariance vs Correlation Matrix
      • Precision vs Recall
      • How does a ROC curve work?
      • Ridge vs Lasso
      • Anomaly detection methods
      • How to deal with imbalanced datasets?
      • What is "Statistically Significant"?
      • Recommendation systems methods
    • Statistics
      • The basics
      • Distributions
      • Sampling
      • IQR
      • Z-score
      • F-statistic
      • Outliers
      • The bayesian basis
      • Statistic vs Parameter
      • Markov Monte Carlo Chain
    • ML Techniques
      • Pre-process
        • PCA
      • Loss functions
      • Regularization
      • Optimization
      • Metrics
        • Distance measures
      • Activation Functions
      • Selection functions
      • Feature Normalization
      • Cross-validation
      • Hyperparameter tuning
      • Ensemble methods
      • Hard negative mining
      • ML Serving
        • Quantization
        • Kernel Auto-Tuning
        • NVIDIA TensorRT vs ONNX Runtime
    • Machine Learning Algorithms
      • Supervised Learning
        • Support Vector Machines
        • Adaptative boosting
        • Gradient boosting
        • Regression algorithms
          • Linear Regression
          • Lasso regression
          • Multi Layer Perceptron
        • Classification algorithms
          • Perceptron
          • Logistic Regression
          • Multilayer Perceptron
          • kNN
          • Naive Bayes
          • Decision Trees
          • Random Forest
          • Gradient Boosted Trees
      • Unsupervised learning
        • Clustering
          • Clustering metrics
          • kMeans
          • Gaussian Mixture Model
          • Hierarchical clustering
          • DBSCAN
      • Cameras
        • Intrinsic and extrinsic parameters
    • Computer Vision
      • Object Detection
        • Two-Stage detectors
          • Traditional Detection Models
          • R-CNN
          • Fast R-CNN
          • Faster R-CNN
        • One-Stage detectors
          • YOLO
          • YOLO v2
          • YOLO v3
          • YOLOX
        • Techniques
          • NMS
          • ROI Pooling
        • Metrics
          • Objectness Score
          • Coco Metrics
          • IoU
      • MOT
        • SORT
        • Deep SORT
  • Related Topics
    • Intro
    • Python
      • Global Interpreter Lock (GIL)
      • Mutability
      • AsyncIO
    • SQL
    • Combinatorics
    • Data Engineering Questions
    • Distributed computation
      • About threads & processes
      • REST vs gRPC
  • Algorithms & data structures
    • Array
      • Online Stock Span
      • Two Sum
      • Best time to by and sell stock
      • Rank word combination
      • Largest subarray with zero sum
    • Binary
      • Sum of Two Integers
    • Tree
      • Maximum Depth of Binary Tree
      • Same Tree
      • Invert/Flip Binary Tree
      • Binary Tree Paths
      • Binary Tree Maximum Path Sum
    • Matrix
      • Set Matrix Zeroes
    • Linked List
      • Reverse Linked List
      • Detect Cycle
      • Merge Two Sorted Lists
      • Merge k Sorted Lists
    • String
      • Longest Substring Without Repeating Characters
      • Longest Repeating Character Replacement
      • Minimum Window Substring
    • Interval
    • Graph
    • Heap
    • Dynamic Programming
      • Fibonacci
      • Grid Traveler
      • Can Sum
      • How Sum
      • Best Sum
      • Can Construct
      • Count Construct
      • All Construct
      • Climbing Stairs
Powered by GitBook
On this page

Was this helpful?

  1. ML & Data Science
  2. Computer Vision
  3. Object Detection
  4. Two-Stage detectors

Faster R-CNN

PreviousFast R-CNNNextOne-Stage detectors

Last updated 8 months ago

Was this helpful?

Main contributions with respect to Fast R-CNN,

  1. RPN or Region Proposal Network: NN to generate proposals for different anchor boxes.

  2. Uses anchor boxes (combination of scale and aspect ratio)

  3. CNN weights are shared by the RPN and the Fast R-CNN.

The Faster R-CNN works as follows:

  • The RPN generates region proposals. These regions are generated using a trainable network, that could be customized for each detection scenario. And uses the same convolutional layers used in the Fast R-CNN network.

  • For all region proposals in the image, a fixed-length feature vector is extracted from each region using the ROI Poolinglayer.

  • The extracted feature vectors are then classified using the Fast R-CNN.

  • The class scores of the detected objects in addition to their bounding-boxes are returned.

How are anchor boxes created?

The RPN works on the output feature map returned from the last convolutional layer shared with the Fast R-CNN. A sliding window of size NxN passes through the feature map. For each window, several candidate region proposals are generated. These proposals are not the final proposals as they will be filtered based on their “objectness score”.

Normally, they are defined by two parameters. Combining these parameters we can obtain K number of anchor boxes:

  • Scale

  • Aspect Ratio

For each window, K proposals are generated and a feature vector of equal size is extracted. Then, it's fed into two FC layers:

  • The cls layer generates the objectness score for every region proposal as a binary classifier (background vs object).

  • The reg layer returns a 4-D vector with the bbox of the region.

Given the Objectness Score, Faster R-CNN computes these two classes based on it: