Z-score

Also known as the standard score of an observation, this method assumes that the data follow a Gaussian distribution.

It's a parametric method which indicates how many standard deviations an instance is from the sample’s mean.

The z-score of every data point is calculated using the formula $z = (x - \mu)/\sigma$. It can be easily calculated using the method provided by sklearn.
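As a minimal sketch of the formula, the snippet below computes z-scores with Python's standard library only. The function name `z_scores` and the sample values are made up for illustration, and the population standard deviation (dividing by n) is an assumption. scikit-learn's `StandardScaler` uses the same convention, but the text does not say which sklearn method is meant.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return the z-score of every value: (x - mean) / standard deviation."""
    mu = mean(values)
    # Assumption: population standard deviation (divides by n, not n - 1).
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(x - mu) / sigma for x in values]


# Made-up sample with mean 5 and standard deviation 2.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scores = z_scores(data)
print(scores)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

Using the sample standard deviation (`statistics.stdev`) instead would give slightly smaller absolute z-scores, which matters mostly for small samples.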

Once every z-score is computed, a data point is flagged as an outlier when the absolute value of its z-score exceeds a chosen threshold. It's usually set to 2.5, 3.0 or 3.5.
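The thresholding step could look roughly like the sketch below. The helper name `z_score_outliers`, the default threshold of 3.0, and the sample data are illustrative assumptions, not part of any library.

```python
from statistics import mean, pstdev


def z_score_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    mu = mean(values)
    # Assumption: population standard deviation (divides by n).
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return []
    return [x for x in values if abs(x - mu) / sigma > threshold]


# Made-up sample: nineteen identical readings and one extreme value.
data = [10.0] * 19 + [50.0]
outliers = z_score_outliers(data, threshold=3.0)
print(outliers)  # [50.0]
```

With the population standard deviation, the largest possible absolute z-score in a sample of n points is sqrt(n - 1), so a threshold of 3.0 can never flag anything in samples of 10 points or fewer. This is one reason the threshold is worth choosing with the sample size in mind.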
