Gaussian Mixture Model
What is a Gaussian Mixture?
It is a function composed of several Gaussians, one per cluster, so the number of Gaussians equals the number of clusters (k). Each distribution is parametrized by:
The mean μ, which defines its centre.
The covariance Σ, which defines its width.
The mixing probability π, which defines the relative weight of each Gaussian in the mixture.
The Gaussian density function is given by:

$$\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{D/2}\,|\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}(x-\mu)^{\top}\Sigma^{-1}(x-\mu)\right)$$

where:
$D$ is the number of dimensions or features of each instance.
$x$ represents a data point (a $D$-dimensional vector).
The mean $\mu$ is a $D$-dimensional vector.
And the covariance $\Sigma$ is a $D \times D$ matrix.
The sum of every mixing probability must be equal to one: $\sum_{k=1}^{K} \pi_k = 1$.
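As a sketch of the density defined above, here is a minimal NumPy version (the function name `gaussian_density` is my own, not from the text; it is an illustration, not an optimized implementation):

```python
import numpy as np

def gaussian_density(x, mu, sigma):
    """Multivariate Gaussian density N(x | mu, sigma) for a D-dimensional point x."""
    d = len(mu)                                  # D: number of dimensions
    diff = x - mu
    norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma))
    return norm * np.exp(-0.5 * diff @ np.linalg.inv(sigma) @ diff)

# Sanity check: a standard 2-D Gaussian evaluated at its mean equals 1/(2*pi)
print(gaussian_density(np.zeros(2), np.zeros(2), np.eye(2)))  # ≈ 0.15915
```

In practice one would use `scipy.stats.multivariate_normal.pdf` instead, which also handles numerical edge cases.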
How do we fit the model?
By applying the Expectation-Maximization (EM) algorithm, which is widely used for optimization problems where the objective function is too complex to maximize directly. Here the log-likelihood of the mixture contains a sum inside a logarithm, so there is no closed-form solution for the parameters; EM instead alternates between estimating cluster responsibilities (E-step) and re-estimating the parameters from them (M-step).
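The EM loop can be sketched in NumPy as follows. This is a minimal illustration under simplifying assumptions I chose myself (means initialized from the first k points, a fixed iteration count, a small ridge added to the covariances for numerical stability), not a production implementation:

```python
import numpy as np

def em_gmm(X, k, n_iter=50):
    """Fit a k-component GMM to data X (shape n x d) with a bare-bones EM loop."""
    n, d = X.shape
    mu = X[:k].astype(float).copy()                  # naive init: first k points as means
    sigma = np.array([np.eye(d) for _ in range(k)])  # identity covariances
    pi = np.full(k, 1.0 / k)                         # uniform mixing probabilities
    for _ in range(n_iter):
        # E-step: responsibilities r[i, j] = P(cluster j | x_i)
        r = np.empty((n, k))
        for j in range(k):
            diff = X - mu[j]
            inv = np.linalg.inv(sigma[j])
            norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma[j]))
            r[:, j] = pi[j] * norm * np.exp(-0.5 * np.sum(diff @ inv * diff, axis=1))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate pi, mu, sigma from the responsibilities
        nk = r.sum(axis=0)
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - mu[j]
            sigma[j] = (r[:, j, None] * diff).T @ diff / nk[j] + 1e-6 * np.eye(d)
    return pi, mu, sigma
```

A library implementation such as scikit-learn's `GaussianMixture` adds better initialization (k-means), convergence checks, and log-space computations, but the alternation above is the core of the algorithm.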
Differences regarding k-Means
It accounts for covariance, which determines the shape of each distribution. While k-means in effect places a circle (or a hyper-sphere) at the centre of each cluster, a GMM can accommodate clusters with different elliptical shapes.
k-Means performs a hard classification, whereas a GMM carries out a soft one by returning, for each data point, the probability that it belongs to each cluster.
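To illustrate the soft assignment, here is a small sketch with a hypothetical 1-D two-component mixture (the parameters are made up for illustration; in a real workflow they would come from the fitting step above):

```python
import math

def normal_pdf(x, mu, var):
    """1-D Gaussian density."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def responsibilities(x, params):
    """Soft assignment: P(cluster j | x) for each component (pi, mu, var)."""
    weighted = [pi * normal_pdf(x, mu, var) for pi, mu, var in params]
    total = sum(weighted)
    return [w / total for w in weighted]

# Two equally weighted components: N(0, 1) and N(4, 1)
params = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]
print(responsibilities(2.0, params))  # [0.5, 0.5]: equidistant, genuinely ambiguous
print(responsibilities(0.0, params))  # heavily favours the first component
```

A hard classifier like k-means would be forced to pick one label even for the point at 2.0, where the mixture says the evidence is evenly split.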