Gaussian Mixture Model
A Gaussian Mixture Model is a function composed of several Gaussians. The number of Gaussians equals the number of clusters, $k$. Each distribution is parametrized by:

* The mean $\mu$, which defines its centre.
* The covariance $\Sigma$, which defines its width.
* The mixing probability $\pi$, which defines the weight of that Gaussian in the mixture.
The Gaussian density function is given by:

$$\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{D/2}\,|\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right)$$

where:

* $D$ is the number of dimensions or features of each instance.
* $x$ is a data point (a $D$-dimensional vector).
* The mean $\mu$ is a $D$-dimensional vector.
* The covariance $\Sigma$ is a $D \times D$ matrix.

The mixing probabilities must sum to one:

$$\sum_{k=1}^{K} \pi_k = 1$$
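The density above can be evaluated directly with NumPy. This is a minimal sketch (the function name `gaussian_density` is my own; in practice `scipy.stats.multivariate_normal.pdf` does the same job):

```python
import numpy as np

def gaussian_density(x, mu, sigma):
    """Multivariate Gaussian density N(x | mu, sigma)."""
    d = len(mu)                      # D, the number of dimensions
    diff = x - mu
    # Normalisation constant: 1 / ((2*pi)^(D/2) * |Sigma|^(1/2))
    norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma))
    # Quadratic form in the exponent: -0.5 * (x-mu)^T Sigma^-1 (x-mu)
    exponent = -0.5 * diff @ np.linalg.inv(sigma) @ diff
    return norm * np.exp(exponent)

x = np.array([0.5, 0.0])
mu = np.array([0.0, 0.0])
sigma = np.eye(2)                    # identity covariance
print(gaussian_density(x, mu, sigma))
```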
The parameters are estimated by applying the Expectation-Maximization (EM) algorithm, which is widely used for optimization problems whose objective function, like the GMM log-likelihood, is this complex.
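The EM loop can be sketched as follows. This is an illustrative implementation, not a reference one: the initialisation (random data points as means, identity covariances, uniform weights) and the small regularisation term are my own assumptions, and production code would work in log-space for numerical stability.

```python
import numpy as np

def em_gmm(X, k, n_iter=50, seed=0):
    """Minimal EM for a Gaussian Mixture Model (sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Assumed initialisation: random data points as means,
    # identity covariances, uniform mixing probabilities.
    mu = X[rng.choice(n, k, replace=False)]
    sigma = np.array([np.eye(d) for _ in range(k)])
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities r[i, j] = P(cluster j | x_i)
        r = np.zeros((n, k))
        for j in range(k):
            diff = X - mu[j]
            inv = np.linalg.inv(sigma[j])
            norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma[j]))
            r[:, j] = pi[j] * norm * np.exp(-0.5 * np.sum(diff @ inv * diff, axis=1))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the responsibilities
        nk = r.sum(axis=0)
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - mu[j]
            # Small diagonal term keeps the covariance invertible
            sigma[j] = (r[:, j, None] * diff).T @ diff / nk[j] + 1e-6 * np.eye(d)
    return pi, mu, sigma
```

In practice `sklearn.mixture.GaussianMixture` implements this with better initialisation and log-space arithmetic.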
A GMM accounts for covariance, which determines the shape of each distribution. This means that while k-means places a circle (or, in higher dimensions, a hyper-sphere) at the centre of each cluster, a GMM can handle clusters with different elliptical shapes.
k-Means performs a hard assignment, whereas a GMM carries out a soft one by returning the probability that each data point belongs to each cluster.
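The hard/soft contrast can be made concrete with two known 1-D components (the parameter values below are illustrative, not from the text); the soft assignment is just Bayes' rule over the weighted component densities:

```python
import numpy as np

# Two 1-D Gaussian components with assumed, illustrative parameters
mu = np.array([0.0, 4.0])    # component means
var = np.array([1.0, 1.0])   # component variances
pi = np.array([0.5, 0.5])    # mixing probabilities

def soft_assign(x):
    """GMM-style: P(cluster j | x) for each component j, via Bayes' rule."""
    dens = pi / np.sqrt(2 * np.pi * var) * np.exp(-0.5 * (x - mu) ** 2 / var)
    return dens / dens.sum()

def hard_assign(x):
    """k-means-style: index of the single most likely component."""
    return int(np.argmax(soft_assign(x)))

# A point halfway between the centres is genuinely ambiguous:
print(soft_assign(2.0))   # [0.5, 0.5] — the GMM reports the ambiguity
print(hard_assign(2.0))   # the hard assignment is forced to pick one cluster
```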