Covariance vs Correlation Matrix
Overview
Covariance indicates the direction of the linear relationship between variables.
Correlation measures both the strength and the direction of a linear relationship.
Correlation values are standardized, whereas covariance values are not.
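A minimal sketch of what "standardized" means in practice: changing the unit of one variable rescales the covariance but leaves the correlation untouched. The height and weight data, the variable names, and the seed below are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility

# Hypothetical data: heights in metres and linearly related weights in kg
height_m = rng.normal(1.7, 0.1, 1000)
weight_kg = 40 * height_m + rng.normal(0, 5, 1000)

# The same heights expressed in centimetres
height_cm = 100 * height_m

# Off-diagonal entry of the 2x2 matrix is the covariance / correlation of the pair
cov_m = np.cov(height_m, weight_kg)[0, 1]
cov_cm = np.cov(height_cm, weight_kg)[0, 1]
corr_m = np.corrcoef(height_m, weight_kg)[0, 1]
corr_cm = np.corrcoef(height_cm, weight_kg)[0, 1]

print(cov_cm / cov_m)    # covariance grows by the unit factor of 100
print(corr_m, corr_cm)   # correlation is the same in both units
```

The covariance depends on the units of the inputs, so its magnitude alone says little about how strong the relationship is; the correlation removes that dependence.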
Covariance Matrix
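Focusing on the two-dimensional case, the covariance matrix for two variables $x$ and $y$ is given by:

$$
C = \begin{pmatrix} \sigma(x, x) & \sigma(x, y) \\ \sigma(y, x) & \sigma(y, y) \end{pmatrix}
$$

Here $\sigma(x, y)$ denotes the covariance of $x$ and $y$, so the diagonal entries $\sigma(x, x)$ and $\sigma(y, y)$ are the variances, and the matrix is symmetric because $\sigma(x, y) = \sigma(y, x)$.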
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('ggplot')
plt.rcParams['figure.figsize'] = (12, 8)
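mean = 0  # mean of each variable
std = 1  # standard deviation of each variable
num_samples = 500
# x and y are drawn independently, so their true covariance is 0
x = np.random.normal(mean, std, num_samples)
y = np.random.normal(mean, std, num_samples)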
X = np.vstack((x, y)).T # Join both arrays and transpose
# X = np.stack(arrays=[x, y], axis=1) # Equivalent transformation
plt.scatter(X[:, 0], X[:, 1])
plt.title('Generated Data')
plt.axis('equal');
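The snippet above only generates and plots the data. As a rough sketch, the covariance matrix of such data could be estimated as follows, both by hand and with `np.cov`. The seeded generator `rng` and the names `X_centered`, `C_manual` and `C_numpy` are assumptions introduced here, not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed here for reproducibility
num_samples = 500
x = rng.normal(0, 1, num_samples)
y = rng.normal(0, 1, num_samples)
X = np.vstack((x, y)).T  # shape (500, 2): one row per sample, one column per variable

# Manual estimate: centre each column, then X_c^T X_c / (n - 1)
X_centered = X - X.mean(axis=0)
C_manual = X_centered.T @ X_centered / (num_samples - 1)

# rowvar=False tells NumPy that columns (not rows) are the variables
C_numpy = np.cov(X, rowvar=False)

print(C_numpy)
```

Because `x` and `y` are sampled independently with unit variance, the diagonal entries should come out close to 1 and the off-diagonal entries close to 0, though the exact values depend on the sample.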

Correlation Matrix
Unlike covariance, correlation is bounded: its values always lie in the range [-1, 1].
The correlation coefficient of two variables is obtained by dividing their covariance by the product of their standard deviations.
import pandas as pd
data = np.random.RandomState(seed=0)
correlation = pd.DataFrame(data.rand(10, 10)).corr()
correlation.style.background_gradient(cmap='coolwarm')
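To connect the two matrices, here is a small sketch that derives the correlation matrix from the covariance matrix by dividing each entry by the product of the corresponding standard deviations, then compares the result with `np.corrcoef`. The data (a `y` that depends linearly on `x`) and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed for reproducibility
x = rng.normal(0, 1, 500)
y = 0.5 * x + rng.normal(0, 1, 500)  # hypothetical variable partly driven by x
X = np.column_stack((x, y))          # shape (500, 2), columns are variables

cov = np.cov(X, rowvar=False)        # 2x2 covariance matrix
std = np.sqrt(np.diag(cov))          # standard deviations from the diagonal variances

# Entry (i, j) becomes cov[i, j] / (std[i] * std[j])
corr = cov / np.outer(std, std)

print(corr)
```

The diagonal of `corr` is 1 by construction, since each variable's covariance with itself is divided by its own variance.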
