## Q. How do I measure the relationship between two variables?

Covariance and correlation are used to measure the relationship between two random variables. Both are measures of linear dependence.

Suppose X and Y are two random variables with n paired observations $(x_i, y_i)$. The sample covariance is calculated as follows:

$Cov(X,Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n-1}$
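As a minimal sketch, the covariance formula can be written directly in Python. The function name `sample_covariance` and the height and weight values are hypothetical and chosen only for illustration.

```python
def sample_covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 2:
        raise ValueError("at least two observations are required")
    x_bar = sum(xs) / n  # sample mean of x
    y_bar = sum(ys) / n  # sample mean of y
    # Sum the products of paired deviations from the means, then divide by n - 1.
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


# Made-up illustrative data: heights in centimetres, weights in kilograms.
heights_cm = [150.0, 160.0, 170.0, 180.0]
weights_kg = [50.0, 58.0, 66.0, 80.0]

cov_hw = sample_covariance(heights_cm, weights_kg)
print(cov_hw)  # positive: taller people in this toy sample tend to weigh more
```

The result carries the units of both inputs (here cm times kg), so its size changes if either variable is rescaled.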

Covariance is hard to interpret by itself because its magnitude depends on the units of X and Y. Correlation resolves this problem by dividing the covariance by the standard deviations of X and Y, so that the result is unitless. Correlation is calculated as follows:

$Corr(X,Y) = \frac{Cov(X,Y)}{\sqrt{Var(X)Var(Y)}} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2 \sum_{i=1}^{n}(y_i-\bar{y})^2}}$
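The correlation formula can be sketched the same way. The function name `pearson_correlation` and the data are hypothetical; the example only aims to show that rescaling one variable (centimetres to metres) leaves the correlation unchanged, up to floating-point rounding.

```python
import math


def pearson_correlation(xs, ys):
    """Sample correlation, using the right-hand form of the formula above."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    # The (n - 1) factors in the covariance and the variances cancel out.
    return s_xy / math.sqrt(s_xx * s_yy)


# Made-up illustrative data: the same heights expressed in two different units.
heights_cm = [150.0, 160.0, 170.0, 180.0]
heights_m = [h / 100.0 for h in heights_cm]
weights_kg = [50.0, 58.0, 66.0, 80.0]

r_cm = pearson_correlation(heights_cm, weights_kg)
r_m = pearson_correlation(heights_m, weights_kg)
print(r_cm, r_m)  # the two values agree up to rounding
```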

Correlation always lies between -1 and 1. Its magnitude represents the strength of the linear relationship, and its sign represents the direction. A value of -1 means there is perfect negative correlation, and a value of 1 means there is perfect positive correlation. If the correlation is 0, then X and Y are uncorrelated, meaning there is no linear relationship between them (a nonlinear relationship may still exist). The leftmost plot has $$Corr(Y_1, Y_2) \approx 0$$, the middle plot has $$Corr(Y_1, Y_2) = 1$$, and the rightmost plot has $$Corr(Y_1, Y_2) = -1$$.

• Last Updated Apr 23, 2021