What is the difference between correlation and causation?
Answer
Correlation means that two variables are statistically associated: they tend to move together. Causation means that a change in one variable actually produces a change in the other. Causation is therefore the stronger claim, and correlation alone does not imply causation.
Suppose you are a statistician studying the relationship between monthly ice cream sales and pool drownings. You notice that both variables increase during the summer months and are therefore correlated. Does this mean that eating ice cream causes people to drown in pools? Of course not! The increase in both variables is more reasonably explained by warmer summer temperatures, a third variable (a confounder) that drives both.
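To make the ice cream example concrete, here is a minimal simulation sketch. All numbers are made up for illustration (the coefficients, noise levels, and sample size are assumptions, not real data): temperature drives both ice cream sales and drownings, and neither variable affects the other. The helper functions `pearson` and `residuals` are hypothetical names written for this sketch.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def residuals(ys, xs):
    """What is left of ys after removing a least-squares linear fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000                # number of simulated periods (illustrative)

# Temperature is the hidden common cause.
temperature = [rng.uniform(0, 35) for _ in range(n)]

# Both outcomes depend on temperature plus independent noise.
# Ice cream never appears in the drownings formula, and vice versa.
ice_cream = [50 + 10 * t + rng.gauss(0, 40) for t in temperature]
drownings = [1 + 0.2 * t + rng.gauss(0, 1.5) for t in temperature]

# Raw correlation between the two outcomes.
raw_corr = pearson(ice_cream, drownings)

# Correlation after removing the linear effect of temperature from both
# (a partial correlation).
partial_corr = pearson(residuals(ice_cream, temperature),
                       residuals(drownings, temperature))

print(f"raw correlation:     {raw_corr:.2f}")
print(f"partial correlation: {partial_corr:.2f}")
```

Under these assumptions the raw correlation comes out strongly positive, while the correlation left after accounting for temperature is close to zero. The association is real, but it is entirely explained by the shared cause.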
This example highlights the biggest challenge in establishing causation. In observational studies (i.e., data collected in the “real world”), many hidden factors outside your control can affect the results, which makes it hard to say that one event causes another. The best way to show causation is a designed experiment in which you control every factor except the variables of interest, or balance the uncontrolled factors through random assignment. However, this is not always possible, and more often than not we have to rely on observational data.
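The contrast between observational data and a designed experiment can also be sketched in a simulation. Again, every number here is an illustrative assumption: the risk score depends only on temperature, so ice cream has no causal effect by construction. In the observational setting people choose ice cream more often on hot days; in the experimental setting it is assigned by coin flip. `group_difference` is a hypothetical helper written for this sketch.

```python
import random

def mean(values):
    return sum(values) / len(values)

def group_difference(treated, outcome):
    """Mean outcome of the treated group minus that of the untreated group."""
    yes = [y for t, y in zip(treated, outcome) if t]
    no = [y for t, y in zip(treated, outcome) if not t]
    return mean(yes) - mean(no)

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 10000               # number of simulated people (illustrative)

temperature = [rng.uniform(0, 35) for _ in range(n)]

# Risk depends on temperature only: ice cream has no effect in this model.
risk = [0.2 * t + rng.gauss(0, 1.5) for t in temperature]

# Observational data: the hotter it is, the more likely someone eats ice cream.
chose_ice_cream = [rng.random() < t / 35 for t in temperature]

# Designed experiment: ice cream is assigned by a fair coin flip,
# so assignment is unrelated to temperature.
assigned_ice_cream = [rng.random() < 0.5 for _ in range(n)]

observational_gap = group_difference(chose_ice_cream, risk)
experimental_gap = group_difference(assigned_ice_cream, risk)

print(f"observational gap: {observational_gap:.2f}")
print(f"experimental gap:  {experimental_gap:.2f}")
```

In the observational data, ice cream eaters show noticeably higher risk because they are disproportionately observed on hot days. With random assignment, the two groups experience similar temperatures on average, so the gap shrinks toward zero, which matches the true (absent) causal effect in this model.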