Correlation

Correlation describes the relationship between variables. We can use correlation to understand whether changes in the values of variable A tend to go together with changes in the values of variable B.

In Graphext, use the Correlations panel of your projects to examine the correlations between variables in your data. This article explains how correlation works, describes the different types of correlation, and shows how to measure degrees of correlation in Graphext.

"Just because two variables have a statistical relationship with each other does not mean that one is responsible for the other. For instance, ice cream sales and forest fires are correlated because both occur more often in the summer heat. But there is no causation; you don't light a patch of the Montana brush on fire when you buy a pint of Häagen-Dazs."

- Nate Silver, The Signal and the Noise

What is Correlation?

Correlation is a statistical concept referring to the relationship - or association - between two variables. Using correlation, we can measure the degree to which two variables move in relation to one another.

For instance, a correlation between variable A and variable B would suggest that changes in the values of variable A tend to be accompanied by changes in the values of variable B. On its own, this does not tell us that A causes B.

Positive & Negative Correlation

Whether a correlation is positive or negative depends on the direction in which the values of one variable tend to move as the values of the other variable change.

Positive Correlation

Positive correlation refers to a relationship between two variables in which both variables move in the same direction. For instance, if A is positively correlated with B then an increase in the values of A is likely to be associated with an increase in the values of B.

In Graphext Correlation charts, a strong positive correlation would be signified by a trend of big & bright circles moving diagonally upwards from left to right 📈

Negative Correlation

Negative correlation refers to a relationship between two variables in which an increase in one variable is associated with a decrease in the other. For instance, if A is negatively correlated with B then an increase in the values of A is likely to be associated with a decrease in the values of B, and vice versa.

A strong negative correlation would be signified by a trend of big & bright circles moving diagonally downwards from left to right 📉
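
To make the direction of a correlation concrete, the following is a minimal Python sketch of the Pearson correlation coefficient, a common measure of linear correlation whose sign is positive when two variables move together and negative when they move in opposite directions. The `pearson` helper and the sample data are hypothetical illustrations and are not part of Graphext.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two variables move together around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: daily temperature, ice cream sales and hot drink sales.
temperature = [14, 18, 21, 25, 29, 33]
ice_cream_sales = [110, 150, 170, 240, 290, 350]
hot_drink_sales = [300, 260, 250, 190, 150, 90]

r_positive = pearson(temperature, ice_cream_sales)  # above 0: values rise together
r_negative = pearson(temperature, hot_drink_sales)  # below 0: one rises as the other falls
print(f"temperature vs ice cream sales: {r_positive:+.2f}")
print(f"temperature vs hot drink sales: {r_negative:+.2f}")
```

With this made-up data, the first coefficient is close to +1 and the second is close to -1.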

Degrees of Correlation

The strength of a correlated relationship between two variables ranges between perfect positive correlation and perfect negative correlation.

A perfect positive correlation means that 100% of the time, values belonging to the correlated variables will move together in the same direction and in fixed proportion. A perfect negative correlation means that 100% of the time, values belonging to the two correlated variables will move in opposite directions in fixed proportion.

Inside Graphext

You can determine the strength of correlation in Graphext Correlation charts by examining the statistic in the top right of each card representing the relevance of the variable. Chart cards with higher relevance statistics and more white bars represent a stronger correlation.

Correlation charts are ordered by relevance. Relevance scores reflect the mutual information shared by the two variables rather than their linear correlation. Unlike linear correlation, mutual information can detect arbitrary associations between variables, including non-linear relationships that a linear measure would miss.
