Researchers who collect quantitative (numerical) data perform various forms of statistical analysis to draw conclusions from that data. By looking for relationships between the data sets they collect, researchers can test hypotheses about whether different factors affect one another and how strong those effects are. One such method, testing for statistical correlations, has limited academic value.
Correlation is a simple form of statistical analysis that looks for a numerical relationship between two equally sized data sets. By pairing numbers from the two data sets, a correlation measures how movement in the values of one data set relates to movement in the values of the other. For example, a researcher could test for a correlation between the hours students spend studying and their test scores to see whether the two are related. The correlation test reports this relationship as a coefficient ranging from -1 to +1. A magnitude of zero means no relationship between the two data sets and a magnitude of one means a perfect relationship, while a positive sign means an increase in one data set accompanies an increase in the other, and a negative sign means an increase in one accompanies a decrease in the other.
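As a sketch of how this coefficient behaves, here is a minimal Python implementation of Pearson's r, the most common correlation measure (the study-hours data below are hypothetical, invented for illustration):

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equally sized data sets."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: how the two data sets move together around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: how much each data set varies on its own.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

hours = [1, 2, 3, 4, 5]        # hypothetical hours spent studying
scores = [55, 60, 70, 72, 85]  # hypothetical test scores
r = pearson_r(hours, scores)
print(round(r, 3))  # close to +1: a strong positive relationship
```

A coefficient this close to +1 would indicate that, in this sample, more hours of studying accompany higher test scores.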
The main advantage of looking for simple correlations between two data sets is that the equation for the correlation coefficient is simple enough for students to work through by hand, rather than relying on computers or calculators for the analysis. This introduces students to the math behind statistical analysis, which builds a foundation for understanding the math behind more sophisticated methods. The simplicity of a correlation also introduces students to the core ideas of statistical analysis: the direction and magnitude of relationships.
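Assuming the correlation in question is Pearson's product-moment coefficient (the measure most often computed by hand in introductory courses), the equation students work through for paired observations $(x_i, y_i)$ is:

$$ r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}} $$

Each step involves only means, subtraction, multiplication, and one square root, which is what makes the computation tractable without software.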
Lack of Directionality
The primary disadvantage of correlations is that while they report relationships between data sets, they give no clue as to causality. The math behind the correlation equation cannot tell researchers which data set, if either, is responsible for the relationship it reports. In the example of correlating hours spent studying with test scores, it is intuitive to attribute a positive relationship to the hours spent studying. As far as the math is concerned, however, there is no way to rule out the inverse: that earning higher test scores leads students to study more.
Bivariate Limitation
Correlations are bivariate in nature: they compare values from only two data sets at a time, so researchers can examine the relationship between just two factors. This is rarely realistic, because an outcome almost always has multiple influences acting on it. If a researcher wants to examine several interconnected relationships and effects at once, the correlation equation is mathematically incapable of accommodating such a research design. Regression analysis, by contrast, lets a researcher model relationships among more than two data sets and account for several factors simultaneously, although even regression cannot establish causality on its own.
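As a minimal sketch of what a correlation cannot do, the following fits test scores against two predictors at once (hours studied and hours slept) with an ordinary least-squares regression via NumPy. All data and the second predictor are hypothetical, and the fitted coefficients describe association, not proven causation:

```python
import numpy as np

hours_studied = np.array([1, 2, 3, 4, 5], dtype=float)   # hypothetical
hours_slept = np.array([8, 6, 7, 5, 7], dtype=float)     # hypothetical
scores = np.array([55, 60, 70, 72, 85], dtype=float)     # hypothetical

# Design matrix: an intercept column plus both predictors.
X = np.column_stack([np.ones_like(scores), hours_studied, hours_slept])

# Ordinary least squares: one model, multiple data sets at once.
coef, *_ = np.linalg.lstsq(X, scores, rcond=None)
intercept, b_study, b_sleep = coef
print(b_study)  # estimated change in score per study hour, sleep held fixed
```

A bivariate correlation would have to examine each predictor separately; the regression estimates both effects within a single model.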