In statistics, residual variance is another name for unexplained variation: the sum of the squared differences between the observed y-value of each ordered pair in the dataset and the corresponding y-value predicted by the regression line. It is generally used to calculate the standard error of estimate. In other words, residual variance measures how well the regression line that we constructed fits the actual dataset. The smaller the residual variance, the more closely the line's predictions match the observed values.

- Skill level: Easy

## Instructions

- 1
Review your given dataset and create a two-column table depicting corresponding x and y values. You may use a pen and paper, a table in a Word document, or an Excel spreadsheet. Start with the lowest given x-value and continue in ascending order.
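If you prefer to build the table in code rather than on paper or in a spreadsheet, a minimal Python sketch could look like the following. The five `pairs` are a hypothetical dataset invented for illustration, and `build_table` is an assumed helper name, not part of any library.

```python
# Hypothetical dataset of (x, y) observations, listed in arbitrary order.
pairs = [(3, 5), (1, 2), (5, 5), (2, 4), (4, 4)]


def build_table(pairs):
    """Return the (x, y) pairs sorted by ascending x-value."""
    return sorted(pairs, key=lambda pair: pair[0])


table = build_table(pairs)

# Print a simple two-column table: x on the left, y on the right.
print(f"{'x':>4} {'y':>4}")
for x, y in table:
    print(f"{x:>4} {y:>4}")
```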

- 2
Create the equation of the regression line based on your dataset. In its generic form, the equation is $\hat{y} = mx + b$, where $\hat{y}$ is the predicted y-value for a given x-value, m is the slope and b is the y-intercept. Find the slope m and the y-intercept b with the formulas below, then substitute the results into the equation:

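$$m = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2} \quad\text{and}\quad b = \frac{\sum y}{n} - m\,\frac{\sum x}{n},$$

where n is the number of ordered pairs, $\sum y / n$ is the mean value of the y-values in the dataset and $\sum x / n$ is the mean value of the x-values. Review the resulting equation of the regression line.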
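The slope and intercept formulas in this step can be sketched in plain Python as follows. The function name `fit_line` and the sample values in `xs` and `ys` are assumptions made for illustration; they do not come from the article.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y_hat = m*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Denominator of the slope formula: n*sum(x^2) - (sum(x))^2.
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x-values are identical; the slope is undefined")

    m = (n * sum_xy - sum_x * sum_y) / denom
    b = sum_y / n - m * sum_x / n
    return m, b


# Hypothetical dataset, already sorted by ascending x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, b = fit_line(xs, ys)
print(f"y_hat = {m:.2f}x + {b:.2f}")
```

For this sample data the sums are n = 5, Σx = 15, Σy = 20, Σxy = 66 and Σx² = 55, so the slope works out to 30/50 = 0.6 and the intercept to 4 − 1.8 = 2.2.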

- 3
Calculate the unexplained deviation for each ordered pair $(x_i, y_i)$. The regression line gives the best linear prediction of y for a given x, but in most datasets the points scatter around the line. The deviation of a particular data point $(x_i, y_i)$ from its predicted value on the regression line, $(x_i, \hat{y}_i)$, is called the residual. Use the following formula to calculate the unexplained deviation: $y_i - \hat{y}_i$
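As an illustration of this step, the sketch below computes the residual for every point. The data, the slope `m = 0.6`, the intercept `b = 2.2` and the helper name `residuals` are all assumed values chosen for the example (the slope and intercept are what the least-squares formulas give for this sample data).

```python
def residuals(xs, ys, m, b):
    """Return the residual y_i - y_hat_i for every data point."""
    return [y - (m * x + b) for x, y in zip(xs, ys)]


# Hypothetical dataset and its assumed regression line y_hat = 0.6x + 2.2.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, b = 0.6, 2.2

res = residuals(xs, ys, m, b)
for x, y, r in zip(xs, ys, res):
    print(f"x={x}  y={y}  predicted={m * x + b:.1f}  residual={r:+.1f}")
```

A positive residual means the observed point lies above the regression line, and a negative residual means it lies below. For a least-squares line the residuals sum to approximately zero, which is a quick sanity check on the fit.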

- 4
Calculate the residual variance. Residual variance is the sum of the squared differences between the observed y-value of each ordered pair $(x_i, y_i)$ and the corresponding predicted y-value $\hat{y}_i$ on the regression line. Use the following formula to calculate it: Residual variance = $\sum (y_i - \hat{y}_i)^2$
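A short Python sketch of this final step is shown below. The dataset, the line `y_hat = 0.6x + 2.2` and both function names are assumptions made for illustration. The second function goes one step beyond the formula above: it assumes the usual simple-regression convention of dividing the sum of squares by n − 2 before taking the square root to obtain the standard error of estimate.

```python
import math


def residual_sum_of_squares(xs, ys, m, b):
    """Sum of squared residuals, (y_i - y_hat_i)^2, over all data points."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


def standard_error_of_estimate(rss, n):
    """Square root of rss / (n - 2), assuming a simple linear regression."""
    if n <= 2:
        raise ValueError("need more than two data points")
    return math.sqrt(rss / (n - 2))


# Hypothetical dataset and its assumed regression line y_hat = 0.6x + 2.2.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, b = 0.6, 2.2

rss = residual_sum_of_squares(xs, ys, m, b)
se = standard_error_of_estimate(rss, len(xs))
print(f"residual variance (sum of squares): {rss:.2f}")
print(f"standard error of estimate: {se:.3f}")
```

For this sample data the squared residuals are 0.64, 0.36, 1.00, 0.36 and 0.04, which add up to about 2.4.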