Find the number of pairs of observations from the following data: r = 0.50, , , where x = X − X̄ and y = Y − Ȳ.
Answers
Regression
A. Introduction to Simple Linear Regression
B. Partitioning Sums of Squares
C. Standard Error of the Estimate
D. Inferential Statistics for b and r
E. Influential Observations
F. Regression Toward the Mean
G. Introduction to Multiple Regression
H. Exercises
This chapter is about prediction. Statisticians are often called upon to develop
methods to predict one variable from other variables. For example, one might want
to predict college grade point average from high school grade point average. Or,
one might want to predict income from the number of years of education.
Introduction to Linear Regression
by David M. Lane
Prerequisites
• Chapter 3: Measures of Variability
• Chapter 4: Describing Bivariate Data
Learning Objectives
1. Define linear regression
2. Identify errors of prediction in a scatter plot with a regression line
In simple linear regression, we predict scores on one variable from the scores on a
second variable. The variable we are predicting is called the criterion variable and
is referred to as Y. The variable we are basing our predictions on is called the
predictor variable and is referred to as X. When there is only one predictor
variable, the prediction method is called simple regression. In simple linear
regression, the topic of this section, the predictions of Y when plotted as a function
of X form a straight line.
The example data in Table 1 are plotted in Figure 1. You can see that there is
a positive relationship between X and Y. If you were going to predict Y from X, the
higher the value of X, the higher your prediction of Y.
Table 1. Example data.
X Y
1.00 1.00
2.00 2.00
3.00 1.30
4.00 3.75
5.00 2.25
Figure 1. A scatter plot of the example data, with X on the horizontal axis and Y on the vertical axis.
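The positive relationship visible in the scatter plot can also be checked numerically. The sketch below is an illustration rather than part of the original text: it enters the Table 1 data and computes Pearson's correlation from deviation scores. The names xs, ys, and pearson_r are chosen here for the example.

```python
# Example data from Table 1.
xs = [1.00, 2.00, 3.00, 4.00, 5.00]
ys = [1.00, 2.00, 1.30, 3.75, 2.25]


def pearson_r(x, y):
    """Pearson correlation computed from deviations about the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviation scores.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


r = pearson_r(xs, ys)
# A positive r means that higher values of X tend to go with higher values of Y.
print(f"r = {r:.3f}")
```

A value of r above zero agrees with what the scatter plot shows: predictions of Y should increase as X increases.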
Linear regression consists of finding the best-fitting straight line through the points.
The best-fitting line is called a regression line. The black diagonal line in Figure 2
is the regression line and consists of the predicted score on Y for each possible
value of X. The vertical lines from the points to the regression line represent the
errors of prediction. As you can see, the red point is very near the regression line;
its error of prediction is small. By contrast, the yellow point is much higher than
the regression line and therefore its error of prediction is large.
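To make the errors of prediction concrete, the following sketch fits a line to the Table 1 data and lists each point's error. It assumes that "best-fitting" means the least-squares line, which is the usual criterion but is not stated in the passage above. The function name fit_line and the variable names are illustrative only.

```python
# Example data from Table 1.
xs = [1.00, 2.00, 3.00, 4.00, 5.00]
ys = [1.00, 2.00, 1.30, 3.75, 2.25]


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line (assumed criterion)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    # The least-squares line passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)

# Predicted Y for each X, and the error of prediction (actual minus predicted).
predicted = [intercept + slope * x for x in xs]
errors = [y - p for y, p in zip(ys, predicted)]

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
for x, y, p, e in zip(xs, ys, predicted, errors):
    print(f"X = {x:.2f}  Y = {y:.2f}  predicted = {p:.3f}  error = {e:+.3f}")
```

Each error is the signed vertical distance from a point to the line: positive when the point lies above the line, negative when it lies below. For these data the point at X = 4 lies farthest above the fitted line, and under the least-squares criterion the errors sum to zero.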