Bivariate linear regression

Yüklə 1,25 Mb.

1 2

e224n1208

Regression line

For a bivariate or simple regression with an independent variable x and a dependent variable y, the regression equation is y = β0 + β1 x + ε.
The values of the error term, ε, average to 0 so E(ε) = 0 and E(y) = β0 + β1 x.
Using observed or sample data for values of x and y, estimates of the parameters β0 and β1 are obtained and the estimated regression line is
where is the value of y that is predicted from the estimated regression line.

According to human capital theory, increased education is associated with greater earnings.
Random sample of 22 Saskatchewan males aged 35-39 with positive wages and salaries in 2004, from the Survey of Labour and Income Dynamics, 2005.
Let x be total number of years of school completed (YRSCHL18) and y be wages and salaries in dollars (WGSAL42).

Source: Statistics Canada, Survey of Labour and Income Dynamics, 2005 [Canada]: External Cross-sectional Economic Person File [machine readable data file]. From IDLS through UR Data Library.

H0: β1 = 0. Schooling has no effect on earnings.
H1: β1 > 0. Schooling has a positive effect on earnings.
From the least squares estimates, using the data for the 22 cases, the regression equation and associate statistics are:
y = -13,493 + 4,181 x.
R2 = 0.253, r = 0. 503.
Standard error of the slope b0 is 1,606.
t = 2.603 (20 df), significance = 0.017.
At α = 0.05, reject H0, accept H1 and conclude that schooling has a positive effect on earnings.
Each extra year of schooling adds $4,181 to annual wages and salaries for those in this sample.
Expected wages and salaries for those with 20 years of schooling is -13,493 + (4,181 x 20) = $70,127.

y = β0 + β1 x. x is the independent variable (on horizontal) and y is the dependent variable (on vertical).
β0 and β1 are the two parameters that determine the equation of the line.
β0 is the y intercept – determines the height of the line.
β1 is the slope of the line.

Find estimates of β0 and β1 that produce a line that fits the points the best.
The most commonly used criterion is least squares.
The least squares line is the unique line for which the sum of the squares of the deviations of the y values from the line is as small as possible.
Minimize the sum of the squares of the errors ε.
Or, equivalent to this, minimize the sum of the squares of the differences of the y values from the values of E(y). That is, find b0 and b1 that minimize:

Yüklə 1,25 Mb.

Dostları ilə paylaş:

1 2