
Statistics for
Business and Economics
7th Edition

Chapter 12
Multiple Regression

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-1


Chapter Goals
After completing this chapter, you should be able to:


Apply multiple regression analysis to business decision-making situations



Analyze and interpret the computer output for a multiple
regression model



Perform a hypothesis test for all regression coefficients
or for a subset of coefficients



Fit and interpret nonlinear regression models





Incorporate qualitative variables into the regression
model by using dummy variables



Discuss model specification and analyze residuals

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-2


12.1

The Multiple Regression Model

Idea: Examine the linear relationship between
1 dependent (Y) & 2 or more independent variables (Xi)
Multiple Regression Model with k Independent Variables:
Y = β0 + β1X1 + β2X2 + … + βkXk + ε

(β0 is the Y-intercept, β1, …, βk are the population slopes, and ε is the random error)


Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-3


Multiple Regression Equation
The coefficients of the multiple regression model are
estimated using sample data
Multiple regression equation with k independent variables:
ŷi = b0 + b1x1i + b2x2i + … + bkxki

(ŷi is the estimated (or predicted) value of y, b0 is the estimated intercept, and b1, …, bk are the estimated slope coefficients)
In this chapter we will always use a computer to obtain the
regression slope coefficients and other regression
summary measures.
Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-4
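As the slide notes, the slope coefficients are normally obtained from software. As a rough illustration only (not part of the original slides, and using a small hypothetical data set), the least-squares estimates can be computed directly with numpy:

import numpy as np

# Hypothetical data: a response y and two regressors x1, x2
y  = np.array([10.0, 12.0, 15.0, 19.0, 24.0])
x1 = np.array([ 1.0,  2.0,  3.0,  4.0,  5.0])
x2 = np.array([ 2.0,  1.0,  4.0,  3.0,  5.0])

X = np.column_stack([np.ones_like(y), x1, x2])   # design matrix with an intercept column
b, *_ = np.linalg.lstsq(X, y, rcond=None)        # least-squares estimates b0, b1, b2
y_hat = X @ b                                    # fitted values yhat_i = b0 + b1*x1i + b2*x2i
print(b)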


Multiple Regression Equation
(continued)


Two variable model

ŷ = b0 + b1x1 + b2x2

(Figure: the estimated regression plane plotted over the x1–x2 plane, showing the slope for variable x1 and the slope for variable x2)
Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-5


Multiple Regression Model
Two variable model

ŷ = b0 + b1x1 + b2x2

(Figure: a sample observation yi and its fitted value ŷi at the point (x1i, x2i); the residual is ei = yi − ŷi)

The best fit equation, ŷ, is found by minimizing the sum of squared errors, Σe²

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall
Ch. 12-6


Standard Multiple Regression
Assumptions


The values xi and the error terms εi are
independent



The error terms are random variables with mean 0 and a constant variance, σ².

E[εi] = 0 and E[εi²] = σ²    for i = 1, …, n

(The constant variance property is called homoscedasticity)

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall


Ch. 12-7


Standard Multiple Regression
Assumptions
(continued)


The random error terms, εi , are not correlated
with one another, so that

E[εiεj] = 0    for all i ≠ j

It is not possible to find a set of numbers, c0, c1, …, cK, such that

c0 + c1x1i + c2x2i + … + cKxKi = 0
(This is the property of no linear relation for
the Xj’s)
Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-8
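The "no linear relation" assumption can be checked numerically: it holds exactly when the design matrix has full column rank. A minimal sketch (not from the slides; the five observations below are simply the first rows of the pie-sales data used later in the chapter):

import numpy as np

x1 = np.array([5.50, 7.50, 8.00, 8.00, 6.80])      # price values (first five weeks)
x2 = np.array([3.3, 3.3, 3.0, 4.5, 3.0])           # advertising values (first five weeks)
X = np.column_stack([np.ones_like(x1), x1, x2])    # columns: constant, x1, x2

# Full column rank means no constants c0, c1, c2 give c0 + c1*x1i + c2*x2i = 0 for every i
print(np.linalg.matrix_rank(X) == X.shape[1])      # True -> the assumption holds for this sample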


Example:
2 Independent Variables



A distributor of frozen dessert pies wants to
evaluate factors thought to influence demand





Dependent variable:
Pie sales (units per week)
Independent variables:
Advertising ($100’s)
Price (in $)

Data are collected for 15 weeks

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-9


Pie Sales Example
Week   Pie Sales   Price ($)   Advertising ($100s)
  1      350         5.50          3.3
  2      460         7.50          3.3
  3      350         8.00          3.0
  4      430         8.00          4.5
  5      350         6.80          3.0
  6      380         7.50          4.0
  7      430         4.50          3.0
  8      470         6.40          3.7
  9      450         7.00          3.5
 10      490         5.00          4.0
 11      340         7.20          3.5
 12      300         7.90          3.2
 13      440         5.90          4.0
 14      450         5.00          3.5
 15      300         7.00          2.7

Multiple regression equation:

Sales = b0 + b1 (Price) + b2 (Advertising)

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-10


12.2



Estimating a Multiple Linear
Regression Equation

Excel will be used to generate the coefficients and
measures of goodness of fit for multiple regression


Data / Data Analysis / Regression

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-11
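For readers who prefer to reproduce the Excel output in Python, here is a minimal sketch using the statsmodels package (an assumption of this note, not part of the original slides), with the 15 weekly observations from the Pie Sales example:

import numpy as np
import statsmodels.api as sm

sales = np.array([350, 460, 350, 430, 350, 380, 430, 470,
                  450, 490, 340, 300, 440, 450, 300], dtype=float)
price = np.array([5.50, 7.50, 8.00, 8.00, 6.80, 7.50, 4.50, 6.40,
                  7.00, 5.00, 7.20, 7.90, 5.90, 5.00, 7.00])
advertising = np.array([3.3, 3.3, 3.0, 4.5, 3.0, 4.0, 3.0, 3.7,
                        3.5, 4.0, 3.5, 3.2, 4.0, 3.5, 2.7])

X = sm.add_constant(np.column_stack([price, advertising]))   # adds the intercept column
model = sm.OLS(sales, X).fit()
print(model.params)      # approx. [306.526, -24.975, 74.131]: intercept, Price, Advertising
print(model.rsquared)    # approx. 0.52148
print(model.summary())   # regression table comparable to the Excel output on the next slide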



Multiple Regression Output
Regression Statistics
  Multiple R          0.72213
  R Square            0.52148
  Adjusted R Square   0.44172
  Standard Error      47.46341
  Observations        15

Sales = 306.526 - 24.975(Price) + 74.131(Advertising)

ANOVA          df          SS          MS         F     Significance F
Regression      2   29460.027   14730.013   6.53861     0.01201
Residual       12   27033.306    2252.776
Total          14   56493.333

              Coefficients   Standard Error    t Stat    P-value   Lower 95%   Upper 95%
Intercept        306.52619        114.25389   2.68285    0.01993    57.58835   555.46404
Price            -24.97509         10.83213  -2.30565    0.03979   -48.57626    -1.37392
Advertising       74.13096         25.96732   2.85478    0.01449    17.55303   130.70888

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall
Ch. 12-12


The Multiple Regression Equation
Sales = 306.526 - 24.975(Price) + 74.131(Advertising)
where
Sales is in number of pies per week
Price is in $
Advertising is in $100’s.

b1 = -24.975: sales will decrease, on average, by 24.975 pies per week for each $1 increase in selling price, net of the effects of changes due to advertising

b2 = 74.131: sales will increase, on average, by 74.131 pies per week for each $100 increase in advertising, net of the effects of changes due to price

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall
Ch. 12-13


12.3



Coefficient of Determination, R2
Reports the proportion of total variation in y
explained by all x variables taken together

R² = SSR / SST

(SSR = regression sum of squares, SST = total sum of squares)


This is the ratio of the explained variability to total
sample variability

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-14
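A one-line check of this ratio, using the ANOVA values from the Excel output shown earlier (the snippet is illustrative, not from the slides):

ssr, sst = 29460.027, 56493.333    # regression and total sums of squares from the ANOVA table
print(ssr / sst)                   # approx. 0.52148, matching the reported R Square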



Coefficient of Determination, R2
(continued)
R² = SSR / SST = 29460.0 / 56493.3 = .52148

52.1% of the variation in pie sales is explained by the variation in price and advertising

(Excel regression output repeated from Ch. 12-12)

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-15


Estimation of Error Variance


Consider the population regression model

Yi = β0 + β1x1i + β2x2i + … + βKxKi + εi


The unbiased estimate of the variance of the errors is
se² = Σ ei² / (n − K − 1) = SSE / (n − K − 1)     (sum over i = 1, …, n)

where ei = yi − ŷi


The square root of the variance, se , is called the
standard error of the estimate

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-16
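Plugging the pie-sales ANOVA numbers into this formula (an illustrative check, not from the slides):

import math

sse = 27033.306                    # SSE, the Residual sum of squares from the ANOVA table
n, K = 15, 2
s_e = math.sqrt(sse / (n - K - 1))
print(s_e)                         # approx. 47.463, the "Standard Error" reported by Excel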


Standard Error, se
se = 47.463

The magnitude of this value can be compared to the average y value

(Excel regression output repeated from Ch. 12-12)

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall


Ch. 12-17


Adjusted Coefficient of Determination, R̄²


R2 never decreases when a new X variable is
added to the model, even if the new variable is not
an important predictor variable




This can be a disadvantage when comparing
models

What is the net effect of adding a new variable?




We lose a degree of freedom when a new X
variable is added
Did the new X variable add enough
explanatory power to offset the loss of one
degree of freedom?

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall


Ch. 12-18


Adjusted Coefficient of Determination, R̄²
(continued)


Used to correct for the fact that adding non-relevant
independent variables will still reduce the error sum of
squares
R̄² = 1 − [SSE / (n − K − 1)] / [SST / (n − 1)]

(where n = sample size, K = number of independent variables)


Adjusted R2 provides a better comparison between multiple regression models with different numbers of independent variables



Penalizes excessive use of unimportant independent variables



Smaller than R2


Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-19
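The same adjustment computed from the pie-sales ANOVA values (an illustrative check, not from the slides):

sse, sst = 27033.306, 56493.333    # error and total sums of squares
n, K = 15, 2                       # sample size and number of independent variables
r2_adj = 1 - (sse / (n - K - 1)) / (sst / (n - 1))
print(r2_adj)                      # approx. 0.44172, matching the reported Adjusted R Square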


Adjusted Coefficient of Determination, R̄²
(continued)

R̄² = .44172

44.2% of the variation in pie sales is explained by the variation in price and advertising, taking into account the sample size and number of independent variables

(Excel regression output repeated from Ch. 12-12)

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-20


Coefficient of Multiple
Correlation


The coefficient of multiple correlation is the correlation
between the predicted value and the observed value of
the dependent variable

R = r(ŷ, y) = √R²







Is the square root of the multiple coefficient of
determination
Used as another measure of the strength of the linear
relationship between the dependent variable and the
independent variables
Comparable to the correlation between Y and X in
simple regression

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-21
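A quick numeric check of this relationship for the pie-sales model (illustrative, not from the slides):

import math

r_squared = 0.52148                # R Square from the regression output
print(math.sqrt(r_squared))        # approx. 0.72213, the "Multiple R" in the output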


Evaluating Individual
Regression Coefficients

12.4



Use t-tests for individual coefficients



Shows if a specific independent variable is
conditionally important




Hypotheses:


H0: βj = 0 (no linear relationship)



H1: βj ≠ 0 (linear relationship does exist
between xj and y)

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-22


Evaluating Individual
Regression Coefficients
(continued)

H0: βj = 0 (no linear relationship)
H1: βj ≠ 0 (linear relationship does exist between xj and y)
Test Statistic:

t = (bj − 0) / sbj      (df = n − K − 1)

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-23
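A sketch of this t test for the Price coefficient, using the values from the Excel output (scipy is an assumed package; the snippet is illustrative, not part of the slides):

from scipy import stats

b_price, s_b = -24.97509, 10.83213         # Price coefficient and its standard error
df = 15 - 2 - 1
t = (b_price - 0) / s_b                    # approx. -2.3057
p_value = 2 * stats.t.sf(abs(t), df=df)    # two-tailed p-value, approx. 0.0398
t_crit = stats.t.ppf(0.975, df=df)         # critical value approx. 2.1788 for alpha = .05
print(t, p_value, t_crit)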


Evaluating Individual
Regression Coefficients
(continued)
t value for Price is t = -2.306, with p-value .0398

t value for Advertising is t = 2.855, with p-value .0145

(Excel regression output repeated from Ch. 12-12)

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall

Ch. 12-24


Example: Evaluating Individual
Regression Coefficients
From Excel output:

              Coefficients   Standard Error    t Stat    P-value
Price            -24.97509         10.83213  -2.30565    0.03979
Advertising       74.13096         25.96732   2.85478    0.01449

H0: βj = 0    H1: βj ≠ 0

d.f. = 15 − 2 − 1 = 12
α = .05
t12,.025 = 2.1788

The test statistic for each variable falls in the rejection region (p-values < .05)

Decision: Reject H0 for each variable

Conclusion: There is evidence that both Price and Advertising affect pie sales at α = .05

(Figure: two-tailed rejection regions with critical values -2.1788 and 2.1788)

Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall
Ch. 12-25

