
Business Analytics: Data Analysis and Decision Making, 5th Edition, by Wayne L. Winston, Chapter 12


© 2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.

Business Analytics: Data Analysis and Decision Making

Chapter 12
Time Series Analysis and Forecasting


Introduction
 Forecasting is a very difficult task, both in the short run and in the
long run.
 Analysts search for patterns or relationships in historical data and then
make forecasts.
 There are two problems with this approach:
   It is not always easy to uncover historical patterns or relationships.
     It is often difficult to separate the noise, or random behavior, from the underlying patterns.
     Some forecasts may attribute importance to patterns that are in fact random variations and are unlikely to repeat themselves.
   There are no guarantees that past patterns will continue in the future.



Forecasting Methods:
An Overview


There are many forecasting methods available, and there is little
agreement as to the best forecasting method.





The methods can be divided into three groups:
1. Judgmental methods
2. Extrapolation (or time series) methods
3. Econometric (or causal) methods

The first group is basically nonquantitative; the last two are quantitative.



Extrapolation Models
 Extrapolation models are quantitative models that use past data of
a time series variable to forecast future values of the variable.
 Many extrapolation models are available:
 Trend-based regression
 Autoregression
 Moving averages
 Exponential smoothing
 All of these methods look for patterns in the historical series and then
extrapolate these patterns into the future.
 Complex models are not always better than simpler models.
 Simpler models track only the most basic underlying patterns and can be
more flexible and accurate in forecasting the future.



Econometric Models
 Econometric models, also called causal or regression-based models,
use regression to forecast a time series variable by using other
explanatory time series variables.
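 Prediction from regression equation, where X_1, ..., X_k are the explanatory time series variables (possibly lagged):
$\hat{Y}_t = a + b_1 X_{1t} + b_2 X_{2t} + \cdots + b_k X_{kt}$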

 Causal regression models present mathematical challenges, including:
 Determining the appropriate “lags” for the regression equation
 Deciding whether to include lags of the dependent variable as explanatory
variables

 Autocorrelation (correlation of a variable with itself) and cross-correlation
(correlation of a variable with a lagged version of another variable)



Combining Forecasts
 This method combines two or more forecasts to obtain the final
forecast.

 The reasoning is simple: The forecast errors from different forecasting
methods might cancel one another.

 Forecasts that are combined can be of the same general type, or of
different types.

 The number of forecasts to combine and the weights to use in
combining them have been the subject of several research studies.
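
The idea of combining forecasts can be sketched as a weighted average. This is only an illustration: the function name `combine_forecasts`, the equal-weight default, and the sample numbers are assumptions, not something the slides prescribe.

```python
def combine_forecasts(forecast_sets, weights=None):
    """Combine several forecast series into one by a weighted average.

    forecast_sets: list of equally long lists, one per forecasting method.
    weights: one weight per method, summing to 1 (equal weights if omitted).
    """
    if not forecast_sets:
        raise ValueError("need at least one forecast series")
    if len({len(f) for f in forecast_sets}) != 1:
        raise ValueError("all forecast series must have the same length")
    if weights is None:
        # Hypothetical default: weight every method equally.
        weights = [1 / len(forecast_sets)] * len(forecast_sets)
    if len(weights) != len(forecast_sets):
        raise ValueError("need exactly one weight per forecast series")
    if abs(sum(weights) - 1) > 1e-9:
        raise ValueError("weights must sum to 1")
    # zip(*forecast_sets) groups the forecasts made for the same period.
    return [sum(w * f for w, f in zip(weights, period))
            for period in zip(*forecast_sets)]


# Made-up forecasts from two methods for two periods.
method_a = [100, 110]
method_b = [120, 100]
print(combine_forecasts([method_a, method_b]))
```

If one method tends to forecast too high and the other too low, the averaged forecast sits between them, which is the error-cancelling effect described above.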



Components of Time Series Data
(slide 1 of 4)


 If observations increase or decrease regularly through time, the time
series has a trend.

 Linear trend: occurs if the observations increase by the same amount from period to period.
 Exponential trend: occurs when observations grow by a roughly constant percentage each period, so the increases themselves get larger over time.
 S-shape trend: occurs when it takes a while for observations to start increasing, but then a rapid increase occurs, before finally tapering off to a fairly constant level.



Components of Time Series Data
(slide 2 of 4)

 If a time series has a seasonal component, it exhibits seasonality—that is, the
same seasonal pattern tends to repeat itself every year.



Components of Time Series Data
(slide 3 of 4)

 A time series has a cyclic component when business cycles affect
the variables in similar ways.
 The cyclic component is more difficult to predict than the seasonal
component, because seasonal variation is much more regular.


 The length of the business cycle varies, sometimes substantially.
 The length of a seasonal cycle is generally one year, while the length of a
business cycle is generally longer than one year and its actual length is
difficult to predict.



Components of Time Series Data
(slide 4 of 4)

 Random variation (or noise) is the unpredictable component that
gives most time series graphs their irregular, zigzag appearance.
 A time series can be determined only to a certain extent by its trend,
seasonal, and cyclic components; other factors determine the rest.

 These other factors combine to create a certain amount of unpredictability
in almost all time series.



Measures of Accuracy
(slide 1 of 2)

 The forecast error is the difference between the actual value
and the forecast. It is denoted by E with appropriate subscripts.
 Forecasting software packages typically report several summary
measures of the forecast errors:

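 MAE (Mean Absolute Error): $\mathrm{MAE} = \frac{1}{N}\sum_{t=1}^{N} |E_t|$

 RMSE (Root Mean Square Error): $\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N} E_t^2}$

 MAPE (Mean Absolute Percentage Error): $\mathrm{MAPE} = 100\% \times \frac{1}{N}\sum_{t=1}^{N} \left|\frac{E_t}{Y_t}\right|$

Here $E_t$ is the forecast error in period $t$, $Y_t$ is the actual value, and $N$ is the number of periods for which forecasts are made.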

 One other measure of forecast errors is the average of the errors.
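
A minimal sketch of these summary measures, assuming the error is defined as actual minus forecast (as stated above). The function name `forecast_error_summary` and the sample numbers are hypothetical; this is not the output format of any particular software package.

```python
import math


def forecast_error_summary(actuals, forecasts):
    """Return MAE, RMSE, MAPE (in percent) and the mean error."""
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("need two non-empty series of equal length")
    # E_t = actual value minus forecast, one error per period.
    errors = [y - f for y, f in zip(actuals, forecasts)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE divides by the actual values, so they must all be nonzero.
    mape = 100 * sum(abs(e / y) for e, y in zip(errors, actuals)) / n
    mean_error = sum(errors) / n
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "mean_error": mean_error}


# Made-up actual values and forecasts for three periods.
print(forecast_error_summary([100, 200, 400], [110, 190, 400]))
```

RMSE squares the errors before averaging, so it penalizes a few large misses more heavily than MAE does. MAPE is unit-free, which makes it easier to compare across series measured on different scales.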


Measures of Accuracy
(slide 2 of 2)

 Some forecasting software packages choose the best model from a
given class by minimizing MAE, RMSE, or MAPE.
 However, small values of these measures guarantee only that the model
tracks the historical observations well.

 There is still no guarantee that the model will forecast future values
accurately.

 Unlike residuals from the regression equation, forecast errors are not
guaranteed to always average to zero.
 If the average of the forecast errors is negative, this implies a bias, or that
the forecasts tend to be too high.

 If the average is positive, the forecasts tend to be too low.




Testing for Randomness
(slide 1 of 2)

 All forecasting models have the general form:
$Y_t = \text{Fitted value}_t + \text{Residual}_t$

 The fitted value is the part calculated from past data and any other
available information.

 The residual is the forecast error.
 The fitted value should include all components of the original series that
can possibly be forecast, and the leftover residuals should be unpredictable
noise.

 The simplest way to determine whether a time series of residuals is
random noise is to examine time series graphs of residuals visually—
although this is not always reliable.



Testing for Randomness
(slide 2 of 2)

 Some common nonrandom patterns are shown in the slide's figure.
[Figure: examples of common nonrandom patterns in time series of residuals]




The Runs Test
 The runs test is a quantitative method of testing for randomness. It is
a formal test of the null hypothesis of randomness.

 First, choose a base value, which could be the average value of the series,
the median value, or even some other value.

 Then a run is defined as a consecutive series of observations that remain
on one side of this base level.

 If there are too many or too few runs in the series, the null hypothesis of
randomness can be rejected.
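
The steps above can be sketched in code. This sketch uses the standard large-sample normal approximation for the number of runs; the slides do not show how StatTools computes its result, so the formulas, the function name `runs_test`, and the choice to drop observations exactly equal to the base value are assumptions.

```python
import math


def runs_test(series, base=None):
    """Runs test for randomness around a base value (default: the mean)."""
    if base is None:
        base = sum(series) / len(series)
    # True = above the base, False = below; ties are dropped (assumption).
    above = [x > base for x in series if x != base]
    n_above = sum(above)
    n_below = len(above) - n_above
    if n_above == 0 or n_below == 0:
        raise ValueError("need observations on both sides of the base value")
    # A new run starts every time the series crosses the base value.
    runs = 1 + sum(1 for prev, cur in zip(above, above[1:]) if prev != cur)
    total = n_above + n_below
    product = 2 * n_above * n_below
    expected_runs = product / total + 1
    variance = product * (product - total) / (total ** 2 * (total - 1))
    if variance <= 0:
        raise ValueError("too few observations for the normal approximation")
    z = (runs - expected_runs) / math.sqrt(variance)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return {"runs": runs, "expected_runs": expected_runs,
            "z": z, "p_value": p_value}


# A made-up series that zigzags across 0 every period: too many runs.
print(runs_test([1, -1, 1, -1, 1, -1, 1, -1, 1, -1], base=0))
```

A large positive z value means more runs than randomness would produce (too much zigzagging). A large negative z value means fewer runs, as happens when the series stays on one side of the base for long stretches.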



Example 12.1:
Stereo Sales.xlsx

(slide 1 of 2)

 Objective: To use StatTools’s Runs Test procedure to check
whether the residuals from this simple forecasting model
represent random noise.

 Solution: Data file contains monthly sales for a chain of stereo
retailers from the beginning of 2009 to the end of 2012, during
which there was no upward or downward trend in sales and no
clear seasonality.


 A simple forecast model of sales is to use the average of the
series, 182.67, as a forecast of sales for each month.

 The residuals for this forecasting model are found by subtracting
the average from each observation.

 Use the runs test to see whether there are too many or too few
runs around the base of 0.

 Select Runs Test for Randomness from the StatTools Time Series
and Forecasting dropdown, choose Residual as the variable to
analyze, and choose Mean of Series as the cutoff value.


Example 12.1:
Stereo Sales.xlsx

(slide 2 of 2)

 The resulting output is shown in the slide's figure.
[Figure: StatTools runs test output for the residuals]



Autocorrelation
 Another way to check for randomness of a time series of residuals is to
examine the autocorrelations of the residuals.
 An autocorrelation is a type of correlation used to measure whether values
of a time series are related to their own past values.


 In positive autocorrelation, large observations tend to follow large observations,
and small observations tend to follow small observations.

 The autocorrelation of lag k is essentially the correlation between the
original series and the lag k version of the series.

 Lags are previous observations, removed by a certain number of periods from the
present time.

 To lag a time series in a spreadsheet by one month, “push down” the series by one row, so that each month's row also holds the previous month's value.
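
A minimal sketch of lagging and of the lag k autocorrelation, using the common textbook formula in which deviations are taken from the overall mean of the series. The function names `lag` and `autocorrelation` are hypothetical, and StatTools may differ in small details.

```python
def lag(series, k):
    """Push the series down by k rows; the first k entries have no value."""
    return [None] * k + list(series[:len(series) - k])


def autocorrelation(series, k):
    """Lag k autocorrelation, with deviations from the overall mean."""
    n = len(series)
    if not 1 <= k < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    # Pair each observation with the one k periods earlier.
    numerator = sum(deviations[t] * deviations[t - k] for t in range(k, n))
    denominator = sum(d * d for d in deviations)
    return numerator / denominator


values = [1, 2, 3, 4, 5]  # made-up series with a steady upward drift
print(lag(values, 1))
print(autocorrelation(values, 1))
```

Because large values follow large values in this small series, the lag 1 autocorrelation comes out positive.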



Example 12.1 (Continued):
Stereo Sales.xlsx (slide 1 of 2)
 Objective: To examine the autocorrelations of the residuals
from the forecasting model for evidence of nonrandomness.
 Solution: Use StatTools’s Autocorrelation procedure, found on
the StatTools Time Series and Forecasting dropdown list.
 Specify the time series variable (Residual), the number of lags you want, and whether you want a chart of the autocorrelations, called a correlogram.

 It is common practice to ask for no more lags than 25% of the number of
observations.

 Any autocorrelation that is larger than two standard errors in magnitude is worth your attention.

 One measure of the lag 1 autocorrelation is provided by the Durbin-Watson (DW) statistic.

 A DW value of 2 indicates no lag 1 autocorrelation.
 A DW value less than 2 indicates positive autocorrelation.
 A DW value greater than 2 indicates negative autocorrelation.
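
Both checks can be sketched as follows. The DW formula is the standard one (sum of squared successive differences of the residuals divided by the sum of squared residuals). The standard error of 1/sqrt(n) is a common approximation under randomness and is an assumption here, as are the function names and the sample numbers.

```python
import math


def durbin_watson(residuals):
    """Durbin-Watson statistic for a series of residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def lags_worth_attention(autocorrelations, n_obs):
    """Lags whose autocorrelation exceeds two standard errors in magnitude.

    autocorrelations[0] is lag 1, autocorrelations[1] is lag 2, and so on.
    The standard error is approximated by 1 / sqrt(n_obs) (assumption).
    """
    standard_error = 1 / math.sqrt(n_obs)
    return [k for k, r in enumerate(autocorrelations, start=1)
            if abs(r) > 2 * standard_error]


# Made-up residuals that flip sign every period: DW is well above 2.
print(durbin_watson([1, -1, 1, -1]))
# Made-up autocorrelations for lags 1-3 from 48 observations.
print(lags_worth_attention([0.5, 0.1, -0.4], 48))
```

DW is approximately 2(1 - r1), where r1 is the lag 1 autocorrelation, which is why values below 2 point to positive autocorrelation and values above 2 to negative autocorrelation.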


Example 12.1 (Continued):
Stereo Sales.xlsx (slide 2 of 2)
 The autocorrelations and correlogram of the residuals are shown in the slide's figure.
[Figure: table of autocorrelations and correlogram of the residuals]



Regression-Based Trend Models
 Many time series follow a long-term trend except for random variation.

 This trend can be upward or downward.
 A straightforward way to model this trend is to estimate a regression
equation for Yt, using time t as the single explanatory variable.

 The two most frequently used trend models are the linear trend and the
exponential trend.




Linear Trend
 A linear trend means that the time series variable changes by a
constant amount each time period.
 The equation for the linear trend model, with intercept a, slope b, and error term e_t, is:
$Y_t = a + bt + e_t$
 The interpretation of b is that it represents the expected change in the
series from one period to the next.

 If b is positive, the trend is upward.
 If b is negative, the trend is downward.

 The intercept term a is less important: It literally represents the expected
value of the series at time t = 0.

 A graph of the time series indicates whether a linear trend is likely to
provide a good fit.
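
A minimal sketch of estimating a and b by ordinary least squares, with time coded as t = 1, 2, ..., n. The function name `fit_linear_trend` and the sample series are hypothetical; a spreadsheet regression tool would give the same estimates.

```python
def fit_linear_trend(values):
    """Least-squares fit of Y_t = a + b*t with t = 1, 2, ..., n."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    times = range(1, n + 1)
    t_mean = (n + 1) / 2
    y_mean = sum(values) / n
    # Slope: covariance of t and Y divided by the variance of t.
    b = (sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
         / sum((t - t_mean) ** 2 for t in times))
    # Intercept: the fitted line passes through the point of means.
    a = y_mean - b * t_mean
    return a, b


series = [3, 5, 7, 9]  # made-up series that rises by 2 every period
a, b = fit_linear_trend(series)
fitted = [a + b * t for t in range(1, len(series) + 1)]
residuals = [y - f for y, f in zip(series, fitted)]
print(a, b)
```

The residuals from the fit can then be checked for randomness with a runs test or with autocorrelations.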



Example 12.2:
US Population.xlsx

(slide 1 of 2)

 Objective: To fit a linear trend line to monthly population and
examine its residuals for randomness.

 Solution: Data file contains monthly population data for the United States from January 1952 to December 2011. During this period, the population has increased steadily from about 156 million to about 313 million.

 To estimate the trend with regression, use a numeric time variable
representing consecutive months 1 through 720.

 Then run a simple regression of Population versus Time.



Example 12.2:
US Population.xlsx

(slide 2 of 2)

 Use Excel's Trendline tool to superimpose a trend line on the time series graph.

 Then plot the residuals.



Exponential Trend
 An exponential trend is appropriate when the time series changes by a
constant percentage (as opposed to a constant dollar amount) each
period.
 The appropriate regression equation contains a multiplicative error term u_t:
$Y_t = c\,e^{bt}\,u_t$
 This equation is not useful for estimation; for that, a linear equation is required.

 You can achieve linearity by taking natural logarithms of both sides of the equation, where a = ln(c) and e_t = ln(u_t):
$\ln(Y_t) = a + bt + e_t$

 The coefficient b (expressed as a percentage) is approximately the percentage
change per period. For example, if b = 0.05, then the series is increasing by
approximately 5% per period.

 If a time series exhibits an exponential trend, then a plot of its
logarithm should be approximately linear.
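
A minimal sketch of fitting an exponential trend by regressing ln(Y_t) on t, then converting back. The function name `fit_exponential_trend` and the synthetic series are assumptions for illustration only.

```python
import math


def fit_exponential_trend(values):
    """Fit Y_t = c * exp(b*t), t = 1..n, by least squares on ln(Y_t)."""
    if len(values) < 2 or any(y <= 0 for y in values):
        raise ValueError("need at least two strictly positive observations")
    logs = [math.log(y) for y in values]
    n = len(logs)
    times = range(1, n + 1)
    t_mean = (n + 1) / 2
    log_mean = sum(logs) / n
    b = (sum((t - t_mean) * (v - log_mean) for t, v in zip(times, logs))
         / sum((t - t_mean) ** 2 for t in times))
    a = log_mean - b * t_mean
    # a = ln(c), so c is recovered by exponentiating the intercept.
    return math.exp(a), b


# Synthetic series with no noise: c = 2 and b = 0.05.
series = [2 * math.exp(0.05 * t) for t in range(1, 11)]
c, b = fit_exponential_trend(series)
print(c, b)
# Exact growth per period is exp(b) - 1, slightly above b itself.
print(math.exp(b) - 1)
```

The exact percentage change per period is exp(b) - 1. For small b this is close to b, which is why b = 0.05 is read as roughly 5% growth per period.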


