
Business Analytics: Data Analysis and Decision Making, 5th Edition, by Wayne L. Winston, Chapter 5


© 2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.

Business Analytics:
Data Analysis and Decision Making

Chapter 5
Normal, Binomial, Poisson, and Exponential Distributions


Introduction
 Several specific distributions commonly occur in a variety of business
situations:
 Normal distribution: a continuous distribution characterized by a symmetric
bell-shaped curve

 Binomial distribution: a discrete distribution that is relevant when we
sample from a population with only two types of members or when we
perform a series of independent, identical experiments with only two
possible outcomes

 Poisson distribution: a discrete distribution that describes the number of
events in a given period of time

 Exponential distribution: a continuous distribution that describes the
times between events



The Normal Distribution
 The single most important distribution in statistics is the normal
distribution.
 It is a continuous distribution and is the basis of the familiar symmetric bell-shaped curve.

 Any particular normal distribution is specified by its mean and standard
deviation.

 By changing the mean, the normal curve shifts to the right or left.
 By changing the standard deviation, the curve becomes more or less spread out.

 There are really many normal distributions, not just a single one.
 The normal distribution is a two-parameter family, where the two parameters are
the mean and standard deviation.



Continuous Distributions and
Density Functions (slide 1 of 2)
 For continuous distributions, instead of a list of possible values, there
is a continuum of possible values, such as all values between 0 and
100 or all values greater than 0.
 Instead of assigning probabilities to each individual value in the continuum,
the total probability of 1 is spread over this continuum.


 The key to this spreading is called a density function, which acts like a
histogram.

 The higher the value of the density function, the more likely this region of the
continuum is.



Continuous Distributions and
Density Functions (slide 2 of 2)
 A density function, usually denoted by f(x), specifies the probability
distribution of a continuous random variable X.
 The higher f(x) is, the more likely values near x are.
 The total area between the graph of f(x) and the horizontal axis, which
represents the total probability, is equal to 1.

 f(x) is nonnegative for all possible values of X.
 Probabilities are found from a density function as areas under the curve.



The Normal Density
 The normal distribution is a continuous distribution with possible
values ranging over the entire number line—from “minus infinity” to
“plus infinity.”
 Only a relatively small range has much chance of occurring.
 The normal density function is actually quite complex, in spite of its “nice”
bell-shaped appearance.

 The formula for the normal density function, where μ and σ are the
mean and standard deviation, is:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x-\mu)^2 / (2\sigma^2)}$, for $-\infty < x < +\infty$
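As a rough illustration of the density formula, the sketch below codes it by hand and compares it with Python's standard-library `statistics.NormalDist`. The parameters (mean 100, standard deviation 15) and the function name `normal_pdf` are hypothetical choices for illustration, not values from the slides. It also approximates the total area under the curve numerically, which should come out very close to 1.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# Hypothetical parameters, chosen only for illustration.
mu, sigma = 100.0, 15.0

# The hand-coded formula should agree with the library's density.
formula_value = normal_pdf(110.0, mu, sigma)
library_value = NormalDist(mu, sigma).pdf(110.0)

# Midpoint Riemann sum over mu +/- 8 sigma approximates the total area.
steps = 10_000
lo, hi = mu - 8 * sigma, mu + 8 * sigma
width = (hi - lo) / steps
total_area = width * sum(
    normal_pdf(lo + (i + 0.5) * width, mu, sigma) for i in range(steps)
)

print(formula_value, library_value, total_area)
```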



Standardizing: Z-Values
 The standard normal distribution has mean 0 and standard deviation
1, so it is denoted by N(0,1).
 It is also referred to as the Z distribution.
 To standardize a variable, subtract its mean and then divide the
difference by the standard deviation: $Z = (X - \mu)/\sigma$
 A Z-value is the number of standard deviations to the right or left of the
mean.

 If Z is positive, the original value is to the right of the mean.
 If Z is negative, the original value is to the left of the mean.



Example 5.1:
Standardizing.xlsx
 Objective: To use Excel® to standardize annual returns of various
mutual funds.
 Solution: Data set includes the annual returns of 30 mutual funds.
 Calculate the mean and standard deviation of the annual returns and
then use the standardizing formula to calculate the corresponding Z-value
for each fund.

 OR calculate the Z-values directly, using Excel’s STANDARDIZE
function.
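A minimal sketch of the same standardizing step outside Excel, assuming a short list of made-up returns (the workbook's 30 funds are not reproduced here). It uses the sample standard deviation, which is an assumption about how the workbook computes it.

```python
from statistics import mean, stdev

# Hypothetical annual returns, for illustration only.
returns = [0.12, 0.07, -0.03, 0.15, 0.09, 0.01]

avg = mean(returns)
sd = stdev(returns)  # sample standard deviation (assumed convention)

# Same arithmetic as STANDARDIZE(x, mean, standard_dev) in Excel.
z_values = [(r - avg) / sd for r in returns]

for r, z in zip(returns, z_values):
    print(f"return {r:6.2f} -> Z = {z:6.2f}")
```

By construction, the standardized values have mean 0 and standard deviation 1, whatever the original units were.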



Normal Tables and Z-Values
 A common use for Z-values and the standard normal distribution is in
calculating probabilities and percentiles by the traditional method.
 This method is based on a table of the standard normal distribution found in
many statistics textbooks. An example of such a table is given below.

 The body of the table contains probabilities.
 The left and top margins contain possible values.



Normal Calculations in Excel
 Two types of calculations are typically made with normal distributions:
finding probabilities and finding percentiles.

 The functions used for normal probability calculations are NORMDIST and
NORMSDIST.

 The main difference between these is that the one with the
“S” (for standardized) applies only to N(0, 1) calculations, whereas NORMDIST
applies to any normal distribution.

 Percentile calculations that take a probability and return a value are often
called inverse calculations.

 The Excel functions for these are named NORMINV and NORMSINV.
 Again, the “S” in the second of these indicates that it
applies to the standard normal distribution.



Example 5.2:
Normal Calculations.xlsx (slide 1 of 2)

 Objective: To calculate probabilities and percentiles for standard
normal and general normal distributions in Excel.
 Solution: For “less than” probabilities, use NORMDIST or NORMSDIST
directly.
 For “greater than” probabilities, subtract the NORMDIST or
NORMSDIST function from 1.
 For “between” probabilities, subtract the two NORMDIST or
NORMSDIST functions.
 For percentile calculations, use the NORMINV or NORMSINV function
with the specified probability as the first argument.
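The four kinds of calculation listed above can be sketched with Python's `statistics.NormalDist`, whose `cdf` and `inv_cdf` methods play roughly the roles of NORMDIST and NORMINV. The N(75, 8) distribution and the cutoffs 70 and 80 are hypothetical values chosen for illustration, not the workbook's.

```python
from statistics import NormalDist

std_normal = NormalDist()              # N(0, 1), as with NORMSDIST / NORMSINV
general = NormalDist(mu=75, sigma=8)   # hypothetical general normal

p_less = general.cdf(70)                         # "less than" probability
p_greater = 1 - general.cdf(80)                  # "greater than" probability
p_between = general.cdf(80) - general.cdf(70)    # "between" probability
pct_95 = general.inv_cdf(0.95)                   # 95th percentile
z_975 = std_normal.inv_cdf(0.975)                # standard normal percentile

print(p_less, p_greater, p_between, pct_95, z_975)
```

The three probabilities partition the number line, so they sum to 1, which is a quick sanity check on any such calculation.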





Empirical Rules Revisited
 Three empirical rules apply to many data sets:

 About 68% of the data fall within one standard deviation of the mean.
 About 95% fall within two standard deviations of the mean.
 Almost all fall within three standard deviations of the mean.
 For these rules to hold with real data, the distribution of the data must
be at least approximately symmetric and bell-shaped.
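For an exactly normal distribution, the three percentages can be checked directly from the standard normal distribution. This is a small sketch using Python's standard library rather than Excel.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal, N(0, 1)

# Probability of falling within k standard deviations of the mean.
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, prob in within.items():
    print(f"within {k} standard deviation(s): {prob:.4f}")
```

The results are roughly 0.68, 0.95, and 0.997, matching the "about 68%", "about 95%", and "almost all" wording of the rules.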



Weighted Sums of Normal
Random Variables
 One very attractive property of the normal distribution is that if you
create a weighted sum of normally distributed random variables, the
weighted sum is also normally distributed.
 This is true even if the random variables are not independent, provided
they are jointly normally distributed.
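 If $X_1$ through $X_n$ are n independent and normally distributed random
variables with common mean $\mu$ and common standard deviation $\sigma$, then the
sum $X_1 + \dots + X_n$ is normally distributed with mean $n\mu$, variance $n\sigma^2$, and
standard deviation $\sqrt{n}\,\sigma$.

 If $a_1$ through $a_n$ are any constants and each $X_i$ is normal with mean $\mu_i$
and standard deviation $\sigma_i$, then the weighted sum $a_1X_1 + \dots + a_nX_n$ is
normally distributed with mean $a_1\mu_1 + \dots + a_n\mu_n$ and, when the $X_i$ are
independent, variance $a_1^2\sigma_1^2 + \dots + a_n^2\sigma_n^2$.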
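The mean and variance rules for a weighted sum can be sketched as a small helper. The function name `weighted_sum` and the example distributions are hypothetical, and the variance formula used here assumes the variables are independent.

```python
import math
from statistics import NormalDist


def weighted_sum(weights, dists):
    """Distribution of sum(a_i * X_i) for independent normal X_i."""
    mean = sum(a * d.mean for a, d in zip(weights, dists))
    variance = sum(a * a * d.variance for a, d in zip(weights, dists))
    return NormalDist(mean, math.sqrt(variance))


# Hypothetical example: equal weights on N(10, 3) and N(20, 4).
mix = weighted_sum([0.5, 0.5], [NormalDist(10, 3), NormalDist(20, 4)])

# Plain sum of n = 4 independent N(10, 3) variables (all weights equal 1).
total = weighted_sum([1, 1, 1, 1], [NormalDist(10, 3)] * 4)

print(mix.mean, mix.stdev)      # mean 15, standard deviation 2.5
print(total.mean, total.stdev)  # mean 4 * 10, standard deviation sqrt(4) * 3
```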



Example 5.3:
Personnel Decisions.xlsx
 Objective: To determine test scores that can be used to accept or
reject job applicants at ZTel.
 Solution: Scores of all applicants are approximately normally
distributed with mean 525 and standard deviation 55.
 Calculate the percentage of applicants who are automatic accepts or
rejects, given the current standards of 600 for automatic accept and
425 for automatic reject.
 Find new cutoff values that reject 10% and accept 15% of applicants.
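The two parts of this example can be sketched with `statistics.NormalDist`, using the mean, standard deviation, and cutoffs stated above. The variable names are illustrative, and this is not the workbook's own layout.

```python
from statistics import NormalDist

scores = NormalDist(mu=525, sigma=55)

# Current standards: accept above 600, reject below 425.
p_auto_accept = 1 - scores.cdf(600)
p_auto_reject = scores.cdf(425)

# New standards: reject the bottom 10%, accept the top 15%.
reject_cutoff = scores.inv_cdf(0.10)
accept_cutoff = scores.inv_cdf(0.85)  # top 15% lies above the 85th percentile

print(f"automatic accepts: {p_auto_accept:.3f}")
print(f"automatic rejects: {p_auto_reject:.3f}")
print(f"new cutoffs: reject below {reject_cutoff:.1f}, accept above {accept_cutoff:.1f}")
```

Under these assumptions, roughly 8.6% of applicants are automatic accepts and about 3.5% are automatic rejects, and the new cutoffs come out near 455 and 582.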



Example 5.4:
Paper Machine Settings.xlsx
 Objective: To determine the machine settings that result in paper of
acceptable quality at PaperStock Company.
 Solution: A given roll of paper must be rejected if its actual fiber
content is less than 19.8 pounds or greater than 20.3 pounds.
 The variability in fiber content is 0.10 pound when the process is
“good,” but increases to 0.15 pound when the machine goes “bad.”
 Calculate the probability that a given roll is rejected, for a setting of μ
= 20, when the machine is “good” and when it is “bad.”
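A sketch of the rejection probability, assuming fiber content is normally distributed and that the "variability" figures (0.10 and 0.15 pound) are standard deviations. The slide does not state either point explicitly, so both are assumptions here.

```python
from statistics import NormalDist


def reject_probability(mu, sigma, low=19.8, high=20.3):
    """P(fiber content < low) + P(fiber content > high)."""
    content = NormalDist(mu, sigma)
    return content.cdf(low) + (1 - content.cdf(high))


p_reject_good = reject_probability(mu=20, sigma=0.10)
p_reject_bad = reject_probability(mu=20, sigma=0.15)

print(f"good machine: {p_reject_good:.4f}")
print(f"bad machine:  {p_reject_bad:.4f}")
```

With these assumptions the rejection probability rises from roughly 2.4% to about 11.4% when the machine goes "bad".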




Example 5.5:
Tax on Stock Returns.xlsx
 Objective: To determine the after-tax profit Howard Davis can be 90%
certain of earning.
 Solution: Howard is in the 33% tax bracket, so his after-tax profit is
67% of his before-tax profit. He invests $10,000 in a certain stock,
whose annual return is normally distributed with mean 5% and
standard deviation 14%.
 Calculate the dollar amount such that Howard’s after-tax profit is 90%
certain to be less than this amount; that is, calculate the 90th
percentile of his after-tax profit.
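Because after-tax profit is a constant multiple of the normally distributed return, it is itself normal, and its 90th percentile follows directly. The sketch below uses the figures stated above; the variable names are illustrative.

```python
from statistics import NormalDist

investment = 10_000
keep_fraction = 0.67          # 1 - 0.33 tax rate
mean_return, sd_return = 0.05, 0.14

# After-tax profit = keep_fraction * investment * return, so both the
# mean and the standard deviation scale by the same constant.
scale = keep_fraction * investment
after_tax_profit = NormalDist(scale * mean_return, scale * sd_return)

pct_90 = after_tax_profit.inv_cdf(0.90)
print(f"90th percentile of after-tax profit: ${pct_90:,.2f}")
```

The result is roughly $1,537: Howard can be 90% certain that his after-tax profit is below that amount.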



Example 5.6:
Oven Demand Simulation.xlsx (slide 1 of 3)

 Objective: To construct and analyze a spreadsheet model for microwave
oven demand over the next 12 years using Excel’s NORMINV function,
and to show how models using the normal distribution can lead to
nonsensical outcomes unless they are modified appropriately.
 Solution: Using historical data, the company assumes that demand in
year 1 is normally distributed with mean 5000 and standard deviation
1500.
 It also assumes that demand in each subsequent year is normally
distributed with mean equal to the actual demand from the previous year
and standard deviation 1500.
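A minimal simulation sketch of this demand model, not the workbook itself. Python's `random.Random.gauss` stands in for the NORMINV-with-random-probability approach; the seed, the number of simulated paths, and the function name are arbitrary choices for illustration.

```python
import random


def simulate_demand(years=12, first_mean=5000.0, sd=1500.0, rng=None):
    """One 12-year path: each year's mean is the previous year's demand."""
    if rng is None:
        rng = random.Random()
    demands = []
    mean = first_mean
    for _ in range(years):
        demand = rng.gauss(mean, sd)
        demands.append(demand)
        mean = demand  # next year's mean is this year's actual demand
    return demands


rng = random.Random(0)  # arbitrary seed, only for repeatability
paths = [simulate_demand(rng=rng) for _ in range(1000)]

# Share of simulated paths in which demand goes negative at least once.
share_negative = sum(min(path) < 0 for path in paths) / len(paths)
print(f"paths with a negative demand: {share_negative:.1%}")
```

Because the standard deviation stays at 1500 no matter how low the mean drifts, a noticeable share of paths contain a negative demand, which is impossible in practice.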



Example 5.6:
Oven Demand Simulation.xlsx (slide 2 of 3)

 Using this model may lead to nonsensical results, such as negative demand in some years.



Example 5.6:
Oven Demand Simulation.xlsx (slide 3 of 3)

 One way to modify the model is to let the standard deviation and mean
move together. That is, if the mean is low, then the standard deviation
will also be low.

 To be even safer, it is possible to truncate the demand distribution at
some nonnegative value such as 250, as shown below.
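One possible way to code both modifications, offered as a sketch under stated assumptions: the standard deviation is kept at a fixed fraction of the mean (0.3, the ratio 1500/5000 from year 1, which is an assumption rather than something the slide specifies), and truncation is implemented by raising any demand below 250 up to 250.

```python
import random


def simulate_demand_modified(years=12, first_mean=5000.0, sd_ratio=0.3,
                             floor=250.0, rng=None):
    """Demand path whose standard deviation scales with the mean, floored at `floor`."""
    if rng is None:
        rng = random.Random()
    demands = []
    mean = first_mean
    for _ in range(years):
        raw = rng.gauss(mean, sd_ratio * mean)  # sd moves with the mean
        demand = max(raw, floor)                # truncate at the floor
        demands.append(demand)
        mean = demand
    return demands


rng = random.Random(0)  # arbitrary seed, only for repeatability
modified_paths = [simulate_demand_modified(rng=rng) for _ in range(1000)]
lowest = min(min(path) for path in modified_paths)
print(f"lowest simulated demand: {lowest:.0f}")
```

With the floor in place, no simulated demand can fall below 250, so the nonsensical negative values disappear.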




The Binomial Distribution
 The binomial distribution is a discrete distribution that can occur in
two situations:
 When sampling from a population with only two types of members (males
and females, for example)

 When performing a sequence of identical experiments, each of which has
only two possible outcomes

 Consider a situation where there are n independent, identical trials,
where the probability of a success on each trial is p and the probability
of a failure is 1 – p.
 Define X to be the random number of successes in the n trials.
 Then X has a binomial distribution with parameters n and p.
 In Excel, calculate binomial probabilities with the BINOMDIST function.



Example 5.7:
Binomial Calculations.xlsx


 Objective: To use Excel’s BINOMDIST and CRITBINOM functions for calculating binomial probabilities and
percentiles in the context of flashlight batteries.
 Solution: Let X be the number of successes in 100 trials of flashlight batteries, where a success means that the
battery is still functioning after eight hours.
 Find the probabilities of various events, using the BINOMDIST function, as shown in the spreadsheet below.
 Find the 95th percentile of the distribution of X, using the CRITBINOM function.
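The two Excel functions can be approximated with a few lines of Python. The helper names below are hypothetical, and the success probability p = 0.6 is an assumption for illustration, since the slide gives only n = 100. `crit_binom` returns the smallest k whose cumulative probability is at least the target, which is the behavior CRITBINOM is documented to have.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for a binomial with parameters n and p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k), the cumulative form of BINOMDIST."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def crit_binom(n, p, alpha):
    """Smallest k with P(X <= k) >= alpha."""
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += binom_pmf(k, n, p)
        if cumulative >= alpha:
            return k
    return n


n, p = 100, 0.6  # p is an assumed value, not taken from the slide

p_at_most_60 = binom_cdf(60, n, p)
p_more_than_65 = 1 - binom_cdf(65, n, p)
pct_95 = crit_binom(n, p, 0.95)

print(p_at_most_60, p_more_than_65, pct_95)
```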



Mean and Standard Deviation of the Binomial
Distribution
 It can be shown that the mean and standard deviation of a binomial
distribution with parameters n and p are given by the following
equations:
$\mu = np$, $\sigma = \sqrt{np(1-p)}$

 The empirical rules discussed in Chapter 2 also apply, at least
approximately, to the binomial distribution.
 There is about a 95% chance that the actual number of successes will be
within two standard deviations of the mean.

 There is almost no chance that the number of successes will be more than
three standard deviations from the mean.
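A quick numerical check of these statements, assuming the illustrative parameters n = 100 and p = 0.6 (hypothetical values, not from the slide). It computes the mean and standard deviation from the standard formulas np and the square root of np(1 - p), then sums exact binomial probabilities within two and three standard deviations of the mean.

```python
from math import ceil, comb, floor, sqrt

n, p = 100, 0.6  # hypothetical parameters for illustration

mean = n * p
sd = sqrt(n * p * (1 - p))


def binom_pmf(k):
    """P(X = k) for the binomial with the n and p above."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_within(num_sd):
    """Exact P(mean - num_sd*sd <= X <= mean + num_sd*sd)."""
    lo = max(0, ceil(mean - num_sd * sd))
    hi = min(n, floor(mean + num_sd * sd))
    return sum(binom_pmf(k) for k in range(lo, hi + 1))


within_2sd = prob_within(2)
within_3sd = prob_within(3)
print(mean, sd, within_2sd, within_3sd)
```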




The Binomial Distribution in the
Context of Sampling
 If sampling is done without replacement, each member of the
population can be sampled only once.
 That is, once a person is sampled, his or her name is struck from the list
and cannot be sampled again.

 If sampling is done with replacement, then it is possible, although
maybe not likely, to select a given member of the population more
than once.
 Most real-world sampling is performed without replacement.
 The binomial model applies only to sampling with replacement.
 However, if no more than 10% of the population is sampled, the binomial
model can be used safely even if sampling is performed without
replacement.
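The 10% guideline can be illustrated by comparing the binomial model with the hypergeometric distribution, which is the exact model for sampling without replacement. The population of 1000 with 400 "successes" and the two sample sizes are hypothetical numbers chosen for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Sampling with replacement: binomial probability of k successes."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeom_pmf(k, population, successes, n):
    """Sampling without replacement: exact probability of k successes."""
    return (comb(successes, k) * comb(population - successes, n - k)
            / comb(population, n))


def max_difference(population, successes, n):
    """Largest gap between the two models over all possible k."""
    p = successes / population
    return max(
        abs(hypergeom_pmf(k, population, successes, n) - binom_pmf(k, n, p))
        for k in range(n + 1)
    )


# Hypothetical population: 1000 members, 400 of the "success" type.
diff_small_sample = max_difference(1000, 400, n=50)    # 5% of the population
diff_large_sample = max_difference(1000, 400, n=500)   # 50% of the population

print(diff_small_sample, diff_large_sample)
```

When only 5% of the population is sampled, the two models nearly coincide; at 50% the gap is several times larger.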



The Normal Approximation
to the Binomial
 If you graph the binomial probabilities, you will see an interesting
phenomenon: the graph begins to look symmetric and bell-shaped
when n is fairly large and p is not too close to 0 or 1.
 The normal distribution provides a very good approximation to the binomial
under these conditions.

 One practical consequence of the normal approximation to the binomial is
that the empirical rules apply very well to binomial distributions.
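A small sketch of how close the approximation can be, assuming the hypothetical parameters n = 100 and p = 0.6. It compares the exact probability of at most 65 successes with the normal probability below 65.5; the extra 0.5 is the usual continuity correction, which the slide does not mention.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.6  # hypothetical parameters for illustration
k = 65

# Exact binomial probability P(X <= k).
exact = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Normal approximation with the same mean and standard deviation,
# evaluated at k + 0.5 (continuity correction).
approx = NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(k + 0.5)

print(f"exact: {exact:.4f}  normal approximation: {approx:.4f}")
```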
