Lecture Undergraduate econometrics - Chapter 2: Some basic probability concepts

Chapter 2
Some Basic Probability Concepts

2.1 Experiments, Outcomes and Random Variables

• A random variable is a variable whose value is unknown until it is observed. The
value of a random variable results from an experiment; it is not perfectly predictable.
• A discrete random variable can take only a finite (or countably infinite) number of
values, which can be counted by using the positive integers.
• Discrete variables are also commonly used in economics to record qualitative, or
nonnumerical, characteristics. In this role they are sometimes called dummy variables.
• A continuous random variable can take any real value (not just whole numbers) in an
interval on the real number line.
Slide 2.1
Undergraduate Econometrics, 2nd Edition –Chapter 2


2.2 The Probability Distribution of a Random Variable

• The values of random variables are not known until an experiment is carried out, and
not all possible values are equally likely. We can make probability statements about
certain values occurring by specifying a probability distribution for the random
variable.


• If event A is an outcome of an experiment, then the probability of A, which we write
as P(A), is the relative frequency with which event A occurs in many repeated trials of
the experiment. For any event, 0 ≤ P(A) ≤ 1, and the total probability of all possible
events is one.

2.2.1 Probability Distributions of Discrete Random Variables
• When the values of a discrete random variable are listed with their chances of
occurring, the resulting table of outcomes is called a probability function or a
probability density function.
• The probability density function spreads the total of 1 “unit” of probability over the set
of possible values that a random variable can take.
• Consider a discrete random variable, X = the number of heads obtained in a single flip
of a coin. The values that X can take are x = 0,1. If the coin is “fair” then the
probability of a head occurring is 0.5. The probability density function, say f(x), for
the random variable X is
Coin Side    x    f(x)
tail         0    0.5
head         1    0.5

• “The probability that X takes the value 1 is 0.5” means that the two values 0 and 1
have an equal chance of occurring and, if we flipped a fair coin a very large number of
times, the value x = 1 would occur 50 percent of the time. We can denote this as P[X
= 1] = f(1) = 0.5, where P[X = 1] is the probability of the event that the random
variable X = 1.
• For a discrete random variable X the value of the probability density function f(x) is
the probability that the random variable X takes the value x, f(x) = P(X=x).
• Therefore, 0 ≤ f(x) ≤ 1 and, if X takes n values x1, ..., xn, then
$$f(x_1) + f(x_2) + \cdots + f(x_n) = 1.$$
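
As a minimal sketch, the coin-flip probability function can be stored as a table and checked against the two properties just stated. The dictionary layout and the helper name `is_valid_pdf` are illustrative assumptions, not part of the lecture.

```python
# Probability function of X = number of heads in one flip of a fair coin.
f = {0: 0.5, 1: 0.5}


def is_valid_pdf(f, tol=1e-12):
    """Check 0 <= f(x) <= 1 for every x and that the values sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in f.values())
    return in_range and abs(sum(f.values()) - 1.0) < tol


print(f[1])             # P[X = 1] = f(1)
print(is_valid_pdf(f))  # True for the fair coin
```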

2.2.2 The Probability Density Function of A Continuous Random Variable
• For the continuous random variable Y the probability density function f(y) can be
represented by an equation, which can be described graphically by a curve. For
continuous random variables the area under the probability density function
corresponds to probability.
• For example, the probability density function of a continuous random variable Y might
be represented as in Figure 2.1. The total area under a probability density function is 1,
and the probability that Y takes a value in the interval [a, b], or P[a ≤ Y ≤ b], is the area
under the probability density function between the values y = a and y = b. This is
shown in Figure 2.1 by the shaded area.
• Since a continuous random variable takes an uncountably infinite number of values,
the probability of any one occurring is zero. That is, P[Y = a] = P[a ≤ Y ≤ a] = 0.
• In calculus, the integral of a function defines the area under it, and therefore
$$P[a \le Y \le b] = \int_{y=a}^{b} f(y)\,dy.$$
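
The area interpretation can be illustrated numerically. The sketch below assumes a hypothetical density f(y) = 2y on [0, 1] (it is not the density drawn in Figure 2.1) and approximates the integral with the midpoint rule; the function names are illustrative.

```python
def f(y):
    """Hypothetical density: f(y) = 2y on [0, 1], zero elsewhere."""
    return 2.0 * y if 0.0 <= y <= 1.0 else 0.0


def prob_between(a, b, n=10_000):
    """Approximate P[a <= Y <= b] with the midpoint rule on n subintervals."""
    width = (b - a) / n
    return sum(f(a + (k + 0.5) * width) for k in range(n)) * width


print(prob_between(0.0, 1.0))  # total area under the density, close to 1
print(prob_between(0.2, 0.5))  # exact value is 0.5**2 - 0.2**2 = 0.21
print(prob_between(0.3, 0.3))  # a single point carries zero probability
```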

• For any random variable X, the probability that X is less than or equal to a is denoted
F(a). F(x) is the cumulative distribution function (cdf).
• For a discrete random variable,
$$F(x) = \sum_{X \le x} f(x) = \text{Prob}(X \le x)$$

In view of the definition of f(x),
$$f(x_i) = F(x_i) - F(x_{i-1})$$
• For a continuous random variable,
$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$
and
$$f(x) = \frac{dF(x)}{dx}$$
• In both the continuous and discrete cases, F(x) must satisfy the following properties:
1. 0 ≤ F ( x) ≤ 1.
2. If x ≥ y , then F ( x) ≥ F ( y) .
3. F (+∞) = 1.
4. F (−∞) = 0 .
5. Prob(a < x ≤ b) = F (b) − F (a) .
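
A small sketch of these properties for a discrete case follows. The distribution (number of heads in two flips of a fair coin) is an assumed example chosen for illustration.

```python
# Assumed example: X = number of heads in two flips of a fair coin.
f = {0: 0.25, 1: 0.50, 2: 0.25}


def F(x):
    """Cumulative distribution function: F(x) = Prob(X <= x)."""
    return sum(p for value, p in f.items() if value <= x)


print(F(-1), F(0), F(1), F(2))  # 0 0.25 0.75 1.0 (non-decreasing, from 0 to 1)
print(F(1) - F(0) == f[1])      # f(x_i) = F(x_i) - F(x_{i-1})
print(F(2) - F(0))              # Prob(0 < X <= 2) = F(2) - F(0)
```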

2.3 Expected Values Involving a Single Random Variable

• When working with random variables, it is convenient to summarize their probability
characteristics using the concept of mathematical expectation. These expectations will
make use of summation notation.

2.3.1 The Rules of Summation
1. If X takes n values x1, ..., xn then their sum is
$$\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n$$
2. If a is a constant, then
$$\sum_{i=1}^{n} a = na$$

3. If a is a constant then it can be pulled out in front of a summation
$$\sum_{i=1}^{n} a x_i = a \sum_{i=1}^{n} x_i$$
4. If X and Y are two variables, then
$$\sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i$$
5. If a and b are constants, then
$$\sum_{i=1}^{n} (a + b x_i) = na + b \sum_{i=1}^{n} x_i$$
6. If X and Y are two variables, then
$$\sum_{i=1}^{n} (a x_i + b y_i) = a \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} y_i$$

7. The arithmetic mean (average) of n values of X is
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}.$$
Also,
$$\sum_{i=1}^{n} (x_i - \bar{x}) = 0$$
8. We often use an abbreviated form of the summation notation. For example, if f(x) is a
function of the values of X,
$$\begin{aligned}
\sum_{i=1}^{n} f(x_i) &= f(x_1) + f(x_2) + \cdots + f(x_n) \\
&= \sum_i f(x_i) \quad \text{("Sum over all values of the index } i\text{")} \\
&= \sum_x f(x) \quad \text{("Sum over all possible values of } X\text{")}
\end{aligned}$$


9. Several summation signs can be used in one expression. Suppose the variable Y takes
n values and X takes m values, and let f(x,y) = x+y. Then the double summation of
this function is
$$\sum_{i=1}^{m} \sum_{j=1}^{n} f(x_i, y_j) = \sum_{i=1}^{m} \sum_{j=1}^{n} (x_i + y_j)$$
To evaluate such expressions work from the innermost sum outward. First set i=1 and
sum over all values of j, and so on. To illustrate, let m = 2 and n = 3. Then
$$\begin{aligned}
\sum_{i=1}^{2} \sum_{j=1}^{3} f(x_i, y_j) &= \sum_{i=1}^{2} \left[ f(x_i, y_1) + f(x_i, y_2) + f(x_i, y_3) \right] \\
&= f(x_1, y_1) + f(x_1, y_2) + f(x_1, y_3) + f(x_2, y_1) + f(x_2, y_2) + f(x_2, y_3)
\end{aligned}$$
The order of summation does not matter, so
$$\sum_{i=1}^{m} \sum_{j=1}^{n} f(x_i, y_j) = \sum_{j=1}^{n} \sum_{i=1}^{m} f(x_i, y_j)$$
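
Several of the summation rules can be checked numerically. The sketch below uses made-up values for x, y, a and b (all assumptions for illustration) and verifies rules 5, 7 and 9.

```python
x = [2.0, 4.0, 6.0]   # illustrative values x_1, ..., x_m (m = 3)
y = [1.0, 3.0]        # illustrative values y_1, ..., y_n (n = 2)
a, b = 3.0, 2.0       # illustrative constants

# Rule 5: sum(a + b*x_i) = (number of terms)*a + b*sum(x_i)
lhs5 = sum(a + b * xi for xi in x)
rhs5 = len(x) * a + b * sum(x)

# Rule 7: deviations from the arithmetic mean sum to zero
x_bar = sum(x) / len(x)
dev_sum = sum(xi - x_bar for xi in x)

# Rule 9: the double sum of f(x, y) = x + y is the same in either order
i_outer = sum(sum(xi + yj for yj in y) for xi in x)
j_outer = sum(sum(xi + yj for xi in x) for yj in y)

print(lhs5 == rhs5, dev_sum == 0.0, i_outer == j_outer)
```

The values are chosen so that every intermediate result is exactly representable in floating point; with arbitrary data the comparisons would need a small tolerance.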

2.3.2 The Mean of a Random Variable
• The expected value of a random variable X is the average value of the random variable
in an infinite number of repetitions of the experiment (repeated samples); it is denoted
E[X].

• If X is a discrete random variable which can take the values x1, x2,…,xn with
probability density values f(x1), f(x2),…, f(xn), the expected value of X is

$$E[X] = x_1 f(x_1) + x_2 f(x_2) + \cdots + x_n f(x_n) = \sum_{i=1}^{n} x_i f(x_i) = \sum_x x f(x) \tag{2.3.1}$$

• If X is a continuous random variable, the expected value of X is

$$E[X] = \int_x x f(x)\,dx$$
The notation $\int_x$ means integral over the entire range of values of x.
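
For the discrete case, Equation (2.3.1) can be sketched in a few lines and compared with the long-run average it describes. The distribution, the seed and the number of simulated draws are assumptions made for illustration.

```python
import random

# Assumed example: X = number of heads in two flips of a fair coin.
f = {0: 0.25, 1: 0.50, 2: 0.25}


def expected_value(f):
    """E[X] = sum over x of x * f(x), as in Equation (2.3.1)."""
    return sum(x * p for x, p in f.items())


random.seed(0)  # fixed seed so the run is repeatable
draws = random.choices(list(f), weights=list(f.values()), k=100_000)
sample_mean = sum(draws) / len(draws)

print(expected_value(f))  # 1.0
print(sample_mean)        # close to 1.0, but not exactly equal
```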


2.3.3 Expectation of a Function of a Random Variable
• If X is a discrete random variable and g(X) is a function of it, then
$$E[g(X)] = \sum_x g(x) f(x) \tag{2.3.2a}$$

However, E[ g ( X )] ≠ g[ E ( X )] in general.
• If X is a discrete random variable and g(X) = g1(X) + g2(X), where g1(X) and g2(X) are
functions of X, then

$$\begin{aligned}
E[g(X)] &= \sum_x [g_1(x) + g_2(x)] f(x) \\
&= \sum_x g_1(x) f(x) + \sum_x g_2(x) f(x) \\
&= E[g_1(X)] + E[g_2(X)]
\end{aligned} \tag{2.3.2b}$$

The expected value of a sum of functions of random variables, or the expected value
of a sum of random variables, is always the sum of the expected values.
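
Both points, that E[g(X)] generally differs from g(E[X]) and that the expectation of a sum is the sum of the expectations, can be seen in a small sketch. The distribution and the functions g1 and g2 are assumed examples.

```python
f = {0: 0.25, 1: 0.50, 2: 0.25}  # assumed example distribution


def expect(g, f):
    """E[g(X)] = sum over x of g(x) * f(x), as in Equation (2.3.2a)."""
    return sum(g(x) * p for x, p in f.items())


def g1(x):
    return x ** 2


def g2(x):
    return 3 * x


mean = expect(lambda x: x, f)
print(expect(g1, f), g1(mean))  # E[X^2] = 1.5, while (E[X])^2 = 1.0
print(expect(lambda x: g1(x) + g2(x), f) == expect(g1, f) + expect(g2, f))
```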


• The idea of how to determine the expected value of a function of a continuous random
variable Y, say g(y), is exactly the same as in the discrete case. The terms g(y) must be
weighted by f(y) and then all those products summed. This operation is carried out via
integration, but the interpretation of the result is the same. Specifically, if Y is a
continuous random variable, then
$$E[g(y)] = \int_y g(y) f(y)\,dy$$

• Some properties of mathematical expectation hold for both discrete and continuous
random variables. For the discrete case, these results are as follows:
1. If c is a constant,
$$E[c] = c \tag{2.3.3a}$$
2. If c is a constant and X is a random variable, then
$$E[cX] = cE[X] \tag{2.3.3b}$$
3. If a and c are constants and X is a random variable, then
$$E[a + cX] = a + cE[X] \tag{2.3.3c}$$
4. If a, b, and c are constants and X and Y are random variables, then
$$E[aX + bY + c] = aE[X] + bE[Y] + c$$
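
As a worked illustration for the discrete case, property (2.3.3c) follows from the definition of E[g(X)] in (2.3.2a), the summation rules of Section 2.3.1, and the fact that the values of f(x) sum to one. This derivation is a sketch supplied for clarity; the slides state the result without it.

```latex
\begin{aligned}
E[a + cX] &= \sum_x (a + cx) f(x) \\
          &= a \sum_x f(x) + c \sum_x x f(x) \\
          &= a \cdot 1 + c\,E[X] \\
          &= a + cE[X]
\end{aligned}
```

Setting c = 0 gives (2.3.3a), and setting a = 0 gives (2.3.3b).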
• A conditional mean is the mean of the conditional distribution and is defined by
$$E[y \mid x] = \sum_y y f(y \mid x) \quad \text{if } y \text{ is discrete}$$
$$E[y \mid x] = \int_y y f(y \mid x)\,dy \quad \text{if } y \text{ is continuous}$$

The conditional mean function E[y|x] is called the regression of y on x.

2.3.4 The Variance of a Random Variable

• The variance of a discrete or continuous random variable X, based on the rules in
Section 2.3.3, is defined as the expected value of $g(X) = [X - E(X)]^2$. Algebraically,
$$\text{var}(X) = \sigma^2 = E[g(X)] = E[(X - E(X))^2] = E[(X - \mu)^2] = E[X^2] - [E(X)]^2 \tag{2.3.4}$$
$$\text{var}(X) = \sum_x (x - \mu)^2 f(x) \quad \text{if } x \text{ is discrete}$$
$$\text{var}(X) = \int_x (x - \mu)^2 f(x)\,dx \quad \text{if } x \text{ is continuous}$$

where E[X] = µ. Examining $g(X) = [X - E(X)]^2$, we observe that the variance of a
random variable is the average squared difference between the random variable X and
its mean E[X]. Thus, the variance of a random variable is the weighted
average of the squared differences (or distances) between the values x of the random
variable X and the mean (center of the probability density function) of the random
variable. The larger the variance of a random variable, the greater the average squared
distance between the values of the random variable and its mean, or the more “spread
out” are the values of the random variable.
• Let a and c be constants, and let Z = a + cX. Then Z is a random variable and its
variance is
$$\text{var}(a + cX) = E\big[\big((a + cX) - E(a + cX)\big)^2\big] = c^2\,\text{var}(X) \tag{2.3.5}$$

The result in Equation (2.3.5) says that:
1. Adding a constant to a random variable does not affect its variance, or dispersion.
This follows because adding a constant to a random variable shifts the location of
its probability density function but leaves its shape, and dispersion, unaffected.
2. Multiplying a random variable by a constant multiplies the variance by the square
of the constant.
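
A short numerical sketch of (2.3.4) and (2.3.5) follows. The distribution and the constants a and c are assumed for illustration, and the helper names are hypothetical.

```python
f = {0: 0.25, 1: 0.50, 2: 0.25}  # assumed example distribution


def expect(g, f):
    """E[g(X)] for a discrete distribution stored as {x: f(x)}."""
    return sum(g(x) * p for x, p in f.items())


def variance(f):
    """var(X) = E[(X - mu)^2]."""
    mu = expect(lambda x: x, f)
    return expect(lambda x: (x - mu) ** 2, f)


a, c = 10.0, 3.0
f_z = {a + c * x: p for x, p in f.items()}  # distribution of Z = a + cX

shortcut = expect(lambda x: x ** 2, f) - expect(lambda x: x, f) ** 2

print(variance(f), shortcut)                # 0.5 0.5
print(variance(f_z), c ** 2 * variance(f))  # 4.5 4.5
```

The shift a drops out entirely, while the scale c enters squared.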
• The square root of the variance of a random variable is called the standard deviation;
it is denoted by σ. It, too, measures the spread or dispersion of a distribution, and it
has the advantage of being in the same units of measure as the random variable.
• A conditional variance is the variance of the conditional distribution:

$$\text{var}[y \mid x] = E[(y - E[y \mid x])^2 \mid x] = \sum_y (y - E[y \mid x])^2 f(y \mid x) \quad \text{if } y \text{ is discrete}$$
$$\text{var}[y \mid x] = \int_y (y - E[y \mid x])^2 f(y \mid x)\,dy \quad \text{if } y \text{ is continuous}$$
The computation can be simplified by using $\text{var}[y \mid x] = E[y^2 \mid x] - (E[y \mid x])^2$.

• Two other measures often used to describe a probability distribution are
$$\text{skewness} = E[(x - \mu)^3]$$
and
$$\text{kurtosis} = E[(x - \mu)^4]$$
Skewness is a measure of the asymmetry of a distribution. For symmetric
distributions, f(µ − x) = f(µ + x) and skewness = 0. For asymmetric distributions, the
skewness will be positive (negative) if the “long tail” is in the positive (negative)
direction. Kurtosis is a measure of the thickness of the tails of the distribution.

• Two common measures are
$$\text{skewness coefficient} = \frac{E[(x - \mu)^3]}{\sigma^3}$$
and
$$\text{degree of excess} = \frac{E[(x - \mu)^4]}{\sigma^4} - 3$$
The second is based on the normal distribution, which has excess of zero.
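
These two measures can be computed directly from a discrete probability function. In the sketch below, both distributions and all helper names are assumptions made for illustration.

```python
def central_moment(f, k):
    """E[(X - mu)^k] for a discrete distribution stored as {x: f(x)}."""
    mu = sum(x * p for x, p in f.items())
    return sum((x - mu) ** k * p for x, p in f.items())


def skewness_coefficient(f):
    sigma = central_moment(f, 2) ** 0.5
    return central_moment(f, 3) / sigma ** 3


def degree_of_excess(f):
    # sigma^4 is the squared variance
    return central_moment(f, 4) / central_moment(f, 2) ** 2 - 3.0


symmetric = {0: 0.25, 1: 0.50, 2: 0.25}
right_tailed = {0: 0.6, 1: 0.3, 2: 0.1}

print(skewness_coefficient(symmetric))     # zero for a symmetric distribution
print(skewness_coefficient(right_tailed))  # positive: long tail to the right
print(degree_of_excess(symmetric))         # negative: thinner tails than the normal
```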

2.4 Using Joint Probability Density Functions

Frequently we want to make probability statements about more than one random variable
at a time. To answer probability questions involving two or more random variables, we
must know their joint probability density function. For the continuous random variables
X and Y, we use f(x,y) to represent their joint density function. A typical joint density
function might look something like Figure 2.3. See Example 2.5.

2.4.1 Marginal Probability Density Functions
• If X and Y are two discrete random variables then
$$f(x) = \sum_y f(x, y) \quad \text{for each value } X \text{ can take}$$
$$f(y) = \sum_x f(x, y) \quad \text{for each value } Y \text{ can take} \tag{2.4.1}$$

Note that the summations in Equation (2.4.1) are over the other random variable, the
one that we are eliminating from the joint probability density function. If the random
variables are continuous the same idea works, with integrals replacing the summation
sign as follows:

$$f(x) = \int_y f(x, y)\,dy$$
$$f(y) = \int_x f(x, y)\,dx$$
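
For the discrete case, Equation (2.4.1) amounts to summing a joint table along one dimension. The joint probability function below is hypothetical, and the function name `marginals` is an illustrative choice.

```python
from collections import defaultdict

# Hypothetical joint probability function f(x, y), keyed by (x, y).
joint = {(0, 0): 0.125, (0, 1): 0.375, (1, 0): 0.25, (1, 1): 0.25}


def marginals(joint):
    """Return (f_x, f_y): each sums the joint over the other variable."""
    f_x, f_y = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        f_x[x] += p  # sum over y
        f_y[y] += p  # sum over x
    return dict(f_x), dict(f_y)


f_x, f_y = marginals(joint)
print(f_x)  # {0: 0.5, 1: 0.5}
print(f_y)  # {0: 0.375, 1: 0.625}
```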

2.4.2 Conditional Probability Density Functions
• Often the chances of an event occurring are conditional on the occurrence of another
event. For discrete random variables X and Y, conditional probabilities can be
calculated from the joint probability density function f(x,y) and the marginal
probability density function of the conditioning random variable. Specifically, the
probability that the random variable X takes the value x given that Y = y is written
P[X = x|Y = y]. This conditional probability is given by the conditional probability
density function f(x|y):

$$f(x \mid y) = P[X = x \mid Y = y] = \frac{f(x, y)}{f(y)}$$
$$f(y \mid x) = P[Y = y \mid X = x] = \frac{f(x, y)}{f(x)} \tag{2.4.2}$$
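
A minimal sketch of Equation (2.4.2) for a discrete joint table is given below, together with the conditional mean E[y|x] defined in Section 2.3.3. The joint probability function and the helper names are hypothetical.

```python
# Hypothetical joint probability function f(x, y), keyed by (x, y).
joint = {(0, 0): 0.125, (0, 1): 0.375, (1, 0): 0.25, (1, 1): 0.25}


def conditional_y_given_x(joint, x):
    """f(y | x) = f(x, y) / f(x), as in Equation (2.4.2)."""
    f_x = sum(p for (xi, _), p in joint.items() if xi == x)  # marginal f(x)
    return {y: p / f_x for (xi, y), p in joint.items() if xi == x}


def conditional_mean(joint, x):
    """E[y | x] = sum over y of y * f(y | x)."""
    return sum(y * p for y, p in conditional_y_given_x(joint, x).items())


print(conditional_y_given_x(joint, 0))                         # {0: 0.25, 1: 0.75}
print(conditional_mean(joint, 0), conditional_mean(joint, 1))  # 0.75 0.5
```

Because the conditional mean changes with x here, knowing X is informative about Y in this assumed table.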

2.4.3 Independent Random Variables

• Two random variables are statistically independent, or independently distributed, if
knowing the value that one will take does not reveal anything about what value the
other may take. When random variables are statistically independent, their joint
probability density function factors into the product of their individual probability
density functions, and vice versa. If X and Y are independent random variables, then

$$f(x, y) = f(x) f(y) \tag{2.4.3}$$
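
The factorization in (2.4.3) can be checked cell by cell for a discrete joint table. Both example tables and the function name `is_independent` are assumptions for illustration; a small tolerance is used because the products are floating-point values.

```python
def is_independent(joint, tol=1e-12):
    """Check f(x, y) = f(x) * f(y) for every pair of values (x, y)."""
    f_x, f_y = {}, {}
    for (x, y), p in joint.items():
        f_x[x] = f_x.get(x, 0.0) + p  # marginal of X
        f_y[y] = f_y.get(y, 0.0) + p  # marginal of Y
    return all(
        abs(joint.get((x, y), 0.0) - f_x[x] * f_y[y]) <= tol
        for x in f_x
        for y in f_y
    )


dependent = {(0, 0): 0.125, (0, 1): 0.375, (1, 0): 0.25, (1, 1): 0.25}
two_coins = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}  # two fair coins

print(is_independent(dependent))  # False
print(is_independent(two_coins))  # True
```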