Theory of Financial
Decision Making
Jonathan E. Ingersoll, Jr.
Yale University
Preface
In the past twenty years the quantity of new and exciting research in finance has been large,
and a sizable body of basic material now lies at the core of our area of study. It is the purpose
of this book to present this core in a systematic and thorough fashion. The notes for this
book have been the primary text for various doctoral-level courses in financial theory that I
have taught over the past eight years at the University of Chicago and Yale University. In
all the courses these notes have been supplemented with readings selected from journals.
Reading original journal articles is an integral part of learning an academic field, since it
serves to introduce the students to the ongoing process of research, including its mis-steps
and controversies. In my opinion any program of study would be amiss not to convey this
continuing growth.
This book is structured in four parts. The first part, Chapters 1-3, provides an intro-
duction to utility theory, arbitrage, portfolio formation, and efficient markets. Chapter 1
provides some necessary background in microeconomics. Consumer choice is reviewed,
and expected utility maximization is introduced. Risk aversion and its measurement are
also covered.
Chapter 2 introduces the concept of arbitrage. The absence of arbitrage is one of the
most convincing and, therefore, farthest-reaching arguments made in financial economics.
Arbitrage reasoning is the basis for the arbitrage pricing theory, one of the leading models
purporting to explain the cross-sectional difference in asset returns. Perhaps more important,
the absence of arbitrage is the key in the development of the Black-Scholes option
pricing model and its various derivatives, which have been used to value a wide variety of
claims both in theory and in practice.
Chapter 3 begins the study of single-period portfolio problems. It also introduces the
student to the theory of efficient markets: the premise that asset prices fully reflect all
information available to the market. The theory of efficient (or rational) markets is one of
the cornerstones of modern finance; it permeates almost all current financial research and
has found wide acceptance among practitioners, as well.
In the second main section, Chapters 4-9 cover single-period equilibrium models. Chap-
ter 4 covers mean-variance analysis and the capital asset pricing model - a model which
has found many supporters and widespread applications. Chapters 5 through 7 expand on
Chapter 4. The first two cover generalized measures of risk and additional mutual fund
theorems. The latter treats linear factor models and the arbitrage pricing theory, probably
the key competitor of the CAPM.
Chapter 8 offers an alternative equilibrium view based on complete markets theory. This
theory was originally noted for its elegant treatment of general equilibrium as in the models
of Arrow and Debreu and was considered to be primarily of theoretical interest. More
recently it and the related concept of spanning have found many practical applications in
contingent-claims pricing.
Chapter 9 reviews single-period finance with an overview of how the various models
complement one another. It also provides a second view of the efficient markets hypothesis
in light of the developed equilibrium models.
Chapter 10, which begins the third main section on multiperiod models, introduces mod-
els set in more than one period. It reviews briefly the concept of discounting, with which
it is assumed the reader is already acquainted, and reintroduces efficient markets theory in
this context.
Chapters 11 and 13 examine the multiperiod portfolio problem. Chapter 11 introduces
dynamic programming and the induced or derived single-period portfolio problem inherent
in the intertemporal problem. After some necessary mathematical background provided in
Chapter 12, Chapter 13 tackles the same problem in a continuous-time setting using the
mean-variance tools of Chapter 4. Merton’s intertemporal capital asset pricing model is
derived, and the desire of investors to hedge is examined.
Chapter 14 covers option pricing. Using arbitrage reasoning it develops distribution-
free and preference-free restrictions on the valuation of options and other derivative assets.
It culminates in the development of the Black-Scholes option pricing model. Chapter 15
summarizes multiperiod models and provides a view of how they complement one another
and the single-period models. It also discusses the role of complete markets and spanning
in a multiperiod context and develops the consumption-based asset pricing model.
In the final main section, Chapter 16 is a second mathematical interruption, this time
to introduce the Ito calculus. Chapter 17 explores advanced topics in option pricing using
Ito calculus. Chapter 18 examines the term structure of interest rates using both option
techniques and multiperiod portfolio analysis. Chapter 19 considers questions of corporate
capital structure and demonstrates many of the applications of the Black-Scholes
model to the pricing of various corporate contracts.
The mathematical prerequisites of this book have been kept as simple as practicable. A
knowledge of calculus, probability and statistics, and basic linear algebra is assumed. The
Mathematical Introduction collects some required concepts from these areas. Advanced
topics in stochastic processes and Ito calculus are developed heuristically, where needed,
because they have become so important in finance. Chapter 12 provides an introduction
to the stochastic processes used in continuous-time finance. Chapter 16 is an introduction
to Ito calculus. Other advanced mathematical topics, such as measure theory, are avoided.
This choice, of course, requires that rigor or generality sometimes be sacrificed to intuition
and understanding. Major points are always presented verbally as well as mathematically.
These presentations are usually accompanied by graphical illustrations and numerical ex-
amples.
To emphasize the theoretical framework of finance, many topics have been left uncov-
ered. There is virtually no description of the actual operation of financial markets or of the
various institutions that play vital roles. Also missing is a discussion of empirical tests of
the various theories. Empirical research in finance is perhaps more extensive than theoret-
ical, and any adequate review would require a complete book itself. The effects of market
imperfections are also not treated. In the first place, theoretical results in this area have not
yet been fully developed. In addition the predictions of the perfect market models seem to
be surprisingly robust despite the necessary simplifying assumptions. In any case an un-
derstanding of the workings of perfect markets is obviously a precursor to studying market
imperfections.
The material in this book (together with journal supplements) is designed for a full year’s
study. Shorter courses can also be designed to suit individual tastes and prerequisites. For
example, the study of multiperiod models could commence immediately after Chapter 4.
Much of the material on option pricing and contingent claims (except for parts of Chapter
18 on the term structure of interest rates) does not depend on the equilibrium models and
could be studied immediately after Chapter 3.
This book is a text and not a treatise. To avoid constant interruptions and footnotes,
outside references and other citations have been kept to a minimum. An extended chapter-
by-chapter bibliography is provided, and my debt to the authors represented there should be
obvious to anyone familiar with the development of finance. It is my hope that any student
in the area also will come to learn of this indebtedness.
I am also indebted to many colleagues and students who have read, or in some cases
taught from, earlier drafts of this book. Their advice, suggestions, and examples have all
helped to improve this product, and their continuing requests for the latest revision have
encouraged me to make it available in book form.
Jonathan Ingersoll, Jr.
New Haven
November 1986
Glossary of Commonly Used Symbols
a Often the parameter of the exponential utility function u(Z) = −exp(−aZ).
B The factor loading matrix in the linear model.
b Often the parameter of the quadratic utility function $u(Z) = Z - bZ^2/2$.
$b_i^k$ $= \operatorname{Cov}(\tilde{z}_i, \tilde{Z}_e^k)/\operatorname{Cov}(\tilde{z}_e^k, \tilde{Z}_e^k)$. A measure of systematic risk for the ith asset
with respect to the kth efficient portfolio. Also the loading of the ith asset
on the kth factor, the measure of systematic risk in the factor model.
C Consumption.
E The expectation operator. Expectations are also often denoted with an overbar¯.
e The base for natural logarithms and the exponential function. e ≈ 2.71828.
$\bar{f}$ A factor in the linear factor model.
I The identity matrix.
i As a subscript it usually denotes the ith asset.
J A derived utility of wealth function in intertemporal portfolio models.
j As a subscript it usually denotes the jth asset.
K The call price on a callable contingent claim.
k As a subscript or superscript it usually denotes the kth investor.
L Usually a Lagrangian expression.
m As a subscript or superscript it usually denotes the market portfolio.
N The number of assets.
N(·) The cumulative normal distribution function.

n(·) The standard normal density function.
O(·) Asymptotic order symbol. Function is of the same or smaller order than its
argument.
o(·) Asymptotic order symbol. Function is of smaller order than its argument.
p The supporting state price vector.
q Usually denotes a probability.
R The riskless return (the interest rate plus one).
r The interest rate. r ≡ R − 1.
S In single-period models, the number of states. In intertemporal models, the
price of a share of stock.
s As a subscript or superscript it usually denotes state s.
T Some fixed time, often the maturity date of an asset.
t Current time.
t The tangency portfolio in the mean-variance portfolio problem.
U A utility of consumption function.
u A utility of return function.
V A derived utility function.
v The values of the assets.
W Wealth.
W(S, τ) The Black-Scholes call option pricing function on a stock with price S and
time to maturity of τ.
w A vector of portfolio weights. $w_i$ is the fraction of wealth in the ith asset.
X The exercise price for an option.
Y The state space tableau of payoffs. $Y_{si}$ is the payoff in state s on asset i.
Z The state space tableau of returns. $Z_{si}$ is the return in state s on asset i.
$\hat{Z}_w$ The return on portfolio w.
z As a subscript it denotes the zero beta portfolio.
$\tilde{z}$ The random returns on the assets.
$\bar{z}$ The expected returns on the assets.
0 A vector or matrix whose elements are 0.
1 A vector whose elements are 1.
> As a vector inequality each element of the left-hand vector is greater than the
corresponding element of the right-hand vector. < is similarly defined.
≥ As a vector inequality each element of the left-hand vector is greater than
or equal to the corresponding element of the right-hand vector, and at least
one element is strictly greater. ≤ is similarly defined.
≧ As a vector inequality each element of the left-hand vector is greater than
or equal to the corresponding element of the right-hand vector. ≦ is similarly
defined.
α The expected, instantaneous rate of return on an asset.
$\beta \equiv \operatorname{Cov}(\tilde{z}, \tilde{Z}_m)$. The beta of an asset.
γ Often the parameter of the power utility function $u(Z) = Z^{\gamma}/\gamma$.

∆ A first difference.
˜ε The residual portion of an asset’s return.
η A portfolio commitment of funds not normalized.
Θ A martingale pricing measure.
$\iota_j$ The jth column of the identity matrix.
$\tilde{\Lambda}$ The state price per unit probability; a martingale pricing measure.
λ Usually a Lagrange multiplier.
λ The factor risk premiums in the APT.
υ A portfolio of Arrow-Debreu securities. $\upsilon_s$ is the number of state s securities held.
π The vector of state probabilities.
ρ A correlation coefficient.
Σ The variance-covariance matrix of returns.
σ A standard deviation, usually of the return on an asset.
τ The time left until maturity of a contract.
Φ Public information.
$\phi_k$ Private information of investor k.
ω An arbitrage portfolio commitment of funds ($\mathbf{1}'\omega = 0$).
ω A Gauss-Wiener process. dω is the increment to a Gauss-Wiener process.
Contents
0.1 Definitions and Notation
0.2 Matrices and Linear Algebra
0.3 Constrained Optimization
0.4 Probability
1 Utility Theory
1.1 Utility Functions and Preference Orderings
1.2 Properties of Ordinal Utility Functions
1.3 Properties of Some Commonly Used Ordinal Utility Functions
1.4 The Consumer’s Allocation Problem
1.5 Analyzing Consumer Demand
1.6 Solving a Specific Problem
1.7 Expected Utility Maximization
1.8 Cardinal and Ordinal Utility
1.9 The Independence Axiom
1.10 Utility Independence
1.11 Utility of Wealth
1.12 Risk Aversion
1.13 Some Useful Utility Functions
1.14 Comparing Risk Aversion
1.15 Higher-Order Derivatives of the Utility Function
1.16 The Boundedness Debate: Some History of Economic Thought
1.17 Multiperiod Utility Functions
2 Arbitrage and Pricing: The Basics
2.1 Notation
2.2 Redundant Assets
2.3 Contingent Claims and Derivative Assets
2.4 Insurable States
2.5 Dominance And Arbitrage
2.6 Pricing in the Absence of Arbitrage
2.7 More on the Riskless Return
2.8 Riskless Arbitrage and the “Single Price Law Of Markets”
2.9 Possibilities and Probabilities
2.10 “Risk-Neutral” Pricing
2.11 Economies with a Continuum of States
3 The Portfolio Problem
3.1 The Canonical Portfolio Problem
3.2 Optimal Portfolios and Pricing
3.3 Properties of Some Simple Portfolios
3.4 Stochastic Dominance
3.5 The Theory of Efficient Markets
3.6 Efficient Markets in a “Riskless” Economy
3.7 Information Aggregation and Revelation in Efficient Markets: The General Case
3.8 Simple Examples of Information Revelation in an Efficient Market
4 Mean-Variance Portfolio Analysis
4.1 The Standard Mean-Variance Portfolio Problem
4.2 Covariance Properties of the Minimum-Variance Portfolios
4.3 The Mean-Variance Problem with a Riskless Asset
4.4 Expected Returns Relations
4.5 Equilibrium: The Capital Asset Pricing Model
4.6 Consistency of Mean-Variance Analysis and Expected Utility Maximization
4.7 Solving A Specific Problem
4.8 The State Prices Under Mean-Variance Analysis
4.9 Portfolio Analysis Using Higher Moments
A The Budget Constraint
B The Elliptical Distributions
B.1 Some Examples of Elliptical Variables
B.2 Solving a Specific Problem
B.3 Preference Over Mean Return
5 Generalized Risk, Portfolio Selection, and Asset Pricing
5.1 The Background
5.2 Risk: A Definition
5.3 Mean Preserving Spreads
5.4 Rothschild And Stiglitz Theorems On Risk
5.5 The Relative Riskiness of Opportunities with Different Expectations
5.6 Second-Order Stochastic Dominance
5.7 The Portfolio Problem
5.8 Solving A Specific Problem
5.9 Optimal and Efficient Portfolios
5.10 Verifying The Efficiency of a Given Portfolio
5.11 A Risk Measure for Individual Securities
5.12 Some Examples
A Stochastic Dominance
A.1 Nth-Order Stochastic Dominance
6 Portfolio Separation Theorems
6.1 Inefficiency of The Market Portfolio: An Example
6.2 Mutual Fund Theorems
6.3 One-Fund Separation Under Restrictions on Utility
6.4 Two-Fund Separation Under Restrictions on Utility
6.5 Market Equilibrium Under Two-Fund, Money Separation
6.6 Solving A Specific Problem
6.7 Distributional Assumptions Permitting One-Fund Separation
6.8 Distributional Assumption Permitting Two-Fund, Money Separation
6.9 Equilibrium Under Two-Fund, Money Separation
6.10 Characterization of Some Separating Distributions
6.11 Two-Fund Separation with No Riskless Asset
6.12 K-Fund Separation
6.13 Pricing Under K-Fund Separation
6.14 The Distinction between Factor Pricing and Separation
6.15 Separation Under Restrictions on Both Tastes and Distributions
7 The Linear Factor Model: Arbitrage Pricing Theory
7.1 Linear Factor Models
7.2 Single-Factor, Residual-Risk-Free Models
7.3 Multifactor Models
7.4 Interpretation of the Factor Risk Premiums
7.5 Factor Models with “Unavoidable” Risk
7.6 Asymptotic Arbitrage
7.7 Arbitrage Pricing of Assets with Idiosyncratic Risk
7.8 Risk and Risk Premiums
7.9 Fully Diversified Portfolios
7.10 Interpretation of the Factor Premiums
7.11 Pricing Bounds in A Finite Economy
7.12 Exact Pricing in the Linear Model
8 Equilibrium Models with Complete Markets
8.1 Notation
8.2 Valuation in Complete Markets
8.3 Portfolio Separation in Complete Markets
8.4 The Investor’s Portfolio Problem
8.5 Pareto Optimality of Complete Markets
8.6 Complete and Incomplete Markets: A Comparison
8.7 Pareto Optimality in Incomplete Markets: Effectively Complete Markets
8.8 Portfolio Separation and Effective Completeness
8.9 Efficient Set Convexity with Complete Markets
8.10 Creating and Pricing State Securities with Options
9 General Equilibrium Considerations in Asset Pricing
9.1 Returns Distributions and Financial Contracts
9.2 Systematic and Nonsystematic Risk
9.3 Market Efficiency with Nonspeculative Assets
9.4 Price Effects of Divergent Opinions
9.5 Utility Aggregation and the “Representative” Investor
10 Intertemporal Models in Finance
10.1 Present Values
10.2 State Description of a Multiperiod Economy
10.3 The Intertemporal Consumption Investment Problem
10.4 Completion of the Market Through Dynamic Trading
10.5 Intertemporally Efficient Markets
10.6 Infinite Horizon Models
11 Discrete-time Intertemporal Portfolio Selection
11.1 Some Technical Considerations
A Consumption Portfolio Problem when Utility Is Not Additively Separable
B Myopic and Turnpike Portfolio Policies
B.1 Growth Optimal Portfolios
B.2 A Caveat
B.3 Myopic Portfolio Policies
B.4 Turnpike Portfolio Policies
12 An Introduction to the Distributions of Continuous-Time Finance
12.1 Compact Distributions
12.2 Combinations of Compact Random Variables
12.3 Implications for Portfolio Selection
12.4 “Infinitely Divisible” Distributions
12.5 Wiener and Poisson Processes
12.6 Discrete-Time Approximations for Wiener Processes
13 Continuous-Time Portfolio Selection
13.1 Solving a Specific Problem
13.2 Testing The Model
13.3 Efficiency Tests Using the Continuous-Time CAPM
13.4 Extending The Model to Stochastic Opportunity Sets
13.5 Interpreting The Portfolio Holdings
13.6 Equilibrium in the Extended Model
13.7 Continuous-Time Models with No Riskless Asset
13.8 State-Dependent Utility of Consumption
13.9 Solving A Specific Problem
13.10 A Nominal Equilibrium
14 The Pricing of Options
14.1 Distribution and Preference-Free Restrictions on Option Prices
14.2 Option Pricing: The Riskless Hedge
14.3 Option Pricing By The Black-Scholes Methodology
14.4 A Brief Digression
14.5 The Continuous-Time Riskless Hedge
14.6 The Option’s Price Dynamics
14.7 The Hedging Portfolio
14.8 Comparative Statics
14.9 The Black-Scholes Put Pricing Formula
14.10 The Black-Scholes Model as the Limit of the Binomial Model
14.11 Preference-Free Pricing: The Cox-Ross-Merton Technique
14.12 More on General Distribution-Free Properties of Options
15 Review of Multiperiod Models
15.1 The Martingale Pricing Process for a Complete Market
15.2 The Martingale Process for the Continuous-Time CAPM
15.3 A Consumption-Based Asset-Pricing Model
15.4 The Martingale Measure When The Opportunity Set Is Stochastic
15.5 A Comparison of the Continuous-Time and Complete Market Models
15.6 Further Comparisons Between the Continuous-Time and Complete Market Models
15.7 More on the Consumption-Based Asset-Pricing Model
15.8 Models With State-Dependent Utility of Consumption
15.9 Discrete-Time Utility-Based Option Models
15.10 Returns Distributions in the Intertemporal Asset Model
16 An Introduction to Stochastic Calculus
16.1 Diffusion Processes
16.2 Ito’s Lemma
16.3 Properties of Wiener Processes
16.4 Derivation of Ito’s Lemma
16.5 Multidimensional Ito’s Lemma
16.6 Forward and Backward Equations of Motion
16.7 Examples
16.8 First Passage Time
16.9 Maximum and Minimum of Diffusion Processes
16.10 Diffusion Processes as Subordinated Wiener Processes
16.11 Extreme Variation of Diffusion Processes
16.12 Statistical Estimation of Diffusion Processes
17 Advanced Topics in Option Pricing
17.1 An Alternative Derivation
17.2 A Reexamination of The Hedging Derivation
17.3 The Option Equation: A Probabilistic Interpretation
17.4 Options With Arbitrary Payoffs
17.5 Option Pricing With Dividends
17.6 Options with Payoffs at Random Times
17.7 Option Pricing Summary
17.8 Perpetual Options
17.9 Options with Optimal Early Exercise
17.10 Options with Path-Dependent Values
17.11 Option Claims on More Than One Asset
17.12 Option Claims on Nonprice Variables
17.13 Permitted Stochastic Processes
17.14 Arbitrage “Doubling” Strategies in Continuous Time
18 The Term Structure of Interest Rates
18.1 Terminology
18.2 The Term Structure in a Certain Economy
18.3 The Expectations Hypothesis
18.4 A Simple Model of the Yield Curve
18.5 Term Structure Notation in Continuous Time
18.6 Term Structure Modeling in Continuous Time
18.7 Some Simple Continuous-Time Models
18.8 Permissible Equilibrium Specifications
18.9 Liquidity Preference and Preferred Habitats
18.10 Determinants of the Interest Rate
18.11 Bond Pricing with Multiple State Variables
19 Pricing the Capital Structure of the Firm
19.1 The Modigliani-Miller Irrelevancy Theorem
19.2 Failure of the M-M Theorem
19.3 Pricing the Capital Structure: An Introduction
19.4 Warrants and Rights
19.5 Risky Discount Bonds
19.6 The Risk Structure of Interest Rates
19.7 The Weighted Average Cost of Capital
19.8 Subordinated Debt
19.9 Subordination and Absolute Priority
19.10 Secured Debt
19.11 Convertible Securities
19.12 Callable Bonds
19.13 Optimal Sequential Exercise: Externalities and Monopoly Power
19.14 Optimal Sequential Exercise: Competitive and Block Strategies
19.15 Sequential and Block Exercise: An Example
19.16 Pricing Corporate Securities with Interest Rate Risk
19.17 Contingent Contracting
Mathematical Introduction

0.1 Definitions and Notation
Unless otherwise noted, all quantities represent real values. In this book derivatives are
often denoted in the usual fashion by $'$, $''$, and so forth. Higher-order derivatives are denoted
by $f^{(n)}$ for the nth derivative. Partial derivatives are often denoted by subscripts. For
example,
\[
\begin{aligned}
F_1(x, y) &\equiv F_x(x, y) \equiv \frac{\partial F(x, y)}{\partial x}, \\
F_{12}(x, y) &\equiv F_{xy}(x, y) \equiv \frac{\partial^2 F(x, y)}{\partial x\,\partial y}.
\end{aligned}
\tag{1}
\]
Closed intervals are denoted by brackets, open intervals by parentheses. For example,
\[
\begin{aligned}
x \in [a, b] &\text{ means all } x \text{ such that } a \le x \le b, \\
x \in [a, b) &\text{ means all } x \text{ such that } a \le x < b.
\end{aligned}
\tag{2}
\]
The greatest and least values of a set are denoted by max(·) and min(·), respectively.
For example, if x > y, then
min(x, y) = y and max(x, y) = x. (3)
The relative importance of terms is denoted by the asymptotic order symbols:
$$f(x) = o(x^n) \text{ means } \lim_{x \to 0} \frac{f(x)}{x^n} = 0; \qquad f(x) = O(x^n) \text{ means } \lim_{x \to 0} \frac{f(x)}{x^{n-\varepsilon}} = 0 \text{ for all } \varepsilon > 0. \tag{4}$$
Dirac delta function
The Dirac delta function δ(x) is defined by its properties:
$$\delta(x) = \begin{cases} 0, & x \ne 0, \\ \infty, & x = 0, \end{cases} \qquad \int_{-a}^{a} \delta(x)\,dx = 1 \text{ for any } a > 0. \tag{5}$$
The delta function may be considered as the limit of a mean-zero density function as the
dispersion goes to zero. For example, for the normal density
$$\delta(x) = \lim_{\sigma \to 0} (2\pi\sigma^2)^{-1/2} \exp\left(\frac{-x^2}{2\sigma^2}\right). \tag{6}$$
In the limit all the probability mass is concentrated at the origin, but the total mass is still
unity.
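The limiting construction can be checked numerically. The following is a minimal sketch, assuming a simple midpoint rule and using `cos` as an arbitrary smooth test function (both are illustrative choices, not part of the text): as σ shrinks, the total mass stays at one while the integral of the test function against the density approaches its value at the origin.

```python
import math

def normal_density(x, sigma):
    """Mean-zero normal density with standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / math.sqrt(2.0 * math.pi * sigma * sigma)

def integrate_against_density(f, sigma, steps=20001):
    """Midpoint-rule approximation of the integral of f(x) * density(x)
    over [-10*sigma, 10*sigma], where essentially all the mass lies."""
    lo, hi = -10.0 * sigma, 10.0 * sigma
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        total += f(x) * normal_density(x, sigma) * dx
    return total

for sigma in (1.0, 0.1, 0.01):
    mass = integrate_against_density(lambda x: 1.0, sigma)
    value = integrate_against_density(math.cos, sigma)
    # mass stays near 1; value moves toward cos(0) = 1 as sigma shrinks
    print(sigma, mass, value)
```

For the normal density the second integral equals exp(−σ²/2) exactly, which gives an independent check on the quadrature.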
The Dirac delta function is most often used in formal mathematical manipulations. The
following property is useful:
$$\int_a^b \delta(x - x_0) f(x)\,dx = f(x_0) \quad \text{if } a \le x_0 \le b. \tag{7}$$
Unit step function
The unit step function is the formal integral of the Dirac delta function and is given by
$$u(x) = \begin{cases} 1, & x > 0, \\ \tfrac{1}{2}, & x = 0, \\ 0, & x < 0. \end{cases} \tag{8}$$
Taylor Series
If f and all its derivatives exist in the region [x, x + h], then
$$f(x + h) = f(x) + f'(x)h + \frac{1}{2} f''(x)h^2 + \cdots + \frac{1}{n!} f^{(n)}(x)h^n + \cdots. \tag{9}$$
If f and all its derivatives up to order n exist in the region [x, x + h], then it can be
represented by a Taylor series with Lagrange remainder
$$f(x + h) = f(x) + f'(x)h + \cdots + \frac{1}{(n-1)!} f^{(n-1)}(x)h^{n-1} + \frac{1}{n!} f^{(n)}(x^*)h^n, \tag{10}$$
where $x^*$ is in [x, x + h]. For a function of two or more arguments the extension is obvious:
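$$\begin{aligned} F(x + h, y + k) = {} & F(x, y) + F_1(x, y)h + F_2(x, y)k \\ & + \tfrac{1}{2} F_{11}(x, y)h^2 + \tfrac{1}{2} F_{22}(x, y)k^2 + F_{12}(x, y)hk \\ & + \cdots + \frac{1}{n!} \left( h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} \right)^n F(x, y) + \cdots. \end{aligned} \tag{11}$$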
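The Lagrange-remainder form of the one-variable expansion can be illustrated with a short numerical sketch. The exponential function and the values of x and h are assumptions chosen for convenience: because every derivative of exp is exp itself and is increasing, the remainder must lie between its values at the two ends of [x, x + h].

```python
import math

def taylor_exp(x, h, n):
    """Terms of order 0..n-1 of the Taylor series of exp about x."""
    return sum(math.exp(x) * h ** j / math.factorial(j) for j in range(n))

x, h = 0.5, 0.3
for n in (1, 2, 3, 5):
    error = math.exp(x + h) - taylor_exp(x, h, n)
    # Remainder is exp(x*) h^n / n! for some x* in [x, x + h]; exp is increasing,
    # so it is bracketed by the values at x* = x and x* = x + h.
    lower = math.exp(x) * h ** n / math.factorial(n)
    upper = math.exp(x + h) * h ** n / math.factorial(n)
    print(n, error, lower <= error <= upper)
```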
Mean Value Theorem
The mean value theorem is simply the two-term form of the exact Taylor series with
Lagrange remainder:
$$f(x + h) = f(x) + f'(x + \alpha h)h \tag{12}$$
for some α in [0, 1]. The mean value theorem is also often stated in integral form. If f(x)
is a continuous function in (a, b), then
$$\int_a^b f(x)\,dx = (b - a)f(x^*) \tag{13}$$
for some $x^*$ in (a, b).
Implicit Function Theorem
Consider all points (x, y) on the curve with F(x, y) = a. Along this curve the derivative
of y with respect to x is
$$\left. \frac{dy}{dx} \right|_{F=a} = -\frac{\partial F/\partial x}{\partial F/\partial y}. \tag{14}$$
To see this, note that
$$dF = \frac{\partial F}{\partial x}dx + \frac{\partial F}{\partial y}dy.$$
Setting dF = 0 and solving for dy/dx gives the desired result.
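As a rough numerical check of the implicit function theorem, the sketch below compares the slope given by the ratio of partial derivatives with the slope obtained by actually moving along a level curve. The function `F`, the base point, and the Newton solver are all hypothetical choices made for illustration.

```python
def F(x, y):
    return x * x + x * y + y ** 3  # hypothetical example function

def slope_from_theorem(x, y, eps=1e-6):
    """-(dF/dx)/(dF/dy), with central finite differences for the partials."""
    Fx = (F(x + eps, y) - F(x - eps, y)) / (2 * eps)
    Fy = (F(x, y + eps) - F(x, y - eps)) / (2 * eps)
    return -Fx / Fy

def solve_y(x, a, y_guess):
    """Newton's method for the y that keeps F(x, y) = a."""
    y = y_guess
    for _ in range(50):
        Fy = x + 3 * y * y          # analytic dF/dy for this F
        y -= (F(x, y) - a) / Fy
    return y

x0, y0 = 1.0, 1.0
a = F(x0, y0)                        # level of the curve through (1, 1)
dx = 1e-5
slope_along_curve = (solve_y(x0 + dx, a, y0) - solve_y(x0 - dx, a, y0)) / (2 * dx)
# At (1, 1): dF/dx = 3 and dF/dy = 4, so both slopes should be near -0.75.
print(slope_along_curve, slope_from_theorem(x0, y0))
```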
Differentiation of Integrals: Leibniz’s Rule
Let $F(x) \equiv \int_{A(x)}^{B(x)} f(x, t)\,dt$ and assume that $f$ and $\partial f/\partial x$ are continuous in t in [A, B]
and x in [a, b]. Then
$$F'(x) = \int_A^B f_1(x, t)\,dt + f(x, B)B'(x) - f(x, A)A'(x) \tag{15}$$
for all x in [a, b]. If F(x) is defined by an improper integral (A = −∞ and/or B = ∞),
then Leibniz's rule can be employed if $|f_1(x, t)| \le M(t)$ for all x in [a, b] and all t in
[A, B], and the integral $\int M(t)\,dt$ converges in [A, B].
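Leibniz's rule can be verified numerically for a particular case. In this sketch the integrand sin(xt) and the limits A(x) = x, B(x) = x² are assumptions chosen only because their derivatives are easy to write down; Simpson's rule stands in for the integrals.

```python
import math

def simpson(g, lo, hi, n=2000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3

f = lambda x, t: math.sin(x * t)        # integrand
f1 = lambda x, t: t * math.cos(x * t)   # its partial derivative with respect to x
A = lambda x: x                         # lower limit, A'(x) = 1
B = lambda x: x * x                     # upper limit, B'(x) = 2x

def F(x):
    return simpson(lambda t: f(x, t), A(x), B(x))

def F_prime_leibniz(x):
    """Interior term plus the two boundary terms of Leibniz's rule."""
    interior = simpson(lambda t: f1(x, t), A(x), B(x))
    return interior + f(x, B(x)) * 2 * x - f(x, A(x)) * 1

x = 1.5
eps = 1e-5
numeric = (F(x + eps) - F(x - eps)) / (2 * eps)   # direct finite difference
print(numeric, F_prime_leibniz(x))
```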
Homotheticity and Homogeneity
A function F(x) of a vector x is said to be homogeneous of degree k to the point $x_0$ if for
all $\lambda \ne 0$
$$F(\lambda(x - x_0)) = \lambda^k F(x - x_0). \tag{16}$$

If no reference is made to the point of homogeneity, it is generally assumed to be 0. For
k = 1 the function is said to be linearly homogeneous. This does not, of course, imply that
F (·) is linear.
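All partial derivatives of a homogeneous function are homogeneous of one smaller degree.
That is, let $f(x) \equiv \partial F(x)/\partial x_i$. Then $f(\lambda x) = \lambda^{k-1} f(x)$. To prove this, take the
partial derivative of both sides of (16) with respect to $x_i$:
$$\frac{\partial F(\lambda x)}{\partial x_i} = \lambda F_i(\lambda x) \quad \text{and} \quad \frac{\partial \lambda^k F(x)}{\partial x_i} = \lambda^k F_i(x). \tag{17}$$
Equating the two expressions and dividing by λ gives
$$f(\lambda x) = \lambda^{k-1} f(x).$$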
Similarly, all nth-order partial derivatives of F(·) are homogeneous of degree k − n.
Euler's theorem states that the following condition is satisfied by homogeneous functions:
$$\sum_i x_i \frac{\partial F(x)}{\partial x_i} = kF(x). \tag{18}$$
To prove (18), differentiate (16) with respect to λ:
$$\frac{\partial}{\partial \lambda} F(\lambda x) = \sum_i x_i F_i(\lambda x) = k\lambda^{k-1} F(x). \tag{19}$$
Now substitute λ = 1.
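The homogeneity properties can be illustrated with a small numerical sketch. The Cobb-Douglas form below, with exponents summing to k = 1.2, is a hypothetical example; the partial derivatives are approximated by central differences.

```python
def F(x1, x2):
    return x1 ** 0.3 * x2 ** 0.9    # homogeneous of degree k = 1.2

def partial(g, point, i, eps=1e-6):
    """Central finite-difference partial derivative of g in argument i."""
    up, down = list(point), list(point)
    up[i] += eps
    down[i] -= eps
    return (g(*up) - g(*down)) / (2 * eps)

k = 1.2
x = (2.0, 3.0)
lam = 1.7
scaled_x = (lam * x[0], lam * x[1])

scaled = F(*scaled_x)                                    # left side of (16)
euler_sum = sum(x[i] * partial(F, x, i) for i in range(2))   # left side of (18)
ratio = partial(F, scaled_x, 0) / partial(F, x, 0)       # should equal lam**(k-1)

print(scaled, lam ** k * F(*x))
print(euler_sum, k * F(*x))
print(ratio, lam ** (k - 1))
```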
A function F (x) is said to be homothetic if it can be written as
F (x) = h(g(x)), (20)
where g is homogeneous and h is continuous, nondecreasing, and positive.

0.2 Matrices and Linear Algebra
It is assumed that the reader is familiar with the basic notions of matrix manipulations. We
write vectors as boldface lowercase letters and matrices as boldface uppercase letters. I
denotes the identity matrix. 0 denotes the null vector or null matrix as appropriate. 1 is
a vector whose elements are unity. $\iota_n$ is a vector with zeros for all elements except the
nth, which is unity. Transposes are denoted by $'$. Vectors are, unless otherwise specified,
column vectors. Transposed vectors are row vectors. The inverse of a square matrix A is
denoted by $A^{-1}$.
Some of the more advanced matrix operations that will be useful are outlined next.
Vector Equalities and Inequalities
Two vectors are equal, x = z, if every pair of components is equal: $x_i = z_i$. Two vectors
cannot be equal unless they have the same dimension. We adopt the following inequality
conventions:
$$\begin{aligned} x \geqq z \quad & \text{if } x_i \ge z_i \text{ for all } i, \\ x \ge z \quad & \text{if } x_i \ge z_i \text{ for all } i \text{ and } x_i > z_i \text{ for some } i, \\ x > z \quad & \text{if } x_i > z_i \text{ for all } i. \end{aligned} \tag{21}$$
For these three cases x − z is said to be nonnegative, semipositive, and positive, respec-
tively.
Orthogonal Matrices
A square matrix A is orthogonal if
$$A'A = AA' = I. \tag{22}$$
The vectors making up the rows (or columns) of an orthogonal matrix form an orthonormal
set. That is, if we denote the ith row (column) by $a_i$, then
$$a_i' a_j = \begin{cases} 1, & i = j, \\ 0, & i \ne j. \end{cases} \tag{23}$$
Each vector is normalized to have unit length and is orthogonal to all others.
Generalized (Conditional) Inverses
Only nonsingular (square) matrices possess inverses; however, all matrices have generalized
or conditional inverses. The conditional inverse $A^c$ of a matrix A is any matrix
satisfying
$$AA^cA = A. \tag{24}$$
If A is m × n, then $A^c$ is n × m. The conditional inverse of a matrix always exists, but it
need not be unique. If the matrix is nonsingular, then $A^{-1}$ is a conditional inverse.
The Moore-Penrose generalized inverse $A^-$ is a conditional inverse satisfying the additional
conditions
$$A^-AA^- = A^-, \tag{25}$$
and both $A^-A$ and $AA^-$ are symmetric. The Moore-Penrose inverse of any matrix exists
and is unique. These inverses have the following properties:
$$(A')^- = (A^-)', \tag{26a}$$
$$(A^-)^- = A, \tag{26b}$$
$$\operatorname{rank}(A^-) = \operatorname{rank}(A), \tag{26c}$$
$$(A'A)^- = A^-(A')^-, \tag{26d}$$
$$(AA^-)^- = AA^-, \tag{26e}$$
$$(A^-A)^- = A^-A. \tag{26f}$$
Also $AA^-$, $A^-A$, $I - AA^-$, and $I - A^-A$ are all symmetric and idempotent.
If A is an m × n matrix of rank m, then $A^- = A'(AA')^{-1}$ is the right inverse of A (i.e.,
$AA^- = I$). Similarly, if the rank of A is n, then $A^- = (A'A)^{-1}A'$ is the left inverse.
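The right-inverse formula for a matrix of full row rank can be checked by hand on a small case. The sketch below assumes a particular 2 × 3 matrix and uses small pure-Python helpers (hypothetical names, written only for this illustration) to confirm the defining conditions of the conditional and Moore-Penrose inverses.

```python
def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def close(P, Q, tol=1e-12):
    return all(abs(p - q) < tol for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 1.0]]                                  # 2 x 3 with rank 2
At = transpose(A)
A_minus = matmul(At, inverse_2x2(matmul(A, At)))       # A'(AA')^{-1}, a 3 x 2 matrix

I2 = [[1.0, 0.0], [0.0, 1.0]]
P = matmul(A_minus, A)                                 # 3 x 3 matrix (generalized inverse times A)

print(close(matmul(A, A_minus), I2))                        # right inverse
print(close(matmul(matmul(A, A_minus), A), A))              # condition (24)
print(close(matmul(matmul(A_minus, A), A_minus), A_minus))  # condition (25)
print(close(P, transpose(P)), close(matmul(P, P), P))       # symmetric and idempotent
```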
Vector and Matrix Norms
A norm is a single nonnegative number assigned to a matrix or vector as a measure of
its magnitude. It is similar to the absolute value of a real number or the modulus of a
complex number. A convenient, and the most common, vector norm is the Euclidean norm
(or length) of the vector x:
$$\|x\| \equiv \sqrt{x'x} \equiv \left( \sum x_i^2 \right)^{1/2}. \tag{27}$$
For nonnegative vectors the linear norm is also often used:
$$L(x) \equiv \mathbf{1}'x \equiv \sum x_i. \tag{28}$$
The Euclidean norm of a matrix is defined similarly:
$$\|A\|_E \equiv \left( \sum \sum a_{ij}^2 \right)^{1/2}. \tag{29}$$
The Euclidean matrix norm should not be confused with the spectral norm, which is induced
by the Euclidean vector norm:
$$\|A\| \equiv \sup_{x \ne 0} \frac{\|Ax\|}{\|x\|}. \tag{30}$$
The spectral norm is the square root of the largest eigenvalue of $A'A$.
Other types of norms are possible. For example, the Hölder norm is defined as
$$h_n(x) \equiv \left( \sum |x_i|^n \right)^{1/n}, \quad 1 \le n, \tag{31}$$
and similarly for matrices, with the additional requirement that $n \le 2$. $\rho(x) \equiv \max |x_i|$
and $M(A) \equiv \max |a_{ij}|$ for an n × n matrix are also norms.
All norms have the following properties (A denotes a vector or matrix):
$$\|A\| \ge 0, \tag{32a}$$
$$\|A\| = 0 \text{ iff } A = \mathbf{0}, \tag{32b}$$
$$\|cA\| = |c|\,\|A\|, \tag{32c}$$
$$\|A + B\| \le \|A\| + \|B\|, \tag{32d}$$
$$\|AB\| \le \|A\|\,\|B\|, \tag{32e}$$
$$\|A - B\| \ge \big|\,\|A\| - \|B\|\,\big|, \tag{32f}$$
$$\|A^{-1}\| \ge \|A\|^{-1} \quad \text{(square matrices)}. \tag{32g}$$
Properties (32d) and (32f) apply only to matrices or vectors of the same order: (32d)
is known as the triangle inequality; (32f) shows that norms are smooth functions. That is,
whenever $\big|\,\|A\| - \|B\|\,\big| < \varepsilon$, then A and B are similar in the sense that $|a_{ij} - b_{ij}| < \delta$ for
all i and j.
Vector Differentiation
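Let f(x) be a function of a vector x. Then the gradient of f is
$$\nabla f \equiv \frac{\partial f}{\partial x} = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right)'. \tag{33}$$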
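The Hessian matrix is the n × n matrix of second partial derivatives
$$Hf \equiv \frac{\partial^2 f}{\partial x\,\partial x'} = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & & \dfrac{\partial^2 f}{\partial x_n^2} \end{pmatrix}. \tag{34}$$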
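The derivative of the linear form $a'x$ is
$$\frac{\partial (a'x)}{\partial x} = a. \tag{35}$$
The derivatives of the quadratic form $x'Ax$ are
$$\frac{\partial (x'Ax)}{\partial x} = (A + A')x, \qquad \frac{\partial^2 (x'Ax)}{\partial x\,\partial x'} = (A + A'). \tag{36}$$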
Note that if A is symmetric, then the above look like the standard results from calculus:
$\partial ax^2/\partial x = 2ax$, $\partial^2 ax^2/\partial x^2 = 2a$.
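The gradient of a quadratic form can be checked against finite differences. The sketch below assumes a small, deliberately nonsymmetric matrix so that the role of the transpose in the formula is visible; the helper names are hypothetical.

```python
def quad_form(A, x):
    """x'Ax for a square matrix A stored as a list of rows."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def numeric_gradient(A, x, eps=1e-6):
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((quad_form(A, up) - quad_form(A, down)) / (2 * eps))
    return grad

def formula_gradient(A, x):
    """(A + A')x, the first derivative of the quadratic form."""
    n = len(x)
    return [sum((A[i][j] + A[j][i]) * x[j] for j in range(n)) for i in range(n)]

A = [[1.0, 4.0], [-2.0, 3.0]]   # deliberately not symmetric
x = [0.5, -1.5]
print(numeric_gradient(A, x))
print(formula_gradient(A, x))   # (A + A')x = [-2.0, -8.0] for these values
```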
0.3 Constrained Optimization
The conditions for an unconstrained strong local maximum of a function of several variables
are that the gradient vector and Hessian matrix with respect to the decision variables be zero
and negative definite, respectively:
$$\nabla f = \mathbf{0}, \qquad z'(Hf)z < 0 \quad \text{all nonzero } z. \tag{37}$$
(For an unconstrained strong local minimum, the Hessian matrix must be positive definite.)
The Method of Lagrange
For maximization (or minimization) of a function subject to an equality constraint, we use
Lagrangian methods. For example, to solve the problem max f(x) subject to g(x) = a, we
define the Lagrangian
L(x, λ) ≡ f(x) − λ(g(x) − a) (38)
and maximize with respect to x and λ:
∇f − λ∇g = 0, g(x) − a = 0. (39)
The solution to (39) gives for $x^*$ the maximizing arguments and for $\lambda^*$ the marginal cost of
the constraint. That is,
$$\frac{df(x^*)}{da} = \sum \frac{\partial f(x^*)}{\partial x_i} \frac{dx_i^*}{da} = \lambda. \tag{40}$$
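The interpretation of the multiplier as the marginal effect of relaxing the constraint can be seen in a tiny example. The problem below (maximize x₁x₂ subject to x₁ + x₂ = a) is a hypothetical illustration chosen because its first-order conditions solve in closed form.

```python
def solve(a):
    """Closed-form solution of max x1*x2 subject to x1 + x2 = a.
    First-order conditions: x2 - lam = 0, x1 - lam = 0, x1 + x2 = a."""
    x1 = x2 = a / 2.0
    lam = a / 2.0
    return x1, x2, lam

def optimal_value(a):
    x1, x2, _ = solve(a)
    return x1 * x2

a, da = 3.0, 1e-5
_, _, lam = solve(a)
# Finite-difference derivative of the optimized objective with respect to a.
marginal = (optimal_value(a + da) - optimal_value(a - da)) / (2 * da)
print(marginal, lam)   # the two numbers should agree
```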
The second-order condition for this constrained optimization can be stated with the bordered
Hessian
$$H^B \equiv \begin{pmatrix} Hf & \nabla g \\ (\nabla g)' & 0 \end{pmatrix}. \tag{41}$$
For a maximum the bordered Hessian must be negative semidefinite. For a minimum the
bordered Hessian must be positive semidefinite.
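For multiple constraints $g_i(x) = a_i$ which are functionally independent, the first- and
second-order conditions for a maximum are
$$\nabla f - \sum \lambda_i \nabla g_i = \mathbf{0}, \tag{42a}$$
$$g_i(x) = a_i \quad \text{all } i, \tag{42b}$$
$$z' \begin{pmatrix} Hf & \nabla g_1 & \cdots & \nabla g_k \\ (\nabla g_1)' & & & \\ \vdots & & \mathbf{0} & \\ (\nabla g_k)' & & & \end{pmatrix} z \le 0 \quad \text{all } z. \tag{42c}$$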
The Method of Kuhn and Tucker
For maximization (or minimization) subject to an inequality constraint on the variables,
we use the method of Kuhn-Tucker. For example, the solution to the problem max f(x)
subject to $x \ge x_0$ is
$$\nabla f \le \mathbf{0}, \qquad (x - x_0)'\nabla f = 0. \tag{43}$$
For a functional inequality constraint we combine the methods. For example, to solve
the problem max f(x) subject to g(x) ≥ a, we pose the equivalent problem max f(x)
subject to g(x) = b, b ≥ a. Form the Lagrangian
$$L(x, \lambda, b) \equiv f(x) - \lambda(g(x) - b), \tag{44}$$
and use the Kuhn-Tucker method to get
$$\nabla L = \nabla f - \lambda \nabla g = \mathbf{0}, \tag{45a}$$
$$\frac{\partial L}{\partial \lambda} = -g(x) + b = 0, \tag{45b}$$
$$\frac{\partial L}{\partial b} = \lambda \le 0, \tag{45c}$$
$$(b - a)\lambda = 0. \tag{45d}$$
Equations (45a), (45b) correspond to the Lagrangian problem in (39). Equations (45c),
(45d) correspond to the Kuhn-Tucker problem in (43) with b as the variable.
Linear Programming
One very common type of optimization problem is the linear program. All linear
programming problems may be written as in (46). The methods for converting problems to
this form are given in any standard linear programming text.
$$\text{Min } p = c_1'x_1 + c_2'x_2, \tag{46a}$$
$$\text{Subject to } A_{11}x_1 + A_{12}x_2 \ge b_1, \tag{46b}$$
$$A_{21}x_1 + A_{22}x_2 = b_2, \tag{46c}$$
$$x_1 \ge \mathbf{0}. \tag{46d}$$
Here $x_1$ and $x_2$ are vectors of n and m control variables. The first set must be nonnegative
from (46d). Equation (46a) is the objective function. $A_{11}$, $A_{12}$, and $b_1$ all have r rows, and
(46b) represents inequality constraints. $A_{21}$, $A_{22}$, and $b_2$ have q rows, and (46c) represents
equality constraints.
Associated with the problem in (46) is the dual program
$$\text{Max } d = b_1'z_1 + b_2'z_2, \tag{47a}$$
$$\text{Subject to } A_{11}'z_1 + A_{21}'z_2 \le c_1, \tag{47b}$$
$$A_{12}'z_1 + A_{22}'z_2 = c_2, \tag{47c}$$
$$z_1 \ge \mathbf{0}. \tag{47d}$$
In this case $z_1$ and $z_2$ are vectors of r and q elements, respectively. For each of the n
nonnegative controls in the primal ($x_1$) in (46d), there is one inequality constraint (47b) in the
dual. For each of the m unconstrained controls ($x_2$) there is one equality constraint (47c).
The r inequality constraints in the primal (46b) correspond to the r nonnegative controls
in the dual $z_1$, and the q equality constraints (46c) correspond to the q unconstrained dual
controls $z_2$.
We state without proof the following theorems of duality.
Theorem 1 For the primal and dual problems in (46) and (47), one of the following four
cases is true:
(i) Both the primal and dual problems are infeasible (the constraints (46b)-(d) and (47b)-(d)
cannot be simultaneously met).
(ii) The primal is feasible and the dual is infeasible. p = −∞, and one or more elements
of $x^*$ are unbounded.
(iii) The dual is feasible and the primal is infeasible. d = ∞, and one or more elements of
$z^*$ are unbounded.
(iv) Both primal and dual are feasible. p = d and |p| < ∞. The optimal vectors have all
finite elements.

Theorem 2 For any feasible x and z, p(x) − d(z) ≥ 0.
Theorem 3 (of complementary slackness) For the optimal controls, either $x^*_{1i} = 0$ or the
ith row of (47b) holds with equality. Also, either $z^*_{1i} = 0$ or the ith row of (46b) holds with
equality.
Each dual variable can be interpreted as the shadow price of the associated constraint in
the primal. That is, the optimal value of a dual variable gives the change in the objective
function for a unit increase in the right-hand side constraint (provided that the optimal basis
does not change).
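The duality results can be illustrated on a two-variable problem small enough to solve by enumerating vertices. The sketch below is only an illustration under stated assumptions: the numbers in `A`, `b`, and `c` are made up, there are no equality constraints or unconstrained controls, and `solve_2d` is a hypothetical helper that works only for bounded two-variable programs, not a general linear programming method.

```python
from itertools import combinations

def solve_2d(c, rows, rhs, sense, maximize):
    """Solve a bounded two-variable LP with nonnegative controls by checking
    every intersection of two constraint boundaries. `sense` is '>=' or '<='."""
    lines = list(zip(rows, rhs)) + [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0)]

    def feasible(x):
        if min(x) < -1e-9:
            return False
        for r, b in zip(rows, rhs):
            lhs = r[0] * x[0] + r[1] * x[1]
            if (sense == '>=' and lhs < b - 1e-9) or (sense == '<=' and lhs > b + 1e-9):
                return False
        return True

    best = None
    for (r1, b1), (r2, b2) in combinations(lines, 2):
        det = r1[0] * r2[1] - r1[1] * r2[0]
        if abs(det) < 1e-12:
            continue                          # parallel boundaries never meet
        x = ((b1 * r2[1] - r1[1] * b2) / det, (r1[0] * b2 - b1 * r2[0]) / det)
        if feasible(x):
            value = c[0] * x[0] + c[1] * x[1]
            if best is None or (value > best[0] if maximize else value < best[0]):
                best = (value, x)
    return best

A = [(1.0, 2.0), (3.0, 1.0)]                  # inequality-constraint matrix
b = [4.0, 6.0]
c = [2.0, 3.0]
A_t = [(A[0][0], A[1][0]), (A[0][1], A[1][1])]   # transpose of A for the dual

p, x = solve_2d(c, A, b, '>=', maximize=False)   # primal: min c'x, Ax >= b, x >= 0
d, z = solve_2d(b, A_t, c, '<=', maximize=True)  # dual: max b'z, A'z <= c, z >= 0
print(p, x)   # optimum 6.8 at x = (1.6, 1.2), up to rounding
print(d, z)   # optimum 6.8 at z = (1.4, 0.2), up to rounding

# Shadow price: raising b_1 by 0.1 changes the primal optimum by about 0.1 * z_1.
p_up, _ = solve_2d(c, A, [4.1, 6.0], '>=', maximize=False)
print(p_up - p, 0.1 * z[0])
```

In this example both primal controls are positive and both dual constraints bind, and likewise with the roles reversed, which is the pattern complementary slackness requires.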
0.4 Probability
Central and Noncentral Moments
The moments of a random variable $\tilde{x}$ about some value a are defined as
$$\mu_n'(a) \equiv \int_{-\infty}^{\infty} (x - a)^n f(x)\,dx \tag{48}$$
provided the integral exists. If a is taken as the mean of $\tilde{x}$, then the moments are called
central moments and are conventionally written without the prime. For any other a the
moments are noncentral moments. The most common alternative is a = 0.

Central and noncentral moments are related by
$$\mu_n' = \sum_{i=0}^{n} \binom{n}{i} \mu_{n-i} (\mu_1')^i, \qquad \mu_n = \sum_{i=0}^{n} \binom{n}{i} \mu_{n-i}' (-\mu_1')^i. \tag{49}$$
Characteristic Function and Related Functions
The characteristic function for the density function f(x) is
$$\phi(t) \equiv E[e^{it\tilde{x}}] \equiv \int_{-\infty}^{\infty} e^{itx} f(x)\,dx \quad \text{where } i \equiv \sqrt{-1}. \tag{50}$$
The nth noncentral moment of f(x) about the origin is related to the characteristic function
by
$$\mu_n' \equiv E[\tilde{x}^n] = i^{-n} \phi^{(n)}(0). \tag{51}$$
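This property is easily verified by differentiating (50). Using Leibniz's rule (15) gives
$$\frac{\partial^n \phi(t)}{\partial t^n} = \int_{-\infty}^{\infty} (ix)^n e^{itx} f(x)\,dx, \qquad \frac{\partial^n \phi(0)}{\partial t^n} = i^n \int_{-\infty}^{\infty} x^n f(x)\,dx = i^n \mu_n'. \tag{52}$$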
The moment generating function
$$M(q) \equiv \int_{-\infty}^{\infty} e^{qx} f(x)\,dx = \phi(-iq) \tag{53}$$
is a related real-valued function which is also useful in determining the moments of a
distribution, $M^{(n)}(0) = \mu_n'$. The characteristic function of a distribution is always defined and
uniquely determines the distribution. The moment generating function is undefined if the
integral in (53) diverges. Finite moments of all orders are necessary, but not sufficient, for it
to exist in a neighborhood of q = 0.
For strictly nonnegative random variables, the Laplace transform $L(r) = M(-r) = \phi(ir)$
is often used. The Laplace transform is defined for all piecewise continuous density
functions of nonnegative random variables.
Chebyshev’s Inequality
If the mean and variance, µ and σ², of a distribution exist, then for all t > 0,
$$\Pr[|\tilde{x} - \mu| \ge t\sigma] \le t^{-2}. \tag{54}$$
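Chebyshev's bound holds for any distribution with finite variance, so it is usually far from tight. As a hedged illustration, the sketch below compares the bound with the exact tail probability of an exponential variable with rate one (mean 1, standard deviation 1), a distribution chosen here only because its tails have a closed form.

```python
import math

def exact_tail(t):
    """Pr[|x - mu| >= t*sigma] for an exponential variable with rate 1,
    for which mu = sigma = 1."""
    upper = math.exp(-(1.0 + t))                               # Pr[x >= 1 + t]
    lower = 1.0 - math.exp(-(1.0 - t)) if t < 1.0 else 0.0     # Pr[x <= 1 - t]
    return upper + lower

for t in (0.5, 1.0, 2.0, 3.0):
    # exact probability, Chebyshev bound, and whether the bound holds
    print(t, exact_tail(t), t ** -2, exact_tail(t) <= t ** -2)
```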
Normal Density
If $\tilde{x}$ is a normally distributed random variable with mean µ and variance σ², its density
function and characteristic function are
$$f(x) = (2\pi\sigma^2)^{-1/2} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right), \qquad \phi(t) = \exp\left( i\mu t - \frac{\sigma^2 t^2}{2} \right). \tag{55}$$
The higher central moments of a normal random variable are
$$\mu_n \equiv E[(\tilde{x} - \mu)^n] = \begin{cases} 0, & n \text{ odd}, \\ (n - 1)(n - 3) \cdots 3 \cdot 1 \cdot \sigma^n, & n \text{ even}. \end{cases} \tag{56}$$
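The formula for the central moments of a normal variable can be checked by direct numerical integration. The sketch below assumes arbitrary values for the mean and standard deviation and uses a midpoint rule over twelve standard deviations on either side of the mean.

```python
import math

def central_moment(n, mu, sigma, steps=40001):
    """Midpoint-rule integral of (x - mu)^n against the normal density."""
    lo, hi = mu - 12.0 * sigma, mu + 12.0 * sigma
    dx = (hi - lo) / steps
    norm = math.sqrt(2 * math.pi * sigma ** 2)
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        total += (x - mu) ** n * math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / norm * dx
    return total

def formula(n, sigma):
    """Zero for odd n; (n-1)(n-3)...3*1 * sigma^n for even n."""
    if n % 2 == 1:
        return 0.0
    product = 1.0
    for factor in range(n - 1, 0, -2):
        product *= factor
    return product * sigma ** n

mu, sigma = 0.7, 1.3
for n in range(1, 7):
    print(n, central_moment(n, mu, sigma), formula(n, sigma))
```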
If ˜x is a normal random variable with mean zero and variance one, it is said to be a standard
normal deviate. The density and distribution functions of a standard normal deviate are
often denoted by n(x) and N(x).
Probability Limit Theorems
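Let $\tilde{x}_i$ represent a sequence of independent random variables. If the $\tilde{x}_i$ are identically
distributed and have finite expectation µ, then for any constant ε > 0,
$$\lim_{n \to \infty} \Pr\left[ \left| \frac{1}{n} \sum_{i=1}^{n} \tilde{x}_i - \mu \right| < \varepsilon \right] = 1. \tag{57}$$
This relation can also be expressed as
$$\operatorname{plim} \left( \frac{1}{n} \sum_{i=1}^{n} \tilde{x}_i \right) = \mu. \tag{58}$$
If the $\tilde{x}_i$ are not identically distributed, but each has finite expectation $\mu_i$ and finite
variance, then a similar relation holds:
$$\operatorname{plim} \left[ \frac{1}{n} \left( \sum_{i=1}^{n} \tilde{x}_i - \sum_{i=1}^{n} \mu_i \right) \right] = 0. \tag{59}$$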
These results are two different forms of the weak law of large numbers. If the $\tilde{x}_i$ are
identically distributed with common finite expectation µ and variance σ², then the central limit
theorem applies. This theorem indicates that the approach to the limits above is asymptotically
normal; that is, the sample mean of the $\tilde{x}_i$ is approximately normal with mean µ and
variance σ²/n, or
$$\lim_{n \to \infty} \Pr\left[ a < \frac{\sqrt{n}}{\sigma} \left( \frac{1}{n} \sum_{i=1}^{n} \tilde{x}_i - \mu \right) < b \right] = N(b) - N(a). \tag{60}$$
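A small simulation gives a feel for the central limit theorem. The sketch below is an illustration under assumptions not in the text: the draws are uniform on (0, 1), the sample size is 50, and the seed, interval, and number of trials are arbitrary. The simulated frequency should be close to, but not exactly equal to, the normal probability.

```python
import math
import random

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def standardized_mean(rng, n):
    """sqrt(n)/sigma * (sample mean - mu) for n uniform(0, 1) draws."""
    mu, sigma = 0.5, math.sqrt(1.0 / 12.0)
    mean = sum(rng.random() for _ in range(n)) / n
    return math.sqrt(n) / sigma * (mean - mu)

rng = random.Random(12345)            # fixed seed so the run is repeatable
a, b, n, trials = -1.0, 1.0, 50, 4000
hits = sum(1 for _ in range(trials) if a < standardized_mean(rng, n) < b)
frequency = hits / trials
limit = std_normal_cdf(b) - std_normal_cdf(a)
print(frequency, limit)               # simulated frequency versus N(b) - N(a)
```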
Bivariate Normal Variables
Let $\tilde{x}_1$ and $\tilde{x}_2$ be bivariate normal random variables with means $\mu_i$, variances $\sigma_i^2$, and
covariance $\sigma_{12}$. The weighted sum $w_1\tilde{x}_1 + w_2\tilde{x}_2$ is normal with mean and variance
$$\mu = w_1\mu_1 + w_2\mu_2, \qquad \sigma^2 = w_1^2\sigma_1^2 + 2w_1w_2\sigma_{12} + w_2^2\sigma_2^2. \tag{61}$$
For such $\tilde{x}_1$ and $\tilde{x}_2$ and for a differentiable function h(x),
$$\operatorname{Cov}[\tilde{x}_1, h(\tilde{x}_2)] = E[h'(\tilde{x}_2)]\sigma_{12}. \tag{62}$$
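This property can be proved as follows. If $\tilde{x}_1$ and $\tilde{x}_2$ are bivariate normals, then from
our understanding of regression relationships we may write
$$\tilde{x}_1 = a + b\tilde{x}_2 + \tilde{e}, \tag{63}$$
where $b \equiv \sigma_{12}/\sigma_2^2$ and $\tilde{e}$ is independent of $\tilde{x}_2$. Therefore
$$\begin{aligned} \operatorname{Cov}[\tilde{x}_1, h(\tilde{x}_2)] &= \operatorname{Cov}[a + b\tilde{x}_2 + \tilde{e}, h(\tilde{x}_2)] \\ &= b \operatorname{Cov}[\tilde{x}_2, h(\tilde{x}_2)] \\ &= b E[(\tilde{x}_2 - \mu_2)h(\tilde{x}_2)] \\ &= b \int_{-\infty}^{\infty} (x_2 - \mu_2)h(x_2)f(x_2)\,dx_2, \end{aligned} \tag{64}$$
where $f(x_2)$ is the univariate normal density defined in (55). Now
$$\frac{df(x_2)}{dx_2} = -\frac{x_2 - \mu_2}{\sigma_2^2} f(x_2), \tag{65}$$
so the last line in (64) may be rewritten as
$$\begin{aligned} \operatorname{Cov}[\tilde{x}_1, h(\tilde{x}_2)] &= -b\sigma_2^2 \int_{-\infty}^{\infty} h(x_2)\,df(x_2) \\ &= -b\sigma_2^2 h(x_2)f(x_2) \Big|_{-\infty}^{\infty} + b\sigma_2^2 \int_{-\infty}^{\infty} h'(x_2)f(x_2)\,dx_2 \end{aligned} \tag{66}$$
upon integrating by parts. Then if $h(x_2) = o(\exp(x_2^2))$, the first term vanishes at both
limits, and the remaining term is just $E[h'(\tilde{x}_2)]\sigma_{12}$.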
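The key step of the proof, that the regression slope times E[(x̃₂ − µ₂)h(x̃₂)] equals σ₁₂E[h′(x̃₂)], can be checked by quadrature. The parameter values and the test function h in this sketch are arbitrary assumptions; only the marginal density of x̃₂ is needed.

```python
import math

mu2, var2, sigma12 = 0.4, 1.5, 0.6        # hypothetical parameter values
b = sigma12 / var2                        # regression slope of x1 on x2

def density(x):
    """Univariate normal density of x2."""
    return math.exp(-(x - mu2) ** 2 / (2 * var2)) / math.sqrt(2 * math.pi * var2)

def expectation(g, steps=40001):
    """Midpoint-rule approximation of E[g(x2)] over mu2 +/- 12 standard deviations."""
    s = math.sqrt(var2)
    lo, hi = mu2 - 12 * s, mu2 + 12 * s
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        total += g(x) * density(x) * dx
    return total

h = lambda x: x ** 3 + math.sin(x)            # a differentiable test function
h_prime = lambda x: 3 * x ** 2 + math.cos(x)  # its derivative

left = b * expectation(lambda x: (x - mu2) * h(x))   # b E[(x2 - mu2) h(x2)]
right = sigma12 * expectation(h_prime)               # sigma12 E[h'(x2)]
print(left, right)
```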
Lognormal Variables
If $\tilde{x}$ is normally distributed, then $\tilde{z} \equiv e^{\tilde{x}}$ is said to be lognormal. The lognormal density
function is
$$f(z) = (\sqrt{2\pi}\,\sigma z)^{-1} \exp\left( -\frac{(\ln z - \mu)^2}{2\sigma^2} \right). \tag{67}$$
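The moments are
$$\mu_n' = \exp\left( n\mu + \frac{n^2\sigma^2}{2} \right), \qquad \bar{z} = \exp\left( \mu + \frac{\sigma^2}{2} \right), \qquad \operatorname{Var}(z) = \exp(2\mu + \sigma^2)(\exp(\sigma^2) - 1). \tag{68}$$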
The values of $\mu_n'$ can be derived by setting it = n in the characteristic function for a normal
random variable (55). A quantity that is very useful to know in some financial models is
the truncated mean E(z; z > a):
$$\int_a^{\infty} z f(z)\,dz = \exp\left( \mu + \frac{\sigma^2}{2} \right) N\left( \frac{\mu - \ln a}{\sigma} + \sigma \right). \tag{69}$$

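To verify (69), make the substitutions $z = e^x$ and dz = z dx to obtain
$$\begin{aligned} \int_a^{\infty} z f(z)\,dz &= (2\pi\sigma^2)^{-1/2} \int_{\ln a}^{\infty} e^x \exp\left( \frac{-(x - \mu)^2}{2\sigma^2} \right) dx \\ &= (2\pi\sigma^2)^{-1/2} \int_{\ln a}^{\infty} \exp\left( \frac{-(x - (\mu + \sigma^2))^2}{2\sigma^2} \right) \exp\left( \frac{2\mu\sigma^2 + \sigma^4}{2\sigma^2} \right) dx \\ &= \exp\left( \mu + \frac{\sigma^2}{2} \right) \frac{1}{\sigma} \int_{\ln a}^{\infty} n\left( \frac{x - \mu - \sigma^2}{\sigma} \right) dx. \end{aligned} \tag{70}$$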
Evaluating the integral confirms (69).
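The truncated-mean formula can also be confirmed numerically. In the sketch below the parameter values are arbitrary assumptions, the standard normal distribution function is built from `math.erf`, and the integral is taken in the log variable with a midpoint rule and a far upper cutoff.

```python
import math

def N(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def truncated_mean_formula(mu, sigma, a):
    """exp(mu + sigma^2/2) * N((mu - ln a)/sigma + sigma)."""
    return math.exp(mu + sigma ** 2 / 2) * N((mu - math.log(a)) / sigma + sigma)

def truncated_mean_numeric(mu, sigma, a, steps=100000):
    """Midpoint-rule integral of e^x times the normal density from ln(a) upward,
    cut off 12 standard deviations above mu + sigma^2."""
    lo = math.log(a)
    hi = max(lo, mu + sigma ** 2) + 12 * sigma
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        total += math.exp(x - (x - mu) ** 2 / (2 * sigma ** 2)) * dx
    return total / math.sqrt(2 * math.pi * sigma ** 2)

mu, sigma, a = 0.1, 0.4, 1.2
print(truncated_mean_formula(mu, sigma, a), truncated_mean_numeric(mu, sigma, a))
```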
“Fair Games”
If the conditional mean of one random variable does not depend on the realization of an-
other, then the first random variable is said to be conditionally independent of the second.
That is, ˜x is conditionally independent of ˜y if E[˜x|y] = E[˜x] for all realizations y. If, in
addition, the first random variable has a zero mean, then it is said to be noise or a fair game
with respect to the second, E[˜x|y] = 0.
The name “conditional independence” is applied because this statistical property is in-
termediate between independence and zero correlation, as we now show. ˜x and ˜y are un-
correlated if Cov(˜x, ˜y) = 0; they are independent if Cov(f(˜x), g(˜y)) = 0 for all pairs of
functions f and g. Under mild regularity conditions ˜x is conditionally independent of ˜y if
and only if Cov[˜x, g(˜y)] = 0 for all functions g(˜y).
To prove necessity, assume E[˜x|y] = ¯x. Then
Cov[˜x, g(˜y)] = E[˜xg(˜y)] − E[g(˜y )]E[˜x]
= E[E[˜xg(˜y)|y]] − E[g(˜y)]¯x
= E[g(˜y)E[˜x|y]] − E[g(˜y)]¯x = 0.
(71)
The second line follows from the law of iterated conditional expectations. In the third line
g(y) can be removed from the expectation conditional on y, and the last equality follows
from the assumption that the conditional expectation of x is independent of y.
In proving sufficiency we assume that $\tilde{y}$ is a discrete random variable taking on n
outcomes $y_i$ with probabilities $\pi_i$. Define the conditional excess mean $m(y) \equiv E[\tilde{x}|y] - E[\tilde{x}]$.
Then for any function g(y),
$$\begin{aligned} \operatorname{Cov}[\tilde{x}, g(\tilde{y})] &= E[(\tilde{x} - \bar{x})g(\tilde{y})] \\ &= E[E(\tilde{x} - \bar{x}|y)g(\tilde{y})] \\ &= \sum_{i=1}^{n} \pi_i m(y_i)g(y_i) = 0 \end{aligned} \tag{72}$$
by assumption. Now consider the set of n functions defined as $g(y; k) = i^k$ when $y = y_i$
for k = 1, . . . , n. For these n functions the last line in (72) may be written as
$$G\Pi m = \mathbf{0}, \tag{73}$$
where Π is a diagonal matrix with $\Pi_{ii} = \pi_i$, G is a matrix with $G_{ki} = i^k$, and m is a vector
with $m_i = m(y_i)$. Since Π is diagonal (and $\pi_i > 0$), it is nonsingular. G is nonsingular by
construction. Thus the only solution to (73) is m = 0 or $E(\tilde{x}|y) = E(\tilde{x})$.
If y is continuous, then under mild regularity conditions the same result is true. Note that
the functions used were all monotone. Thus a stronger sufficient condition for conditional
independence is that Cov[x, g(y)] = 0 for all increasing functions.
A special case of this theorem is ˜x is a fair game with respect to ˜y if and only if
E(˜x) = 0 and Cov[˜x, g(˜y)] = 0 for all functions g(y). Also equivalent is the statement
that E[˜xg(˜y)] = 0 for all functions g(y).
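A small discrete example may help separate the three notions. In the sketch below, which is a hypothetical construction, y takes the values −1, 0, 1 with equal probability and x equals y times an independent fair sign. Then x is a fair game with respect to y, so its covariance with every function of y is zero, yet x and y are not independent because |x| = |y|.

```python
from fractions import Fraction
from itertools import product

# Six equally likely (y, s) pairs; each outcome is (probability, x, y) with x = y * s.
outcomes = [(Fraction(1, 6), y * s, y) for y, s in product((-1, 0, 1), (-1, 1))]

def expect(g):
    return sum(p * g(x, y) for p, x, y in outcomes)

def cov(f, g):
    """Cov[f(x), g(y)] under the joint distribution above."""
    return (expect(lambda x, y: f(x) * g(y))
            - expect(lambda x, y: f(x)) * expect(lambda x, y: g(y)))

identity = lambda v: v
test_functions = [identity, abs, lambda v: v ** 2, lambda v: Fraction(2) ** v]

print([cov(identity, g) for g in test_functions])   # all zero: fair game
print(cov(abs, abs))                                # 2/9: not independent
```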
Jensen’s Inequality
If $\tilde{x}$ is a random variable with positive dispersion and density function f(x) and G is a
concave function of x, $G''(x) < 0$, then
E[G(x)] < G[E(x)]. (74)
To prove this inequality, we use Taylor's series (10) to write
$$G(x) = G(\bar{x}) + (x - \bar{x})G'(\bar{x}) + \tfrac{1}{2}(x - \bar{x})^2 G''(x^*(x)). \tag{75}$$
Then
$$\begin{aligned} E[G(x)] &= \int G(x)f(x)\,dx \\ &= G(\bar{x}) \int f(x)\,dx + G'(\bar{x}) \int (x - \bar{x})f(x)\,dx + \tfrac{1}{2} \int G''(x^*(x))(x - \bar{x})^2 f(x)\,dx \\ &= G(\bar{x}) + \tfrac{1}{2} \int G''(x^*(x))(x - \bar{x})^2 f(x)\,dx \\ &< G(\bar{x}). \end{aligned} \tag{76}$$
The last line follows since the integrand is uniformly negative.
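Jensen's inequality is easy to see on a small example. The sketch below assumes a discrete distribution with made-up outcomes and probabilities in place of a density, and tries three strictly concave functions.

```python
import math

outcomes = [0.5, 1.0, 2.0, 4.0]          # hypothetical payoffs
probs = [0.1, 0.4, 0.3, 0.2]             # their probabilities (sum to one)

def expect(g):
    return sum(p * g(x) for p, x in zip(probs, outcomes))

mean = expect(lambda x: x)
concave_functions = (("log", math.log), ("sqrt", math.sqrt), ("-x^2", lambda x: -x * x))
for name, G in concave_functions:
    # E[G(x)] should fall strictly below G(E[x]) for each concave G
    print(name, expect(G), G(mean), expect(G) < G(mean))
```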
Stochastic Processes
A stochastic process is a time series of random variables $\tilde{X}_0, \tilde{X}_1, \ldots, \tilde{X}_N$ with
realizations $x_0, x_1, \ldots, x_N$. Usually the random variables in a stochastic process are related in
