Handbook of Industrial Automation (Part 2), Richard L. Shell and Ernest L. Hall

2.2.4 Equivalence Relations

Now we concentrate our attention on the properties of a binary relation $\mathcal{R}$ defined in a set $X$.

1. $\mathcal{R}$ is called reflexive in $X$ if and only if, for all $x \in X$, $x\mathcal{R}x$.
2. $\mathcal{R}$ is called symmetric in $X$ if and only if, for all $x, y \in X$, $x\mathcal{R}y$ implies $y\mathcal{R}x$.
3. $\mathcal{R}$ is called transitive in $X$ if and only if, for all $x, y, z \in X$, $x\mathcal{R}y$ and $y\mathcal{R}z$ imply $x\mathcal{R}z$.

A binary relation $\mathcal{R}$ is called an equivalence relation on $X$ if it is reflexive, symmetric, and transitive.
As an example, consider the set $Z$ of integers and let $n$ be an arbitrary positive integer. The congruence relation modulo $n$ on the set $Z$ is defined by $x \equiv y \pmod{n}$ if and only if $x - y = kn$ for some $k \in Z$. The congruence relation is an equivalence relation on $Z$.
Proof

1. For each $x \in Z$, $x - x = 0 \cdot n$. This means that $x \equiv x \pmod{n}$, which implies that the congruence relation is reflexive.
2. If $x \equiv y \pmod{n}$, then $x - y = kn$ for some $k \in Z$. Multiplying both sides of the last equality by $-1$, we get $y - x = -kn$, which implies that $y \equiv x \pmod{n}$. Thus, the congruence relation is symmetric.
3. If $x \equiv y \pmod{n}$ and $y \equiv z \pmod{n}$, we have $x - y = k_1 n$ and $y - z = k_2 n$ for some $k_1$ and $k_2$ in $Z$. Writing $x - z = (x - y) + (y - z)$, we get $x - z = (k_1 + k_2)n$. Since $k_1 + k_2 \in Z$, we conclude that $x \equiv z \pmod{n}$. This shows that the congruence relation is transitive.

From 1–3 it follows that the congruence relation (modulo $n$) is an equivalence relation on the set $Z$ of integers.
In particular, we observe that if we choose $n = 2$, then $x \equiv y \pmod{2}$ means that $x - y = 2k$ for some integer $k$. This is equivalent to saying that either $x$ and $y$ are both even or both $x$ and $y$ are odd. In other words, any two even integers are equivalent, any two odd integers are equivalent, but an even integer cannot be equivalent to an odd one. The set $Z$ has been divided into two disjoint subsets whose union gives $Z$. One such proper subset is the set of even integers and the other one is the set of odd integers.
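This splitting is easy to check computationally. The following short Python sketch (our illustration, not part of the original text) groups a range of integers into their congruence classes modulo $n$; the helper name is our own.

```python
def congruence_classes(elements, n):
    """Group `elements` into equivalence classes of congruence modulo n."""
    classes = {}
    for x in elements:
        # x ~ y if and only if x - y = kn, i.e., x and y share a remainder mod n.
        classes.setdefault(x % n, []).append(x)
    return list(classes.values())

# Modulo 2 the integers split into exactly two classes: evens and odds.
print(congruence_classes(range(-5, 6), 2))
```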
2.2.4.1 Partitions and Equivalence Relations

The situation described in the last example is quite general. To study equivalence relations in more detail we need to introduce the concepts of partition and equivalence class.

Given a nonempty set $X$, a partition $S$ of $X$ is a collection of nonempty subsets of $X$ such that

1. If $A, B \in S$ and $A \neq B$, then $A \cap B = \emptyset$.
2. $\bigcup_{A \in S} A = X$.
If $\mathcal{R}$ is an equivalence relation on a nonempty set $X$, for each member $x \in X$ the equivalence class associated with $x$, denoted $x/\mathcal{R}$, is given by
$$x/\mathcal{R} = \{z \in X : x\mathcal{R}z\}$$
The set $x/\mathcal{R}$ is a subset of $X$ and, consequently, an element of the power set $P(X)$. Thus, the set
$$X/\mathcal{R} = \{y \in P(X) : y = x/\mathcal{R} \text{ for some } x \in X\}$$
is also a well-defined subset of $P(X)$, called the quotient set of $X$ by $\mathcal{R}$.
The correspondence between the partition of a nonempty set and the equivalence relation determined by it is established in the following propositions.

The quotient set $X/\mathcal{R}$ of a set $X$ by an equivalence relation $\mathcal{R}$ is a partition of the set $X$.

The converse of this statement also holds; that is, each partition of $X$ generates an equivalence relation on $X$. In fact, if $S$ is a partition of a nonempty set $X$, we can define the relation
$$X/S = \{(x, y) \in X \times X : x \in s \text{ and } y \in s \text{ for some } s \in S\}$$
This is an equivalence relation on $X$, and the equivalence classes induced by it are precisely the elements of the partition $S$, i.e.,
$$X/(X/S) = S$$
Intuitively, equivalence relations and partitions are two different ways to describe the same collection of subsets.
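As a small computational check of this correspondence, the Python sketch below (ours, with hypothetical names, not the authors') recovers the quotient set $X/\mathcal{R}$ from an equivalence relation given as a set of ordered pairs.

```python
def quotient_set(X, R):
    """Return the partition X/R induced by an equivalence relation R on X.

    R is a set of ordered pairs (x, y) meaning x R y.
    """
    classes = []
    for x in X:
        cls = frozenset(z for z in X if (x, z) in R)  # the class x/R
        if cls not in classes:
            classes.append(cls)
    return classes

X = {1, 2, 3, 4}
# x R y iff x and y have the same parity (an equivalence relation on X).
R = {(x, y) for x in X for y in X if (x - y) % 2 == 0}
print(quotient_set(X, R))  # two classes: {1, 3} and {2, 4}
```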
2.2.5 Order Relations

Order relations constitute another common type of relation. Once again, we begin by introducing several definitions.

A binary relation $\mathcal{R}$ in $X$ is said to be antisymmetric if for all $x, y \in X$, $x\mathcal{R}y$ and $y\mathcal{R}x$ imply $x = y$.

A binary relation $\mathcal{R}$ in $X$ is asymmetric if for any $x, y \in X$, $x\mathcal{R}y$ implies that $y\mathcal{R}x$ does not hold. In other words, we cannot have $x\mathcal{R}y$ and $y\mathcal{R}x$ both true.

A binary relation $\mathcal{R}$ in $X$ is a partial ordering of $X$ if and only if it is reflexive, antisymmetric, and transitive. The pair $(X, \mathcal{R})$ is called an ordered set.

A binary relation in $X$ is a strict (or total) ordering of $X$ if and only if it is asymmetric and transitive.
For example, consider the set of integers
$$X = \{1, 3, 2\}$$
and the binary relation in $X$ given by
$$\mathcal{R}_1 = \{(x, y) : x, y \in X \text{ and } x \leq y\}$$
This gives explicitly
$$\mathcal{R}_1 = \{(1,1), (2,2), (3,3), (1,2), (1,3), (2,3)\}$$
It is a simple task to check that $\mathcal{R}_1$ is a partial ordering of the set $X$. It requires a little extra thinking to realize that now the least and the greatest elements of $X$ have been identified.

On the same set $X$, the binary relation defined by
$$\mathcal{R}_2 = \{(x, y) : x, y \in X \text{ and } x < y\} = \{(1,2), (1,3), (2,3)\}$$
is an example of a strict ordering of $X$.
It is also possible to establish a correspondence between partial orderings and strict orderings of a set:

If $\mathcal{R}_1$ is a partial ordering of $X$, then the binary relation $\mathcal{R}_2$ defined in $X$ by
$$x\mathcal{R}_2 y \quad \text{if and only if} \quad x\mathcal{R}_1 y \text{ and } x \neq y$$
is a strict ordering of $X$.

Finally, if $\mathcal{R}_2$ is a strict ordering of $X$, then the relation $\mathcal{R}_1$ defined in $X$ by
$$x\mathcal{R}_1 y \quad \text{if and only if} \quad x\mathcal{R}_2 y \text{ or } x = y$$
is a partial ordering of $X$.
Chapter 1.3
Linear Algebra
William C. Brown
Michigan State University, East Lansing, Michigan
3.1 MATRICES
3.1.1 Shapes and Sizes
Throughout this chapter, $F$ will denote a field. The four most commonly used fields in linear algebra are $Q$ = rationals, $R$ = reals, $C$ = complex numbers and $Z_p$ = the integers modulo a prime $p$. We will also let $N = \{1, 2, \ldots\}$, the set of natural numbers.
Definition 1. Let $m, n \in N$. An $m \times n$ matrix $A$ with entries from $F$ is a rectangular array of $m$ rows and $n$ columns of numbers from $F$.

The most common notation used to represent an $m \times n$ (read "m by n") matrix $A$ is displayed in Eq. (1):
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \quad (1)$$
If $A$ is the $m \times n$ matrix displayed in Eq. (1), then the field elements $a_{ij}$ ($i = 1, \ldots, m$; $j = 1, \ldots, n$) are called the entries of $A$. We will also use $[A]_{ij}$ to denote the $(i,j)$th entry of $A$. Thus, $a_{ij} = [A]_{ij}$ is the element of $F$ which lies in the $i$th row and $j$th column of $A$. By the size of $A$, we will mean the expression $m \times n$. Thus, size$(A) = m \times n$ if $A$ has $m$ rows and $n$ columns. Notice that the size of a matrix is a pair of positive integers with a "$\times$" put between them. Negative numbers and zero are not allowed to appear in the size of a matrix.
Definition 2. The set of all $m \times n$ matrices with entries from $F$ will be denoted by $M_{m \times n}(F)$.
Matrices of various shapes are given special names in linear algebra. Here is a brief list of some of the more famous shapes and some pictures to illustrate the definitions.

1. A matrix is square if $m = n$.
$$(a), \quad \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \quad \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}, \ldots \qquad \text{size} = 1 \times 1,\ 2 \times 2,\ 3 \times 3, \ldots \quad (2a)$$

2. An $m \times n$ matrix is called a column vector if $n = 1$.
$$(a), \quad \begin{pmatrix} a \\ b \end{pmatrix}, \ldots, \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} \qquad \text{size} = 1 \times 1,\ 2 \times 1, \ldots, n \times 1 \quad (2b)$$

3. An $m \times n$ matrix is called a row vector if $m = 1$.
$$(a), \quad (a \;\; b), \ldots, (a_1, \ldots, a_n) \qquad \text{size} = 1 \times 1,\ 1 \times 2, \ldots, 1 \times n \quad (2c)$$

4. An $m \times n$ matrix $A$ is upper triangular if $[A]_{ij} = 0$ whenever $i > j$.
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1m} & \cdots & a_{1n} \\ 0 & a_{22} & a_{23} & \cdots & a_{2m} & \cdots & a_{2n} \\ 0 & 0 & a_{33} & \cdots & a_{3m} & \cdots & a_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & a_{mm} & \cdots & a_{mn} \end{pmatrix} \quad \text{if } m \leq n \quad (2d)$$

5. An $m \times n$ matrix $A$ is lower triangular if $[A]_{ij} = 0$ whenever $i < j$.
$$A = \begin{pmatrix} a_{11} & 0 & 0 & \cdots & 0 & 0 & \cdots & 0 \\ a_{21} & a_{22} & 0 & \cdots & 0 & 0 & \cdots & 0 \\ a_{31} & a_{32} & a_{33} & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mm} & 0 & \cdots & 0 \end{pmatrix} \quad \text{if } m \leq n \quad (2e)$$

6. An $m \times n$ matrix $A$ is diagonal if $[A]_{ij} = 0$ whenever $i \neq j$.
$$A = \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix} \quad \text{if } m = n \quad (2f)$$

7. A square matrix is symmetric (skew-symmetric) if $[A]_{ij} = [A]_{ji}$ ($[A]_{ij} = -[A]_{ji}$) for all $i, j = 1, \ldots, n$.
$$(a), \quad \begin{pmatrix} a & b \\ b & c \end{pmatrix}, \quad \begin{pmatrix} a & b & c \\ b & d & e \\ c & e & f \end{pmatrix}, \ldots \quad \text{symmetric; size} = 1 \times 1,\ 2 \times 2,\ 3 \times 3 \quad (2g)$$
$$(0), \quad \begin{pmatrix} 0 & b \\ -b & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & b & c \\ -b & 0 & e \\ -c & -e & 0 \end{pmatrix}, \ldots \quad \text{skew-symmetric} \quad (2h)$$
Definition 3. A submatrix of $A$ is a matrix obtained from $A$ by deleting certain rows and/or columns of $A$. A partition of $A$ is a series of horizontal and vertical lines drawn in $A$ which divide $A$ into various submatrices.

Example 1. Suppose $A = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$. Then
$$(x),\ (y),\ (z),\ (w),\ \begin{pmatrix} x \\ z \end{pmatrix},\ \begin{pmatrix} y \\ w \end{pmatrix},\ (x \;\; y),\ (z \;\; w),\ \begin{pmatrix} x & y \\ z & w \end{pmatrix} \quad (3)$$
is a complete list of the submatrices of $A$. Drawing a vertical line between the two columns, a horizontal line between the two rows, or both produces partitions of $A$; for instance,
$$\left(\begin{array}{c|c} x & y \\ z & w \end{array}\right), \quad \left(\begin{array}{cc} x & y \\ \hline z & w \end{array}\right), \quad \left(\begin{array}{c|c} x & y \\ \hline z & w \end{array}\right) \quad (4)$$
are all partitions of $A$.
The most important partitions of a matrix $A$ are its column and row partitions.

Definition 4. Let
$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \in M_{m \times n}(F)$$

1. For each $j = 1, \ldots, n$, the $m \times 1$ submatrix
$$\mathrm{Col}_j(A) = \begin{pmatrix} a_{1j} \\ \vdots \\ a_{mj} \end{pmatrix}$$
of $A$ is called the $j$th column of $A$.
2. $A = (\mathrm{Col}_1(A) \mid \mathrm{Col}_2(A) \mid \cdots \mid \mathrm{Col}_n(A))$ is called the column partition of $A$.
3. For each $i = 1, \ldots, m$, the $1 \times n$ submatrix $\mathrm{Row}_i(A) = (a_{i1}, \ldots, a_{in})$ of $A$ is called the $i$th row of $A$.
4. $$A = \begin{pmatrix} \mathrm{Row}_1(A) \\ \hline \vdots \\ \hline \mathrm{Row}_m(A) \end{pmatrix}$$
is called the row partition of $A$.

We will cut down on the amount of space required to show a column or row partition by employing the following notation. In Definition 4, let $\gamma_j = \mathrm{Col}_j(A)$ for $j = 1, \ldots, n$ and let $\rho_i = \mathrm{Row}_i(A)$ for $i = 1, \ldots, m$. Then the column partition of $A$ will be written $A = (\gamma_1 \mid \gamma_2 \mid \cdots \mid \gamma_n)$ and the row partition of $A$ will be written $A = (\rho_1; \rho_2; \ldots; \rho_m)$.
3.1.2 Matrix Arithmetic
Definition 5. Two matrices $A$ and $B$ with entries from $F$ are said to be equal if size$(A)$ = size$(B)$ and $[A]_{ij} = [B]_{ij}$ for all $i = 1, \ldots, m$; $j = 1, \ldots, n$. Here $m \times n$ = size$(A)$.

If $A$ and $B$ are equal, we will write $A = B$. Notice that two matrices which are equal have the same size. Thus, the $1 \times 1$ matrix $(0)$ is not equal to the $1 \times 2$ matrix $(0, 0)$. Matrix addition, scalar multiplication, and multiplication of matrices are defined as follows.

Definition 6

1. Let $A, B \in M_{m \times n}(F)$. Then $A + B$ is the $m \times n$ matrix whose $(i,j)$th entry is given by $[A + B]_{ij} = [A]_{ij} + [B]_{ij}$ for all $i = 1, \ldots, m$ and $j = 1, \ldots, n$.
2. If $A \in M_{m \times n}(F)$ and $x \in F$, then $xA$ is the $m \times n$ matrix whose $(i,j)$th entry is given by $[xA]_{ij} = x[A]_{ij}$ for all $i = 1, \ldots, m$ and $j = 1, \ldots, n$.
3. Let $A \in M_{m \times n}(F)$ and $C \in M_{n \times p}(F)$. Then $AC$ is the $m \times p$ matrix whose $(i,j)$th entry is given by
$$[AC]_{ij} = \sum_{k=1}^{n} [A]_{ik}[C]_{kj} \quad \text{for } i = 1, \ldots, m;\ j = 1, \ldots, p$$
Example 2. Let
$$A = \begin{pmatrix} 1 & 0 & 3 \\ 1 & -1 & 2 \end{pmatrix} \qquad B = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 0 & 2 \end{pmatrix} \in M_{2 \times 3}(Q)$$
and let
$$C = \begin{pmatrix} 0 & 2 \\ 1 & -1 \\ 4 & 0 \end{pmatrix} \in M_{3 \times 2}(Q)$$
Then
$$A + B = \begin{pmatrix} 2 & 0 & 4 \\ 2 & -1 & 4 \end{pmatrix} \qquad 6A = \begin{pmatrix} 6 & 0 & 18 \\ 6 & -6 & 12 \end{pmatrix} \qquad AC = \begin{pmatrix} 12 & 2 \\ 7 & 3 \end{pmatrix} \quad (5)$$
Notice that addition is defined only for matrices of the same size. Multiplication is defined only when the number of columns of the first matrix is equal to the number of rows of the second matrix.
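The arithmetic of Example 2 can be reproduced with NumPy; the sketch below is our addition (assuming NumPy is available) and simply checks $A + B$, $6A$, and $AC$.

```python
import numpy as np

A = np.array([[1, 0, 3], [1, -1, 2]])    # 2 x 3
B = np.array([[1, 0, 1], [1, 0, 2]])     # 2 x 3
C = np.array([[0, 2], [1, -1], [4, 0]])  # 3 x 2

print(A + B)   # [[2 0 4], [2 -1 4]]
print(6 * A)   # [[6 0 18], [6 -6 12]]
print(A @ C)   # [[12 2], [7 3]] -- a (2x3)(3x2) product gives a 2x2 matrix
```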
The rules for matrix addition and scalar multiplication are summarized in the following theorem:

Theorem 1. Let $A, B, C \in M_{m \times n}(F)$. Let $x, y \in F$. Then

1. $A + B = B + A$.
2. $(A + B) + C = A + (B + C)$.
3. $A + 0 = A$.
4. $A + (-1)A = 0$.
5. $(xy)A = x(yA)$.
6. $x(A + B) = xA + xB$.
7. $(x + y)A = xA + yA$.
8. $1A = A$.

When no explicit reference is given, a proof of the quoted theorem can be found in Brown [1]. The number zero appearing in 3 and 4 above denotes the $m \times n$ matrix all of whose entries are zero. The eight statements given in Theorem 1 imply $M_{m \times n}(F)$ is a vector space over $F$ (see Definition 16 in Sec. 3.2) when vector addition and scalar multiplication are given as in 1 and 2 of Definition 6.
Theorem 1(2) implies matrix addition is associative. It follows from this statement that expressions of the form $x_1 A_1 + \cdots + x_r A_r$ ($A_i \in M_{m \times n}(F)$ and $x_i \in F$) can be used unambiguously. Any placement of parentheses in this expression will result in the same answer. The sum $x_1 A_1 + \cdots + x_r A_r$ is called a linear combination of $A_1, \ldots, A_r$. The numbers $x_1, \ldots, x_r$ are called the scalars of the linear combination.
Example 3. Let
$$A = \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix} \qquad B = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \qquad C = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}$$
We view $A, B, C \in M_{2 \times 2}(Z_3)$. Then
$$2A + B + 2C = \begin{pmatrix} 0 & 0 \\ 1 & 2 \end{pmatrix} \quad (6)$$
The rules for matrix multiplication are as follows:

Theorem 2. Let $A, D \in M_{m \times n}(F)$, $B \in M_{n \times p}(F)$, $C \in M_{p \times q}(F)$ and $E \in M_{r \times m}(F)$. Let $x \in F$. Then

1. $(AB)C = A(BC)$.
2. $(A + D)B = AB + DB$.
3. $E(A + D) = EA + ED$.
4. $0A = 0$ and $A0 = 0$.
5. $I_m A = A$ and $AI_n = A$.
6. $x(AB) = (xA)B = A(xB)$.

In Theorem 2(4), the zero denotes the zero matrix of various sizes. In Theorem 2(5), $I_n$ denotes the $n \times n$ identity matrix. This is the diagonal matrix given by $[I_n]_{jj} = 1$ for all $j = 1, \ldots, n$. Theorem 2 implies $M_{n \times n}(F)$ is an associative algebra with identity [2, p. 36] over the field $F$.
Consider the following system of $m$ equations in unknowns $x_1, \ldots, x_n$:
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned} \quad (7)$$
In Eq. (7), the $a_{ij}$'s and $b_i$'s are constants in $F$. Set
$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \qquad B = \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix} \qquad X = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \quad (8)$$
Using matrix multiplication, the system of linear equations in Eq. (7) can be written succinctly as
$$AX = B \quad (9)$$
We will let $F^n$ denote the set of all column vectors of size $n$. Thus, $F^n = M_{n \times 1}(F)$. A column vector $\xi \in F^n$ is called a solution to Eq. (9) if $A\xi = B$. The $m \times n$ matrix $A = (a_{ij}) \in M_{m \times n}(F)$ is called the coefficient matrix of Eq. (7). The partitioned matrix $(A \mid B) \in M_{m \times (n+1)}(F)$ is called the augmented matrix of Eq. (7). Matrix multiplication was invented to handle linear substitutions of variables in Eq. (7). Suppose $y_1, \ldots, y_p$ are new variables which are related to $x_1, \ldots, x_n$ by the following set of linear equations:
$$\begin{aligned} x_1 &= c_{11}y_1 + \cdots + c_{1p}y_p \\ &\;\;\vdots \\ x_n &= c_{n1}y_1 + \cdots + c_{np}y_p \end{aligned} \quad (\text{here } c_{uv} \in F \text{ for all } u, v) \quad (10)$$
Set
$$C = \begin{pmatrix} c_{11} & \cdots & c_{1p} \\ \vdots & & \vdots \\ c_{n1} & \cdots & c_{np} \end{pmatrix} \in M_{n \times p}(F)$$
Substituting the expressions in Eq. (10) into Eq. (7) produces $m$ equations in $y_1, \ldots, y_p$. The coefficient matrix of the new system is $AC$, the matrix product of $A$ and $C$.
Definition 7. A square matrix $A \in M_{n \times n}(F)$ is said to be invertible (or nonsingular) if there exists a square matrix $B \in M_{n \times n}(F)$ such that $AB = BA = I_n$.

If $A \in M_{n \times n}(F)$ is invertible and $AB = BA = I_n$ for some $B \in M_{n \times n}(F)$, then $B$ is unique and will be denoted by $A^{-1}$. $A^{-1}$ is called the inverse of $A$.
Example 4. Let
$$A = \begin{pmatrix} x & y \\ z & w \end{pmatrix} \in M_{2 \times 2}(F)$$
and assume $\Delta = xw - yz \neq 0$. Then $A$ is invertible with inverse given by
$$A^{-1} = \begin{pmatrix} w/\Delta & -y/\Delta \\ -z/\Delta & x/\Delta \end{pmatrix} \quad (11)$$

If $m = n$ in Eq. (7) and the coefficient matrix $A$ is invertible, then $AX = B$ has the unique solution $A^{-1}B$.
Definition 8. Let $A \in M_{m \times n}(F)$. The transpose of $A$ is denoted by $A^t$. $A^t$ is the $n \times m$ matrix whose entries are given by $[A^t]_{ij} = [A]_{ji}$ for all $i = 1, \ldots, n$ and $j = 1, \ldots, m$.

A square matrix is symmetric (skew-symmetric) if and only if $A = A^t$ ($A = -A^t$). When the field $F = C$, the complex numbers, the Hermitian conjugate (or conjugate transpose) of $A$ is more useful than the transpose.
Definition 9. Let $A \in M_{m \times n}(C)$. The Hermitian conjugate of $A$ is denoted by $A^*$. $A^*$ is the $n \times m$ matrix whose entries are given by $[A^*]_{ij} = \overline{[A]_{ji}}$ for all $i = 1, \ldots, n$ and $j = 1, \ldots, m$.

In Definition 9, the bar over $[A]_{ji}$ denotes the conjugate of the complex number $[A]_{ji}$. For example,
$$\begin{pmatrix} 1+i & 2 \\ 2+i & i \end{pmatrix}^* = \begin{pmatrix} 1-i & 2-i \\ 2 & -i \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1+i & 2 \\ 2+i & i \end{pmatrix}^t = \begin{pmatrix} 1+i & 2+i \\ 2 & i \end{pmatrix} \quad (12)$$
The following facts about transposes and Hermitian conjugates are easy to prove.

Theorem 3. Let $A, C \in M_{m \times n}(F)$ and $B \in M_{n \times p}(F)$. Then

1. $(A + C)^t = A^t + C^t$.
2. $(AB)^t = B^t A^t$.
3. $(A^t)^t = A$.
4. If $m = n$ and $A$ is invertible, then $A^t$ is also invertible. In this case, $(A^t)^{-1} = (A^{-1})^t$.

If $F = C$, then we also have

5. $(A + C)^* = A^* + C^*$.
6. $(AB)^* = B^* A^*$.
7. $(A^*)^* = A$.
8. If $A$ is invertible, so is $A^*$, and $(A^*)^{-1} = (A^{-1})^*$.
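Items 1–8 are easy to test numerically. In NumPy (a sketch we add here, not from the original text), `A.T` is the transpose and `A.conj().T` the Hermitian conjugate:

```python
import numpy as np

A = np.array([[1 + 1j, 2], [2 + 1j, 1j]])  # the matrix of Eq. (12)
print(A.conj().T)  # Hermitian conjugate A*
print(A.T)         # transpose A^t

B = np.array([[1j, 0], [3, 2]])
# (AB)* = B* A*, as in Theorem 3(6)
assert np.allclose((A @ B).conj().T, B.conj().T @ A.conj().T)
```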
3.1.3 Block Multiplication of Matrices

Theorem 4. Let $A \in M_{m \times n}(F)$ and $B \in M_{n \times p}(F)$. Suppose
$$A = \begin{pmatrix} A_{11} & \cdots & A_{1k} \\ \vdots & & \vdots \\ A_{r1} & \cdots & A_{rk} \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} B_{11} & \cdots & B_{1t} \\ \vdots & & \vdots \\ B_{k1} & \cdots & B_{kt} \end{pmatrix}$$
are partitions of $A$ and $B$ such that size$(A_{ij}) = m_i \times n_j$ and size$(B_{jl}) = n_j \times p_l$. Thus, $m_1 + \cdots + m_r = m$, $n_1 + \cdots + n_k = n$, and $p_1 + \cdots + p_t = p$. For each $i = 1, \ldots, r$ and $j = 1, \ldots, t$, set
$$C_{ij} = \sum_{q=1}^{k} A_{iq}B_{qj} \quad \text{(multiplication of blocks)}$$
Then
$$AB = \begin{pmatrix} C_{11} & \cdots & C_{1t} \\ \vdots & & \vdots \\ C_{r1} & \cdots & C_{rt} \end{pmatrix}$$

Notice that the only hypothesis in Theorem 4 is that every vertical line drawn in $A$ must be matched with the corresponding horizontal line in $B$. There are four special cases of Theorem 4 which are very useful. We collect these in the next theorem.
Theorem 5. Let $A \in M_{m \times n}(F)$.

1. If $\xi = (x_1, \ldots, x_n)^t \in F^n$, then $A\xi = \sum_{i=1}^{n} x_i \mathrm{Col}_i(A)$.
2. If $B = (\beta_1 \mid \cdots \mid \beta_p) \in M_{n \times p}(F)$, then $AB = (A\beta_1 \mid \cdots \mid A\beta_p)$.
3. If $\eta = (y_1, \ldots, y_m) \in M_{1 \times m}(F)$, then $\eta A = \sum_{i=1}^{m} y_i \mathrm{Row}_i(A)$.
4. If $C = (\delta_1; \ldots; \delta_r) \in M_{r \times m}(F)$, then $CA = (\delta_1 A; \ldots; \delta_r A)$.
Definition 10. Let $A \in M_{m \times n}(F)$.

1. $\mathrm{CS}(A) = \{A\xi : \xi \in F^n\}$ is called the column space of $A$.
2. $\mathrm{RS}(A) = \{\eta A : \eta \in M_{1 \times m}(F)\}$ is called the row space of $A$.

Theorem 5 implies that the column space of $A$ consists of all linear combinations of the columns of $A$. $\mathrm{RS}(A)$ is all linear combinations of the rows of $A$. Using all four parts of Theorem 5, we have
$$\mathrm{CS}(AB) \subseteq \mathrm{CS}(A) \qquad \mathrm{RS}(AB) \subseteq \mathrm{RS}(B) \quad (13)$$
The column space of $A$ is particularly important for the theory of linear equations. Suppose $A \in M_{m \times n}(F)$ and $B \in F^m$. Theorem 5 implies that $AX = B$ has a solution if and only if $B \in \mathrm{CS}(A)$.
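This membership test reduces to one rank computation: by Theorem 20 below, $B \in \mathrm{CS}(A)$ exactly when appending $B$ to $A$ does not raise the rank. A NumPy sketch (ours, not the authors'):

```python
import numpy as np

def has_solution(A, B):
    """AX = B is solvable iff B lies in CS(A), i.e., rk(A | B) = rk(A)."""
    aug = np.column_stack([A, B])
    return np.linalg.matrix_rank(aug) == np.linalg.matrix_rank(A)

A = np.array([[1.0, 2.0], [2.0, 4.0]])        # rank 1; CS(A) spanned by (1, 2)^t
print(has_solution(A, np.array([1.0, 2.0])))  # True
print(has_solution(A, np.array([1.0, 0.0])))  # False: not a multiple of (1, 2)^t
```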
3.1.4 Gaussian Elimination
The three elementary row operations that can be performed on a given matrix $A$ are as follows:

($\alpha$) Interchange two rows of $A$.
($\beta$) Add a scalar times one row of $A$ to another row of $A$.
($\gamma$) Multiply a row of $A$ by a nonzero scalar. (14)

There are three corresponding elementary column operations which can be performed on $A$ as well.
Definition 11. Let $A_1, A_2 \in M_{m \times n}(F)$. $A_1$ and $A_2$ are said to be row (column) equivalent if $A_2$ can be obtained from $A_1$ by applying finitely many elementary row (column) operations to $A_1$.

If $A_1$ and $A_2$ are row (column) equivalent, we will write $A_1 \sim_r A_2$ ($A_1 \sim_c A_2$). Either one of these relations is an equivalence relation on $M_{m \times n}(F)$. By this, we mean
$$\begin{aligned} & A_1 \sim_r A_1 && (\sim_r \text{ is reflexive}) \\ & A_1 \sim_r A_2 \Rightarrow A_2 \sim_r A_1 && (\sim_r \text{ is symmetric}) \\ & A_1 \sim_r A_2,\ A_2 \sim_r A_3 \Rightarrow A_1 \sim_r A_3 && (\sim_r \text{ is transitive}) \end{aligned} \quad (15)$$
Theorem 6. Let $A, C \in M_{m \times n}(F)$ and $B, D \in F^m$. Suppose $(A \mid B) \sim_r (C \mid D)$. Then the two linear systems of equations $AX = B$ and $CX = D$ have precisely the same solutions.

Gaussian elimination is a strategy for solving a system of linear equations. To find all solutions to the linear system of equations
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned} \quad (16)$$
carry out the following three steps.

1. Set up the augmented matrix of Eq. (16):
$$(A \mid B) = \left(\begin{array}{ccc|c} a_{11} & \cdots & a_{1n} & b_1 \\ \vdots & & \vdots & \vdots \\ a_{m1} & \cdots & a_{mn} & b_m \end{array}\right)$$
2. Apply elementary row operations to $(A \mid B)$ to obtain a matrix $(C \mid D)$ in upper triangular form.
3. Solve $CX = D$ by back substitution.

By Theorem 6, this algorithm yields a complete set of solutions to $AX = B$.
Example 5. Solve
$$\begin{aligned} 2x + 3y + 4z &= 10 \\ x - y - z &= 2 \\ x + z &= 3 \end{aligned} \qquad (F = Q) \quad (*)$$

Following steps 1–3, we have:

1. $$\left(\begin{array}{ccc|c} 2 & 3 & 4 & 10 \\ 1 & -1 & -1 & 2 \\ 1 & 0 & 1 & 3 \end{array}\right)$$
is the augmented matrix of $(*)$.

2. $$\left(\begin{array}{ccc|c} 2 & 3 & 4 & 10 \\ 1 & -1 & -1 & 2 \\ 1 & 0 & 1 & 3 \end{array}\right) \xrightarrow{(\alpha)} \left(\begin{array}{ccc|c} 1 & 0 & 1 & 3 \\ 1 & -1 & -1 & 2 \\ 2 & 3 & 4 & 10 \end{array}\right) \xrightarrow{(\beta)} \left(\begin{array}{ccc|c} 1 & 0 & 1 & 3 \\ 0 & -1 & -2 & -1 \\ 0 & 3 & 2 & 4 \end{array}\right)$$
$$\xrightarrow{(\gamma)} \left(\begin{array}{ccc|c} 1 & 0 & 1 & 3 \\ 0 & 1 & 2 & 1 \\ 0 & 3 & 2 & 4 \end{array}\right) \xrightarrow{(\beta)} \left(\begin{array}{ccc|c} 1 & 0 & 1 & 3 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & -4 & 1 \end{array}\right) \quad \text{(upper triangular)}$$
The letters on the arrows indicate which type of elementary row operation from the list (14) was used on the matrix on the left to get the matrix on the right.

3. Solve
$$\begin{aligned} x + z &= 3 \\ y + 2z &= 1 \\ -4z &= 1 \end{aligned} \implies \begin{aligned} x &= 13/4 \\ y &= 3/2 \\ z &= -1/4 \end{aligned}$$

Thus, $x = 13/4$, $y = 3/2$ and $z = -1/4$ is the (unique) solution to $(*)$.
Definition 12. Let $A \in M_{m \times n}(F)$. A system of equations of the form $AX = 0$ is called a homogeneous system of equations. A nonzero column vector $\xi \in F^n$ is called a nontrivial solution of $AX = 0$ if $A\xi = 0$.

Using Gaussian elimination, we can prove the following theorem.

Theorem 7. Let $A \in M_{m \times n}(F)$.

1. The homogeneous system of equations $AX = 0$ has a nontrivial solution if $m < n$.
2. Suppose $m = n$. The homogeneous system of equations $AX = 0$ has only $X = 0$ as a solution if and only if $A \sim_r I_n$.
3.1.5 Elementary Matrices and Inverses
Definition 13. An elementary matrix (of size $m \times m$) is a matrix obtained from $I_m$ by performing a single elementary row operation on $I_m$.

Pictures of the three types of elementary matrices are as follows:

1. $E_{ij}$ will denote the matrix obtained from $I_m$ by interchanging rows $i$ and $j$. These matrices are called transpositions. $E_{ij}$ agrees with $I_m$ except in rows $i$ and $j$:
$$E_{ij} = \begin{pmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & 0 & \cdots & 1 & \\ & & \vdots & \ddots & \vdots & \\ & & 1 & \cdots & 0 & \\ & & & & & \ddots \end{pmatrix} \begin{matrix} \\ \\ \leftarrow i \\ \\ \leftarrow j \\ \\ \end{matrix} \quad (17a)$$

2. $E_{ij}(c)$ will denote the matrix obtained from $I_m$ by adding $c$ times row $j$ of $I_m$ to row $i$ of $I_m$. Thus, $E_{ij}(c)$ is the identity matrix with the single additional entry $c$ in position $(i, j)$:
$$E_{ij}(c) = \begin{pmatrix} 1 & & & & \\ & \ddots & & c & \\ & & \ddots & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix} \begin{matrix} \\ \leftarrow i \\ \\ \\ \\ \end{matrix} \quad (\text{here } i < j) \quad (17b)$$

3. $E_i(c)$ will denote the elementary matrix obtained from $I_m$ by multiplying the $i$th row of $I_m$ by $c \neq 0$:
$$E_i(c) = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & c & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix} \begin{matrix} \\ \\ \leftarrow i \\ \\ \\ \end{matrix} \quad (17c)$$

Each elementary matrix is invertible. Multiplying on the left by elementary matrices performs row operations on $A$.
Theorem 8

1. $E_{ij}^{-1} = E_{ij}$.
2. $E_{ij}(c)^{-1} = E_{ij}(-c)$.
3. $E_i(c)^{-1} = E_i(1/c)$.
4. For any $A \in M_{m \times n}(F)$,
   a. $E_{ij}A$ is the $m \times n$ matrix obtained from $A$ by interchanging rows $i$ and $j$ of $A$.
   b. $E_{ij}(c)A$ is the $m \times n$ matrix obtained from $A$ by adding $c$ times row $j$ of $A$ to row $i$ of $A$.
   c. $E_i(c)A$ is the $m \times n$ matrix obtained from $A$ by multiplying row $i$ of $A$ by $c$.

Thus, two $m \times n$ matrices $A$ and $B$ are row equivalent if and only if there exist a finite number of elementary matrices $E_1, \ldots, E_k$ such that $E_k E_{k-1} \cdots E_2 E_1 A = B$.
Example 6. In Example 5, we showed that
$$(A \mid B) = \left(\begin{array}{ccc|c} 2 & 3 & 4 & 10 \\ 1 & -1 & -1 & 2 \\ 1 & 0 & 1 & 3 \end{array}\right) \sim_r \left(\begin{array}{ccc|c} 1 & 0 & 1 & 3 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & -4 & 1 \end{array}\right) = (C \mid D)$$
The sequence of elementary matrices used there is as follows:
$$E_{32}(-3)\,E_2(-1)\,E_{31}(-2)\,E_{21}(-1)\,E_{13}\,(A \mid B) = (C \mid D)$$

We can also multiply a given matrix $A$ on the right by elementary matrices. Multiplying on the right performs elementary column operations on $A$.
Theorem 9. Let $A \in M_{m \times n}(F)$ be a nonzero matrix. Then there exist elementary matrices $E_1, \ldots, E_k$ and $E_1', \ldots, E_\ell'$ such that
$$E_k \cdots E_2 E_1 A E_1' E_2' \cdots E_\ell' = \begin{pmatrix} I_t & 0 \\ 0 & 0 \end{pmatrix} \quad (18)$$

The positive integer $t$ appearing in Theorem 9 is called the rank of $A$.

Elementary matrices can be used to characterize invertible matrices.
Theorem 10. Let $A \in M_{n \times n}(F)$. Then the following conditions are equivalent:

1. $A$ is invertible.
2. $A$ has a left inverse, i.e., $BA = I_n$ for some $B \in M_{n \times n}(F)$.
3. $A$ has a right inverse, i.e., $AC = I_n$ for some $C \in M_{n \times n}(F)$.
4. The homogeneous system of equations $AX = 0$ has no nontrivial solution.
5. $A$ is a finite product of elementary matrices.

The proof of Theorem 10 incorporates an algorithm for computing $A^{-1}$, which is effective if $n$ is small and the entries of $A$ are reasonable. Suppose $A$ is invertible.

1. Form the $n \times 2n$ partitioned matrix $(A \mid I_n)$.
2. Apply row operations to $(A \mid I_n)$ so that $A$ on the left in $(A \mid I_n)$ is changed to $I_n$. Then $I_n$ on the right will change to $A^{-1}$.

In symbols,
$$(A \mid I_n) \sim_r (I_n \mid A^{-1}) \quad (19)$$
See Brown [1; Ex. 5.15, Chap. I] for a concrete example.
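A direct transcription of steps 1–2 into Python follows. This Gauss–Jordan sketch is our addition (with partial pivoting added for numerical safety; the function name is our own) and reduces $(A \mid I_n)$ to $(I_n \mid A^{-1})$.

```python
import numpy as np

def inverse_by_row_reduction(A):
    """Row reduce (A | I) to (I | A^-1); raises if A is singular."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])  # the n x 2n matrix (A | I)
    for i in range(n):
        p = i + np.argmax(np.abs(M[i:, i]))      # pivot row (partial pivoting)
        if abs(M[p, i]) < 1e-12:
            raise ValueError("matrix is singular")
        M[[i, p]] = M[[p, i]]                    # type (alpha): swap rows
        M[i] /= M[i, i]                          # type (gamma): scale pivot row to 1
        for j in range(n):
            if j != i:
                M[j] -= M[j, i] * M[i]           # type (beta): clear column i
    return M[:, n:]

A = np.array([[2.0, 3.0], [1.0, 4.0]])
print(inverse_by_row_reduction(A) @ A)  # the identity, up to roundoff
```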
One nice application of this material involves the Vandermonde matrix:
$$V = \begin{pmatrix} 1 & a_0 & a_0^2 & \cdots & a_0^n \\ 1 & a_1 & a_1^2 & \cdots & a_1^n \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & a_n & a_n^2 & \cdots & a_n^n \end{pmatrix} \in M_{(n+1) \times (n+1)}(R) \quad (20)$$
This matrix is invertible if $a_0, a_1, \ldots, a_n$ are distinct real numbers. $V$ is used to prove the following interpolation theorem:

Theorem 11. Suppose $a_0, \ldots, a_n$ are $n + 1$ distinct real numbers. Let $b_0, \ldots, b_n \in R$. There exists a polynomial $p(t)$ (with real coefficients) such that the degree of $p(t)$ is at most $n$ and $p(a_i) = b_i$ for all $i = 0, 1, \ldots, n$.
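The interpolating polynomial can be computed by solving $Vc = b$ for the coefficient vector $c$ of $p(t)$. A NumPy sketch (ours; the node and value data are made up for illustration):

```python
import numpy as np

a = np.array([0.0, 1.0, 2.0, 3.0])   # distinct nodes a_0, ..., a_n
b = np.array([1.0, 2.0, 5.0, 10.0])  # prescribed values b_0, ..., b_n

V = np.vander(a, increasing=True)    # rows (1, a_i, a_i^2, a_i^3), as in Eq. (20)
c = np.linalg.solve(V, b)            # coefficients of p(t), lowest degree first
print(c)                             # here p(t) = 1 + t^2

assert np.allclose(np.polyval(c[::-1], a), b)  # p(a_i) = b_i for all i
```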
3.1.6 LU-Factorizations

LU-factorizations are refinements of Gaussian elimination.

Definition 14. Let $A \in M_{n \times n}(F)$. $A$ has an LU-factorization if there exist a unit, lower triangular matrix
$$L = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ \ell_{21} & 1 & & \vdots \\ \vdots & \vdots & \ddots & \vdots \\ \ell_{n1} & \ell_{n2} & \cdots & 1 \end{pmatrix} \in M_{n \times n}(F)$$
and an upper triangular matrix
$$U = \begin{pmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ 0 & u_{22} & \cdots & u_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & u_{nn} \end{pmatrix} \in M_{n \times n}(F)$$
such that $A = LU$.

A given matrix $A \in M_{n \times n}(F)$ may not have an LU-factorization. The deciding factor is whether any transpositions are needed to row reduce $A$ to an upper triangular matrix.
Definition 15. A square matrix of the form
$$L = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & 1 & & \\ & & c_{i+1} & 1 & \\ & & \vdots & & \ddots \\ & & c_n & & & 1 \end{pmatrix} \in M_{n \times n}(F)$$
is called a Frobenius matrix ($L$ has ones on its diagonal, constants $c_{i+1}, \ldots, c_n$ in positions $(i+1, i), \ldots, (n, i)$, and zeros elsewhere).

Frobenius matrices are invertible. The inverse of $L$ is obtained from $L$ by changing the signs of $c_{i+1}, \ldots, c_n$. If $R_i$ denotes the $i$th row of $A$ and $L$ is the Frobenius matrix pictured above, then
$$LA = \begin{pmatrix} R_1 \\ \vdots \\ R_i \\ R_{i+1} + c_{i+1}R_i \\ \vdots \\ R_n + c_n R_i \end{pmatrix} \quad (21)$$
Thus, multiplying $A$ by $L$ on the left performs elementary row operations of type ($\beta$) on $A$ below row $i$. Hence, the process which row reduces $A$ to an upper triangular matrix can be stated as follows:
Theorem 12. Let $A \in M_{n \times n}(F)$. There exist a finite number of Frobenius matrices $L_1, \ldots, L_k$ and a finite number of transpositions $E_1, \ldots, E_k$ such that $L_k E_k \cdots L_1 E_1 A = U$, an upper triangular matrix.

Example 7. Suppose
$$A = \begin{pmatrix} 0 & 1 & 2 \\ 1 & 2 & 1 \\ 2 & 0 & 1 \end{pmatrix} \in M_{3 \times 3}(Q)$$
Then
$$\underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 4 & 1 \end{pmatrix}}_{L_2} \underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}}_{E_2} \underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{pmatrix}}_{L_1} \underbrace{\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}}_{E_1} \underbrace{\begin{pmatrix} 0 & 1 & 2 \\ 1 & 2 & 1 \\ 2 & 0 & 1 \end{pmatrix}}_{A} = \underbrace{\begin{pmatrix} 1 & 2 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 7 \end{pmatrix}}_{U} \quad (22)$$
If no transpositions are required in Theorem 12, i.e., $E_i = I_n$ for all $i = 1, \ldots, k$, then $L_k L_{k-1} \cdots L_1 A = U$. Consequently, $A = L_1^{-1} L_2^{-1} \cdots L_k^{-1} U$. It is easy to see that $L = L_1^{-1} L_2^{-1} \cdots L_k^{-1}$ is a unit, lower triangular matrix and hence $A$ has an LU-factorization. This proves part of our next theorem.
Theorem 13. Let $A \in M_{n \times n}(F)$.

1. $A$ has an LU-factorization if and only if no row interchanges are required in the Gaussian reduction of $A$ to upper triangular form.
2. Suppose $A$ has an LU-factorization. If $A$ is invertible, then the factorization is unique.
3. For any $A \in M_{n \times n}(F)$, there exists a permutation matrix $P$, i.e., a finite product of transpositions, such that $PA$ has an LU-factorization.
There are several important applications of LU-factorizations. We will give one application now and another in the section on determinants. LU-factorizations are used to solve systems of equations. To solve $AX = B$:

1. Find an LU-factorization $PA = LU$.
2. Replace $AX = B$ with $PAX = PB$, i.e., $LUX = PB$.
3. Solve $LY = PB$ by forward substitution.
4. Solve $UX = Y$ by back substitution. (23)

Thus, replacing $A$ by an LU-factorization converts $AX = B$ into two simpler problems, $LY = PB$ and $UX = Y$. These last two systems are lower triangular and upper triangular, respectively, and are usually easier to solve than the original.
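SciPy packages exactly this pipeline. The sketch below is our addition (assuming SciPy is available): it factors $PA = LU$ once and then runs the forward and back substitutions.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[0.0, 1.0, 2.0],
              [1.0, 2.0, 1.0],
              [2.0, 0.0, 1.0]])  # the matrix of Example 7
B = np.array([1.0, 2.0, 3.0])

lu, piv = lu_factor(A)        # steps 1-2: one factorization with row interchanges
X = lu_solve((lu, piv), B)    # steps 3-4: forward and back substitution
print(np.allclose(A @ X, B))  # True
```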
3.2 VECTOR SPACES
3.2.1 Definitions and Examples

Definition 16. A vector space over $F$ is a nonempty set $V$ together with two functions $(\alpha, \beta) \mapsto \alpha + \beta$ from $V \times V \to V$ and $(x, \alpha) \mapsto x\alpha$ from $F \times V \to V$ which satisfy the following conditions:

V1. $\alpha + \beta = \beta + \alpha$ for all $\alpha, \beta \in V$.
V2. $(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma)$ for all $\alpha, \beta, \gamma \in V$.
V3. There exists an element $0 \in V$ such that $\alpha + 0 = \alpha$ for all $\alpha \in V$.
V4. For any $\alpha \in V$, there exists a $\beta \in V$ such that $\alpha + \beta = 0$.
V5. $x(y\alpha) = (xy)\alpha$ for all $\alpha \in V$ and $x, y \in F$.
V6. $x(\alpha + \beta) = x\alpha + x\beta$ for all $\alpha, \beta \in V$ and $x \in F$.
V7. $(x + y)\alpha = x\alpha + y\alpha$ for all $\alpha \in V$ and $x, y \in F$.
V8. $1\alpha = \alpha$ for all $\alpha \in V$.

Suppose $(V, (\alpha, \beta) \mapsto \alpha + \beta, (x, \alpha) \mapsto x\alpha)$ is a vector space over $F$. The elements in $V$ are called vectors and will usually be denoted by Greek letters. The elements in $F$ are called scalars and will be represented by lowercase English letters. The function $(\alpha, \beta) \mapsto \alpha + \beta$ is called vector addition. The function $(x, \alpha) \mapsto x\alpha$ is called scalar multiplication. Notice that a vector space is actually an ordered triple consisting of a set of vectors, the function vector addition and the function scalar multiplication. It is possible that a given set $V$ can be made into a vector space over $F$ in many different ways by specifying different vector additions or scalar multiplications on $V$. Thus, when defining a vector space, all three pieces of information (the vectors, vector addition and scalar multiplication) must be given.

Suppose $(V, (\alpha, \beta) \mapsto \alpha + \beta, (x, \alpha) \mapsto x\alpha)$ is a vector space over $F$. When the two functions are understood from the context or when it is not important to know the exact forms of these functions, we will drop them from our notation and simply call the vector space $V$. Axioms V1 and V2 say that vector addition is commutative and associative. There is only one vector $0 \in V$ which satisfies V3. This vector is called the zero vector of $V$. If $\alpha \in V$, there is only one vector $\beta \in V$ such that $\alpha + \beta = 0$. The vector $\beta$ is called the inverse of $\alpha$ and written $-\alpha$. The following facts about addition and scalar multiplication are true in any vector space.
Theorem 14. Let $V$ be a vector space over $F$. Then

1. Any parentheses placed in $\alpha_1 + \cdots + \alpha_n$ result in the same vector.
2. $x0 = 0$ for all $x \in F$.
3. $0\alpha = 0$ for all $\alpha \in V$.
4. $(-1)\alpha = -\alpha$ for all $\alpha \in V$.
5. If $x\alpha = 0$, then $x = 0$ or $\alpha = 0$.
The reader will notice that we use 0 to represent the zero vector in $V$ as well as the zero scalar in $F$. This will cause no real confusion in what follows. Theorem 14(1) implies linear combinations of vectors in $V$, i.e., sums of the form $x_1\alpha_1 + \cdots + x_n\alpha_n$, can be written unambiguously with no parentheses.

The notation for various vector spaces students encounter when studying linear algebra is becoming standard throughout most modern textbooks. Here is a short list of some of the more important vector spaces. If the reader is in doubt as to what addition or scalar multiplication is in the given example, consult Brown [1, 2].
1. $F^S$ = all functions from a set $S$ to the field $F$.
2. $M_{m \times n}(F)$ = the set of all $m \times n$ matrices with entries from $F$.
3. $F[X]$ = the set of all polynomials in $X$ with coefficients from $F$.
4. $\mathrm{CS}(A)$, $\mathrm{RS}(A)$, and $\mathrm{NS}(A) = \{\xi \in F^n : A\xi = 0\}$ for any $A \in M_{m \times n}(F)$. ($\mathrm{NS}(A)$ is called the null space of $A$.)
5. $C^k(I) = \{f \in R^I : f$ is $k$ times differentiable on $I\}$. ($I$ here is usually some open or closed set contained in $R$.)
6. $\mathcal{R}[a, b] = \{f \in R^{[a,b]} : f$ is Riemann integrable on $[a, b]\}$. (24)
Definition 17. Let $W$ be a nonempty subset of a vector space $V$. $W$ is a subspace of $V$ if $\alpha + \beta \in W$ and $x\alpha \in W$ for all $\alpha, \beta \in W$ and $x \in F$.

Thus, a subset is a subspace if it is closed under vector addition and scalar multiplication. $R[X]$, $C^k[a, b]$ and $\mathcal{R}[a, b]$ are all subspaces of $R^{[a,b]}$. If $A \in M_{m \times n}(F)$, then $\mathrm{NS}(A)$ is a subspace of $F^n$, $\mathrm{CS}(A)$ is a subspace of $F^m$ and $\mathrm{RS}(A)$ is a subspace of $M_{1 \times n}(F)$. One of the most important sources of subspaces is the linear span.
Definition 18. Let $S$ be a subset of a vector space $V$. The set of all linear combinations of vectors from $S$ is called the linear span of $S$. We will let $L(S)$ denote the linear span of $S$.

If $S = \emptyset$, i.e., $S$ is empty, then we set $L(S) = (0)$. Notice, $\alpha \in L(S)$ if $\alpha = x_1\alpha_1 + \cdots + x_n\alpha_n$ for some $\alpha_1, \ldots, \alpha_n \in S$ and $x_1, \ldots, x_n \in F$. If $S$ is finite, say $S = \{\alpha_1, \ldots, \alpha_r\}$, then we often write $L(\alpha_1, \ldots, \alpha_r)$ for $L(S)$. For example, if $A = (\gamma_1 \mid \cdots \mid \gamma_n) \in M_{m \times n}(F)$, then $L(\gamma_1, \ldots, \gamma_n) = \mathrm{CS}(A)$.
Theorem 15. Let $V$ be a vector space over $F$.

1. For any subset $S \subseteq V$, $L(S)$ is a subspace of $V$.
2. If $S_1 \subseteq S_2 \subseteq V$, then $L(S_1) \subseteq L(S_2) \subseteq V$.
3. If $\alpha \in L(S)$, then $\alpha \in L(S_1)$ for some finite subset $S_1 \subseteq S$.
4. $L(L(S)) = L(S)$.
5. Exchange principle: If $\alpha \in L(S \cup \{\beta\})$ and $\alpha \notin L(S)$, then $\beta \in L(S \cup \{\alpha\})$.

The exchange principle is used to argue that any two bases of $V$ have the same cardinality.
Definition 19. A vector space $V$ is finite dimensional if $V = L(S)$ for some finite subset $S$ of $V$.

If $V$ is not finite dimensional, we say $V$ is infinite dimensional. $M_{m \times n}(F)$, $\mathrm{CS}(A)$, $\mathrm{RS}(A)$ and $\mathrm{NS}(A)$ are all examples of finite-dimensional vector spaces over $F$. $R[X]$, $C^k(0, 1)$ and $\mathcal{R}[0, 1]$ are all infinite-dimensional vector spaces over $R$.
3.2.2 Bases
Definition 20. Let $S$ be a subset of a vector space $V$.

1. The set $S$ is linearly dependent (over $F$) if there exist distinct vectors $\alpha_1, \ldots, \alpha_n \in S$ and nonzero scalars $x_1, \ldots, x_n \in F$ such that $x_1\alpha_1 + \cdots + x_n\alpha_n = 0$.
2. The set $S$ is linearly independent (over $F$) if $S$ is not linearly dependent.
3. $S$ is a basis of $V$ if $S$ is linearly independent and $L(S) = V$.
Suppose $S$ is a basis of $V$. Then every vector in $V$ is a linear combination of vectors from $S$. To be more precise, if $\alpha \in V$ and $\alpha \neq 0$, then there exist $\alpha_1, \ldots, \alpha_n \in S$ (all distinct) and nonzero scalars $x_1, \ldots, x_n \in F$ such that $\alpha = x_1\alpha_1 + \cdots + x_n\alpha_n$. Furthermore, this representation is unique. By this, we mean if $\alpha = y_1\beta_1 + \cdots + y_t\beta_t$ with $y_1, \ldots, y_t$ nonzero scalars and $\beta_1, \ldots, \beta_t$ distinct vectors in $S$, then $n = t$ and, after a suitable permutation of the symbols, $\alpha_1 = \beta_1, \ldots, \alpha_n = \beta_n$ and $x_1 = y_1, \ldots, x_n = y_n$.
The four basic theorems about bases are listed in our next theorem.

Theorem 16

1. Every vector space has a basis.
2. If $S$ is a linearly independent subset of $V$, then $S \subseteq B$ for some basis $B$ of $V$.
3. If $L(S) = V$, then $S$ contains a basis of $V$.
4. Any two bases of $V$ have the same cardinality.

A proof of Theorem 16 can be found in Brown [2]. A much simpler proof of Theorem 16 when $V$ is finite dimensional can be found in Brown [1]. The common cardinality of the bases of $V$ will be denoted by $\dim(V)$ and called the dimension of $V$. $\dim(V)$ is a finite cardinal number, i.e., $0, 1, 2, \ldots$, if and only if $V$ is finite dimensional. If $V$ is infinite dimensional, then $\dim(V)$ is an infinite cardinal number. In this case, we will simply write $\dim(V) = \infty$. In our next example, we list the dimensions of some of the more important finite-dimensional vector spaces.
Example 8

1. $\dim(M_{m \times n}(F)) = mn$. A basis of $M_{m \times n}(F)$ is given by the matrix units $B = \{A_{ij} : i = 1, \ldots, m;\ j = 1, \ldots, n\}$ of $M_{m \times n}(F)$. Here $A_{ij}$ is the $m \times n$ matrix having a 1 in its $(i,j)$th entry and 0 elsewhere.
2. $\dim(F^n) = n$. A basis of $F^n$ is given by $B = \{\varepsilon_1, \ldots, \varepsilon_n\}$ where $I_n = (\varepsilon_1 \mid \cdots \mid \varepsilon_n)$. $B$ is usually called the canonical basis of $F^n$.
3. Let $\mathcal{P}_n(R) = \{p(X) \in R[X] : \text{degree}(p) \leq n\}$. Then $\dim(\mathcal{P}_n(R)) = n + 1$. $B = \{1, X, \ldots, X^n\}$ is a basis of $\mathcal{P}_n(R)$.
4. Let $A \in M_{m \times n}(F)$. The dimension of $\mathrm{CS}(A)$ is called the rank of $A$ and written $\mathrm{rk}(A)$. The dimension of $\mathrm{NS}(A)$ is called the nullity of $A$ and written $\nu(A)$. It follows from Theorem 17 that $0 \leq \mathrm{rk}(A) \leq m$ and $0 \leq \nu(A) \leq n$.
Theorem 17. Suppose $W$ is a subspace of $V$ and $\dim(V) < \infty$. Then $\dim(W) \leq \dim(V)$, with equality if and only if $W = V$.

There are many standard theorems about $\mathrm{rk}(A)$, i.e., $\dim(\mathrm{CS}(A))$. Here are some of the more important ones.

Theorem 18. Let $A \in M_{m \times n}(F)$ and $B \in M_{n \times p}(F)$.

1. $\dim(\mathrm{CS}(A)) = \dim(\mathrm{RS}(A))$.
2. $0 \leq \mathrm{rk}(A) \leq \min\{m, n\}$.
3. $\mathrm{rk}(A) = \mathrm{rk}(A^t) = \mathrm{rk}(A^*)$ if $F = C$.
4. $\mathrm{rk}(A) = \mathrm{rk}(PAQ)$ for any invertible matrices $P$, $Q$.
5. $\mathrm{rk}(AB) \leq \min\{\mathrm{rk}(A), \mathrm{rk}(B)\}$.
6. $\mathrm{rk}(A) + \nu(A) = n$.

Theorem 18(1) implies that the rank of $A$ can be computed from the row space of $A$ as well as the column space of $A$. Theorem 18(4) implies that the integer $t$ appearing in Theorem 9 is the rank of $A$. Theorem 18(6) is called Sylvester's law of nullity. We also have:
Theorem 19. Let $A \in M_{n \times n}(F)$. Then the following statements are equivalent:

1. $A$ is invertible.
2. $\mathrm{rk}(A) = n$.
3. The columns of $A$ are linearly independent.
4. The rows of $A$ are linearly independent.
5. $\mathrm{CS}(A) = F^n$.
6. $\mathrm{RS}(A) = M_{1 \times n}(F)$.
7. $\nu(A) = 0$.

For systems of linear equations, we have:

Theorem 20. Let $A \in M_{m \times n}(F)$ and $B \in F^m$. The linear system of equations $AX = B$ has a solution if and only if $\mathrm{rk}(A \mid B) = \mathrm{rk}(A)$.
3.2.3 Coordinate Maps and Change-of-Basis Matrices

One of the most important applications of bases is to convert abstract problems into concrete matrix problems which machines can then solve. To see how this is done, suppose $V$ is a finite-dimensional vector space over $F$. Suppose $\dim(V) = n$. Choose a basis $\{\alpha_1, \ldots, \alpha_n\}$ of $V$ and form the $n$-tuple $B = (\alpha_1, \ldots, \alpha_n)$. Then $B$ determines a function $[\cdot]_B : V \to F^n$ which is defined as follows:

Definition 21
$$[\alpha]_B = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \quad \text{if } \alpha = x_1\alpha_1 + \cdots + x_n\alpha_n \text{ in } V$$

The definition makes sense because the entries in $B$ form a basis of $V$. Every vector $\alpha \in V$ can be written in one and only one way as a linear combination of $\alpha_1, \ldots, \alpha_n$. The function $[\cdot]_B$ is called a coordinate map on $V$. If we permute the entries of $B$ or choose a new basis altogether, we get a different coordinate map. Every coordinate map satisfies the following conditions:
Theorem 21. Let $\{\alpha_1, \ldots, \alpha_n\}$ be a basis of $V$ and set $B = (\alpha_1, \ldots, \alpha_n)$. Then,

1. $[x\alpha + y\beta]_B = x[\alpha]_B + y[\beta]_B$ for all $\alpha, \beta \in V$ and $x, y \in F$.
2. $[\cdot]_B : V \to F^n$ is one-to-one and onto.
3. $\beta_1, \ldots, \beta_t$ are linearly independent in $V$ if and only if $[\beta_1]_B, \ldots, [\beta_t]_B$ are linearly independent in $F^n$.
4. $\alpha \in L(\beta_1, \ldots, \beta_t)$ in $V$ if and only if $[\alpha]_B \in L([\beta_1]_B, \ldots, [\beta_t]_B)$ in $F^n$.
5. If $W_i$, $i = 1, 2, 3$, are subspaces of $V$, then $W_1 + W_2 = W_3$ (respectively $W_1 \cap W_2 = W_3$) if and only if $[W_1]_B + [W_2]_B = [W_3]_B$ (respectively $[W_1]_B \cap [W_2]_B = [W_3]_B$) in $F^n$.

Let us give one example which illustrates the power of Theorem 21.
Example 9. Let $V = \mathcal{P}_4(R)$. Suppose $f_1(X) = 1 + X + X^4$, $f_2(X) = 1 - X + X^3 - X^4$ and $f_3(X) = 1 + 3X - X^3 + 3X^4$. Are $f_1, f_2, f_3$ linearly independent in $V$? To answer this question, we use Theorem 21(3).

Let $B = (1, X, X^2, X^3, X^4)$. $B$ is an ordered basis of $V$. The matrix
$$A = ([f_1]_B \mid [f_2]_B \mid [f_3]_B) = \begin{pmatrix} 1 & 1 & 1 \\ 1 & -1 & 3 \\ 0 & 0 & 0 \\ 0 & 1 & -1 \\ 1 & -1 & 3 \end{pmatrix} \sim_r \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
has rank 2. Thus $f_1, f_2, f_3$ are linearly dependent.
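The rank computation that settles Example 9 is one library call. A NumPy sketch (ours), with each column holding the coordinates $[f_i]_B$:

```python
import numpy as np

# Columns: coordinates of f1, f2, f3 in the ordered basis B = (1, X, X^2, X^3, X^4).
A = np.array([[1,  1,  1],
              [1, -1,  3],
              [0,  0,  0],
              [0,  1, -1],
              [1, -1,  3]])

print(np.linalg.matrix_rank(A))  # 2, so f1, f2, f3 are linearly dependent
# Indeed f3 = 2*f1 - f2:
assert np.array_equal(A[:, 2], 2 * A[:, 0] - A[:, 1])
```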
Suppose $B = (\alpha_1, \ldots, \alpha_n)$ and $C = (\beta_1, \ldots, \beta_n)$ are two ordered bases of $V$. How are the coordinate maps $[\cdot]_B$ and $[\cdot]_C$ related?

Definition 22. Let $B = (\alpha_1, \ldots, \alpha_n)$ and $C = (\beta_1, \ldots, \beta_n)$ be two ordered bases of $V$. Set
$$M(C; B) = ([\beta_1]_B \mid [\beta_2]_B \mid \cdots \mid [\beta_n]_B)$$
$M(C; B)$ is an $n \times n$ matrix called the change-of-basis matrix from $C$ to $B$.

Theorem 22. Let $B = (\alpha_1, \ldots, \alpha_n)$ and $C = (\beta_1, \ldots, \beta_n)$ be two ordered bases of $V$. Then for all $\alpha \in V$, $M(C; B)[\alpha]_C = [\alpha]_B$.

Theorem 22 implies each change-of-basis matrix $M(C; B)$ is invertible with inverse $M(B; C)$.
3.2.4 Linear Transformations
Definition 23. Let $V$ and $W$ be vector spaces over $F$. A function $T : V \to W$ is called a linear transformation if $T(x\alpha + y\beta) = xT(\alpha) + yT(\beta)$ for all $\alpha, \beta \in V$ and $x, y \in F$.

We will let $\mathrm{Hom}(V, W)$ denote the set of all linear transformations from $V$ to $W$. In algebra, a linear transformation is also called a homomorphism. The symbol denoting the complete set of homomorphisms from $V$ to $W$ comes from the word homomorphism. The function $T : V \to W$ given by $T(\alpha) = 0$ for all $\alpha \in V$ is clearly a linear transformation (called the zero map). Hence, $\mathrm{Hom}(V, W)$ contains at least one map. Here are some standard examples of linear transformations.

Example 10

1. Coordinate maps $[\cdot]_B : V \to F^n$ are linear transformations by Theorem 21(1).
2. Let $V = W$ and define $T : V \to V$ by $T(\alpha) = \alpha$ for all $\alpha \in V$. $T$ is called the identity map on $V$ and will be denoted by $1_V$.
3. $T : M_{m \times n}(F) \to M_{n \times m}(F)$ given by $T(A) = A^t$ is a linear transformation (notice $S : M_{m \times n}(C) \to M_{n \times m}(C)$ given by $S(A) = A^*$ is not a linear transformation).
4. Let $A \in M_{m \times n}(F)$. $A$ defines a linear transformation $\mu_A : F^n \to F^m$ given by $\mu_A(\xi) = A\xi$ for all $\xi \in F^n$. The map $\mu_A$ is called multiplication by $A$.
5. Let $I$ be a nonempty subset of $R$ and set $V = R^I$. Let $a \in I$. The map $E_a : R^I \to R$ given by $E_a(f) = f(a)$ is a linear transformation called evaluation at $a$.
6. Let $I$ be an open interval in $R$. The map $D : C^1(I) \to R^I$ given by $D(f) = f'$ (the derivative of $f$) is a linear transformation.
7. The map $S : \mathcal{R}[a, b] \to R$ given by $S(f) = \int_a^b f(t)\,dt$ is a linear transformation.
Definition 24. Let $T \in \mathrm{Hom}(V, W)$.

1. $\mathrm{Ker}(T) = \{\alpha \in V : T(\alpha) = 0\}$.
2. $\mathrm{Im}(T) = \{T(\alpha) : \alpha \in V\}$.
3. $T$ is injective (one-to-one, a monomorphism) if $\mathrm{Ker}(T) = (0)$.
4. $T$ is surjective (onto, an epimorphism) if $\mathrm{Im}(T) = W$.
5. $T$ is an isomorphism if $T$ is both injective and surjective.

The linear transformations in Example 10(1–3) are all isomorphisms. $\mu_A$ is an isomorphism if and only if $A$ is invertible. The set $\mathrm{Ker}(T)$ is a subspace of $V$ and is called the kernel of $T$. The set $\mathrm{Im}(T)$ is a subspace of $W$ and is called the image (or range) of $T$. If $T$ is an isomorphism, then $T$ is a bijective map (one-to-one and onto) from $V$ to $W$. In this case, there is a well-defined inverse map $T^{-1} : W \to V$ given by $T^{-1}(\beta) = \alpha$ if $\beta = T(\alpha)$. It is easy to prove $T^{-1} \in \mathrm{Hom}(W, V)$.

Definition 25. Two vector spaces $V$ and $W$ over $F$ are said to be isomorphic if there exists a linear transformation $T : V \to W$ which is an isomorphism.

If $V$ and $W$ are isomorphic, we will write $V \cong W$. Our remarks before Definition 25 imply $V \cong W$ if and only if $W \cong V$. Isomorphic vector spaces are virtually identical. Only the names of the vectors are being changed by the isomorphism. Example 10(1) implies the following statement: if $\dim(V) = n$, then $V \cong F^n$. Thus, up to isomorphism, $F^n$ is the only finite-dimensional vector space over $F$ of dimension $n$.
The construction of linear transformations is facilitated by the following existence theorem. We do not assume $V$ is finite dimensional here.

Theorem 23. Let $V$ be a vector space over $F$ and suppose $B = \{\alpha_i : i \in \Delta\}$ is a basis of $V$. Let $W$ be another vector space over $F$ and suppose $\{\beta_i : i \in \Delta\}$ is any subset of $W$. Then

1. A linear transformation from $V$ to $W$ is completely determined by its values on $B$. Thus, if $T, S \in \mathrm{Hom}(V, W)$ and $T(\alpha_i) = S(\alpha_i)$ for all $i \in \Delta$, then $T = S$.
2. There exists a unique linear transformation $T : V \to W$ such that $T(\alpha_i) = \beta_i$ for all $i \in \Delta$.

A proof of Theorem 23 can be found in Brown [2]. The four basic theorems connecting linear transformations and dimensions are as follows.

Theorem 24. Let $V$ be a finite-dimensional vector space over $F$. Let $T \in \mathrm{Hom}(V, W)$.

1. If $T$ is surjective, then $W$ is finite dimensional. In this case, $\dim(V) \geq \dim(W)$.
2. Suppose $\dim(V) = \dim(W)$. Then $T$ is an isomorphism if and only if $T$ is injective.
3. Suppose $\dim(V) = \dim(W)$. Then $T$ is an isomorphism if and only if $T$ is surjective.
4. $\dim(\mathrm{Ker}(T)) + \dim(\mathrm{Im}(T)) = \dim(V)$.

Finally, $\mathrm{Hom}(V, W)$ is itself a vector space over $F$ when addition and scalar multiplication of linear transformations are defined as follows: For $T, S \in \mathrm{Hom}(V, W)$, set
$$\begin{aligned} (T + S)(\alpha) &= T(\alpha) + S(\alpha) && \text{for all } \alpha \in V \\ (xT)(\alpha) &= xT(\alpha) && \text{for all } x \in F \text{ and } \alpha \in V \end{aligned} \quad (25)$$
3.2.5 Matrix Representations of Linear Transformations

Suppose $V$ and $W$ are both finite-dimensional vector spaces over $F$. Say $\dim(V) = n$ and $\dim(W) = m$. Let $B = (\alpha_1, \ldots, \alpha_n)$ be an ordered basis of $V$ and $C = (\beta_1, \ldots, \beta_m)$ be an ordered basis of $W$. Let $T \in \mathrm{Hom}(V, W)$. We can then define an $m \times n$ matrix as follows:

Definition 26. $M(T; B, C) = ([T(\alpha_1)]_C \mid [T(\alpha_2)]_C \mid \cdots \mid [T(\alpha_n)]_C)$.

$M(T; B, C)$ is called the matrix representation of $T$ with respect to $B$ and $C$. The $i$th column of $M(T; B, C)$ is just the coordinates of $T(\alpha_i)$ with respect to $C$.

Example 11. Let $V = \mathcal{P}_3(R)$, $W = \mathcal{P}_2(R)$ and $D : \mathcal{P}_3(R) \to \mathcal{P}_2(R)$ be ordinary differentiation $D(f) = f'$. Let $B = (1, X, X^2, X^3)$ and $C = (1, X, X^2)$. Then
$$M(D; B, C) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix} \in M_{3 \times 4}(R) \quad (26)$$

The following diagram is certainly one of the most important diagrams in linear algebra:
$$\begin{array}{ccc} V & \xrightarrow{\;T\;} & W \\ {[\cdot]_B} \downarrow & & \downarrow {[\cdot]_C} \\ F^n & \xrightarrow{\;\mu_A\;} & F^m \end{array} \quad (27)$$
Here $A = M(T; B, C)$.
Theorem 25. $[T(\alpha)]_C = A[\alpha]_B$ for all $\alpha \in V$.

Theorem 25 implies that the diagram (27) commutes, i.e., the composite maps $[\cdot]_C \circ T$ and $\mu_A \circ [\cdot]_B$ are the same. The vertical maps in (27) are isomorphisms which translate the abstract situation $V \xrightarrow{T} W$ into the concrete situation $F^n \xrightarrow{\mu_A} F^m$. Machines do computations with the bottom row of (27).

Theorem 26. In diagram (27),

1. $[\mathrm{Ker}(T)]_B = \mathrm{NS}(A)$.
2. $[\mathrm{Im}(T)]_C = \mathrm{CS}(A)$.

The rank and nullity of a linear transformation $T$ are defined to be the dimensions of $\mathrm{Im}(T)$ and $\mathrm{Ker}(T)$, respectively. Let us return to Example 11 for an illustration of how Theorem 26 works. Suppose we want to compute the rank and nullity of $D : \mathcal{P}_3(R) \to \mathcal{P}_2(R)$. Since
$$M(D; B, C) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix} \in M_{3 \times 4}(R) \quad (28)$$
we conclude $\mathrm{rk}(D) = 3$ and $\nu(D) = 1$.
If we vary $T \in \mathrm{Hom}(V, W)$, we get a function $M(\cdot; B, C) : \mathrm{Hom}(V, W) \to M_{m \times n}(F)$. This map is an isomorphism of vector spaces.

Theorem 27. $M(\cdot; B, C) : \mathrm{Hom}(V, W) \to M_{m \times n}(F)$ is an isomorphism.

Thus,

1. $M(xT + yS; B, C) = xM(T; B, C) + yM(S; B, C)$.
2. $M(T; B, C) = 0$ if and only if $T$ is the zero map.
3. Given $A \in M_{m \times n}(F)$, there exists a $T \in \mathrm{Hom}(V, W)$ such that $M(T; B, C) = A$. (29)

Suppose $V$, $W$, and $Z$ are finite-dimensional vector spaces over $F$. If $T \in \mathrm{Hom}(V, W)$ and $S \in \mathrm{Hom}(W, Z)$, then the composite map $S \circ T : V \to Z$ is a linear transformation. If $D$ is an ordered basis of $Z$, then
$$M(S \circ T; B, D) = M(S; C, D)M(T; B, C) \quad (30)$$
If $T : V \to W$ is an isomorphism, then
$$M(T^{-1}; C, B) = M(T; B, C)^{-1} \quad (31)$$
Suppose we change bases in $V$ and $W$. Let $B' = (\alpha_1', \ldots, \alpha_n')$ and $C' = (\beta_1', \ldots, \beta_m')$ be two new, ordered bases of $V$ and $W$ respectively. Then we have two matrix representations of $T$: $M(T; B, C)$ and $M(T; B', C')$. Recall that $M(B; B')$ and $M(C; C')$ denote change-of-basis matrices in $V$ and $W$. The relation between the matrix representations of $T$ is as follows:
$$M(T; B', C') = M(C; C')\,M(T; B, C)\,M(B; B')^{-1} \quad (32)$$
Since change-of-basis matrices are invertible, a simple translation of Theorem 9 gives us the following theorem:

Theorem 28. Suppose $V$ and $W$ are finite-dimensional vector spaces over $F$. Let $T \in \mathrm{Hom}(V, W)$ and suppose the rank of $T$ is $t$. Then there exist ordered bases $B$ and $C$ of $V$ and $W$ respectively such that
$$M(T; B, C) = \begin{pmatrix} I_t & 0 \\ 0 & 0 \end{pmatrix}$$

There is a special case of Eq. (32) which is worth mentioning here. Suppose $V = W$ and $T \in \mathrm{Hom}(V, V)$. If $B$ and $B'$ are two ordered bases of $V$, then $M(T; B, B)$ and $M(T; B', B')$ are two $n \times n$ matrices representing $T$. If $U = M(B; B')$, then Eq. (32) becomes
$$M(T; B', B') = U\,M(T; B, B)\,U^{-1} \quad (33)$$
Definition 27. Let $A_1, A_2 \in M_{n \times n}(F)$. $A_1$ is similar to $A_2$ if $A_1 = UA_2U^{-1}$ for some invertible matrix $U \in M_{n \times n}(F)$.

The relation in Eq. (33) implies that any two matrix representations of $T \in \mathrm{Hom}(V, V)$ are similar. We then have the following questions: What is the simplest matrix representation of $T$? In other words, what is the simplest analog of Theorem 28? In terms of matrices, the question becomes: What is the simplest matrix $B$ which is similar to a given matrix $A$? The answers to these questions are called canonical forms theorems. The important canonical forms (e.g., the Jordan canonical form, rational canonical form, etc.) are discussed in Brown [2].
3.3 DETERMINANTS AND EIGENVALUES
3.3.1 Determinants
Let n  N and set Án1; 2; FFF; n. A permutation
(on n letters) is a bijective function from Án to Án.
We will let S
n
denote the set of all permutations on n
letters. If   S
n
, then  is represented by a 2  n matrix
 

123FFF n
i
1
i
2
i
3
FFF i
n

34
Here 1i
1
, 2i
2
; FFF;ni
n
. Thus, i
1
; FFF; i
n
are the numbers 1; 2; FFF; n in some different order.
Obviously, S
n
is a ®nite set of cardinality n3.
Permutations can be composed as functions on Án.
Composition determines a binary operation S
n
 S
n


S
n
;  which endows S
n
with the structure
of a group. See Brown [1, Chap. III] for more details.
Definition 28. A permutation $\sigma \in S_n$ is called a cycle of length $r$ if there exist distinct integers $i_1, \ldots, i_r \in \Delta(n)$ such that:

1. $\sigma(i_1) = i_2$, $\sigma(i_2) = i_3, \ldots, \sigma(i_{r-1}) = i_r$, and $\sigma(i_r) = i_1$.
2. $\sigma(j) = j$ for all $j \in \Delta(n) \setminus \{i_1, \ldots, i_r\}$.

If $\sigma$ is a cycle of length $r$, we will write $\sigma = (i_1, i_2, \ldots, i_r)$. A two-cycle $(a, b)$ interchanges $a$ and $b$ and leaves all other elements of $\Delta(n)$ invariant. Two-cycles are also called transpositions. Every permutation in $S_n$ is a product of disjoint cycles (i.e., cycles having no entries in common). Every cycle $(i_1, \ldots, i_r)$ is a product of transpositions: $(i_1, \ldots, i_r) = (i_1, i_r)(i_1, i_{r-1}) \cdots (i_1, i_2)$. Thus, every permutation is a finite product of transpositions.
Example 12. Let
$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 2 & 3 & 4 & 1 & 6 & 5 & 8 & 9 & 7 \end{pmatrix} \in S_9$$
Then
$$\sigma = (1, 2, 3, 4)(5, 6)(7, 8, 9) = (1, 4)(1, 3)(1, 2)(5, 6)(7, 9)(7, 8) \quad (35)$$
If   S

n
is a product of an even (odd) number of
transpositions, then any factorization of  into a pro-
duct of transpositions must contain an even (odd)
number of terms. A permutation  is said to be even
(odd) if  is a product of an even (odd) number of
transpositions. We can now de®ne a function
sgn X S
n
1; 1 by the following rules:
sgn
1if is even
1if is odd
&
36
The number sgn is called the sign of .Ife
denotes the identity map on Án, then e a; bb; a
and, hence, sgne1. Any transposition is odd.
Hence, sgna; b  1. If  is the permutation given
in Eq. (35), then  is even and sgn1.
We can now define the determinant, $\det(A)$, of an $n \times n$ matrix $A$.

Definition 29. Let $A = (a_{ij}) \in M_{n \times n}(F)$. The determinant of $A$ is defined to be the following element of $F$:
$$\det(A) = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)} \quad (37)$$
The symbols in Eq. (37) mean add all possible products $\mathrm{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}$ as $\sigma$ ranges over $S_n$. If we let $A$ vary, the determinant defines a function $\det : M_{n \times n}(F) \to F$.

The value of $\det(A)$ is numerically hard to compute. For instance, if $n = 5$, there are 120 products to compute, store, and add together in Eq. (37). Fortunately, we have many theorems which help us compute the value of $\det(A)$. A summary of the more elementary properties of the determinant is given in Theorem 29 below.
Theorem 29. Let $A, B \in M_{n \times n}(F)$. Then

1. If $\mathrm{Row}_i(A) = 0$ [or $\mathrm{Col}_i(A) = 0$], then $\det(A) = 0$.
2. If $\mathrm{Row}_i(A) = \mathrm{Row}_j(A)$ for some $i \neq j$ [or $\mathrm{Col}_i(A) = \mathrm{Col}_j(A)$], then $\det(A) = 0$.
3. If $A$ is upper or lower triangular, then $\det(A) = \prod_{i=1}^{n} [A]_{ii}$.
4. If $A = (\rho_1; \ldots; \rho_n)$ is the row partition of $A$ and $\rho_i = \alpha + \beta$ for some $i = 1, \ldots, n$ and some $\alpha, \beta \in M_{1 \times n}(F)$, then
   a. $\det(\rho_1; \ldots; x\rho_i; \ldots; \rho_n) = x\det(\rho_1; \ldots; \rho_i; \ldots; \rho_n)$.
   b. $\det(\rho_1; \ldots; \alpha + \beta; \ldots; \rho_n) = \det(\rho_1; \ldots; \alpha; \ldots; \rho_n) + \det(\rho_1; \ldots; \beta; \ldots; \rho_n)$.
5. If $A = (\rho_1; \ldots; \rho_n)$ is the row partition of $A$ and $\sigma \in S_n$, then $\det(\rho_{\sigma(1)}; \ldots; \rho_{\sigma(n)}) = \mathrm{sgn}(\sigma)\det(\rho_1; \ldots; \rho_n)$.
6. $\det(AB) = \det(A)\det(B)$.
7. a. $\det(E_{ij}A) = -\det(A)$ ($i \neq j$).
   b. $\det(E_{ij}(c)A) = \det(A)$ ($i \neq j$).
   c. $\det(E_j(c)A) = c\det(A)$.
8. If $PA = LU$ is an LU-factorization of $PA$, then $\det(A) = \mathrm{sgn}(P)\prod_{i=1}^{n} [U]_{ii}$.
9. $\det(A) = \det(A^t)$.
10. $A$ is invertible if and only if $\det(A) \neq 0$.

The corresponding statements for columns in Theorem 29(4–6) are also true. The matrix $P$ in (8) is a permutation matrix and, consequently, has the form $P = (\varepsilon_{\sigma(1)} \mid \cdots \mid \varepsilon_{\sigma(n)})$ where $(\varepsilon_1 \mid \cdots \mid \varepsilon_n) = I_n$ and $\sigma \in S_n$. Then $\mathrm{sgn}(P)$ is defined to be $\mathrm{sgn}(\sigma)$. Theorem 29(8) is an important application of LU-factorizations: to compute $\det(A)$, factor $PA = LU$ for a suitable permutation matrix $P$, compute the product of the diagonal elements of $U$, and then $\det(A) = \mathrm{sgn}(P)\prod_{i=1}^{n} [U]_{ii}$. For example, Eq. (22) implies $\det(A) = -7$. Theorem 29(10) is one of the most important properties of the determinant. $A$ is singular (i.e., not invertible) if and only if $\det(A) = 0$.
Definition 30. Let $A \in M_{n \times n}(F)$. Assume $n \geq 2$.

1. For $i, j = 1, \ldots, n$, $M_{ij}(A)$ will denote the $(n-1) \times (n-1)$ submatrix of $A$ obtained by deleting row $i$ and column $j$ of $A$.
2. $\mathrm{cof}_{ij}(A) = (-1)^{i+j}\det(M_{ij}(A))$.
3. $\mathrm{adj}(A)$ is the $n \times n$ matrix whose $(i,j)$th entry is given by $[\mathrm{adj}(A)]_{ij} = \mathrm{cof}_{ji}(A)$.

The determinant of the $(n-1) \times (n-1)$ submatrix $M_{ij}(A)$ is called the $(i,j)$th minor of $A$. The $(i,j)$th minor of $A$ with sign $(-1)^{i+j}$ is $\mathrm{cof}_{ij}(A)$ and is called the $(i,j)$th cofactor of $A$. The matrix defined in 3 is called the adjoint of $A$.

Theorem 30. Laplace Expansion: Let $A \in M_{n \times n}(F)$. Then $\mathrm{adj}(A)A = A\,\mathrm{adj}(A) = \det(A)I_n$.
Analyzing the entries in Theorem 30 gives us the following identities:
$$\sum_{j=1}^{n} [A]_{ij}\,\mathrm{cof}_{kj}(A) = \delta_{ik}\det(A) \quad \text{for all } i, k = 1, \ldots, n$$
$$\sum_{i=1}^{n} [A]_{ij}\,\mathrm{cof}_{ik}(A) = \delta_{jk}\det(A) \quad \text{for all } j, k = 1, \ldots, n \quad (38)$$
Here
$$\delta_{uv} = \begin{cases} 1 & \text{if } u = v \\ 0 & \text{if } u \neq v \end{cases}$$
is Kronecker's delta function.

If $A$ is invertible, then Theorem 30 implies $A^{-1} = \det(A)^{-1}\,\mathrm{adj}(A)$. Equation (11) is just Laplace's theorem when $n = 2$. The last elementary result we will give concerning determinants is Cramer's rule.
Theorem 31. Cramer's Rule: Let $A = (\gamma_1 \mid \cdots \mid \gamma_n) \in M_{n \times n}(F)$. Let $B \in F^n$. Suppose $A$ is invertible. Then the unique solution $(x_1, \ldots, x_n)^t \in F^n$ to the system of equations $AX = B$ is given by
$$x_i = \det(A)^{-1}\det(\gamma_1 \mid \cdots \mid \gamma_{i-1} \mid B \mid \gamma_{i+1} \mid \cdots \mid \gamma_n) \quad \text{for all } i = 1, \ldots, n \quad (39)$$
3.3.2 Eigenvalues
Definition 31

1. Let $A \in M_{n \times n}(F)$. A scalar $d \in F$ is called an eigenvalue (or characteristic value) of $A$ if there is a nonzero vector $\xi \in F^n$ such that $A\xi = d\xi$.
2. Let $V$ be a vector space over $F$ and $T \in \mathrm{Hom}(V, V)$. A scalar $d \in F$ is an eigenvalue (or characteristic value) of $T$ if there is a nonzero vector $\alpha \in V$ such that $T(\alpha) = d\alpha$.

Eigenvalues of matrices and linear transformations are related to each other by diagram (27). If $A$ is any matrix representation of $T$, then $d$ is an eigenvalue of $T$ if and only if $d$ is an eigenvalue of $A$. For this reason, we will present only the theory for matrices. The reader can translate the results given here into the corresponding theorems about linear transformations (on finite-dimensional vector spaces) by using (27).

Definition 32. Let $A \in M_{n \times n}(F)$. Then $\sigma_F(A) = \{d \in F : d$ is an eigenvalue of $A\}$ is called the spectrum of $A$.
The spectrum of $A$ could very well be empty. Consider the following well-known example.

Example 13. Let
$$A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \in M_{2 \times 2}(R)$$
The matrix $A$ represents a rotation (in the counterclockwise direction) of $90°$ in the plane $R^2$. It is easy to see $A\xi = d\xi$ for some $d \in R$ implies $\xi = 0$. Thus, $\sigma_R(A) = \emptyset$.

If we view $A$ as a complex matrix, i.e., $A \in M_{2 \times 2}(C)$, then
$$A\begin{pmatrix} 1 \\ -i \end{pmatrix} = i\begin{pmatrix} 1 \\ -i \end{pmatrix} \qquad A\begin{pmatrix} 1 \\ i \end{pmatrix} = -i\begin{pmatrix} 1 \\ i \end{pmatrix} \quad (40)$$
Here $i = \sqrt{-1}$. It is easy to show $\sigma_C(A) = \{i, -i\}$.

Example 13 shows that the base field $F$ is important when computing eigenvalues. Thus, the notation $\sigma_F(A)$ for the spectrum of $A$ includes the field $F$ in the symbols.
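NumPy confirms the picture (a sketch we add here): the characteristic polynomial of the rotation matrix has no real roots, and over $C$ the eigenvalues of Eq. (40) appear.

```python
import numpy as np

A = np.array([[0.0, -1.0], [1.0, 0.0]])  # rotation by 90 degrees

vals, vecs = np.linalg.eig(A)  # computed over the complex numbers
print(vals)                    # [0.+1.j  0.-1.j], i.e., {i, -i}

# The characteristic polynomial is X^2 + 1, with no real zeros.
print(np.poly(A))              # [1. 0. 1.] -- coefficients of X^2 + 0X + 1
```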
Definition 33. Let $A \in M_{n \times n}(F)$ and let $X$ denote an indeterminate over $F$. The polynomial $C_A(X) = \det(XI_n - A)$ is called the characteristic polynomial of $A$.

For any matrix $A \in M_{n \times n}(F)$, the characteristic polynomial has the following form:
$$C_A(X) = X^n + a_1X^{n-1} + \cdots + a_{n-1}X + a_n \quad (41)$$
In Eq. (41), $a_1, \ldots, a_n \in F$. The coefficients $a_1, \ldots, a_n$ appearing in $C_A(X)$ all have various interpretations which are related to $A$. For example,
$$a_1 = -\sum_{i=1}^{n} [A]_{ii} \qquad a_n = (-1)^n\det(A) \quad (42)$$
At any rate, $C_A(X)$ is always a nonzero polynomial of degree $n$ whose leading coefficient is 1. The connection between $\sigma_F(A)$ and $C_A(X)$ is given in our next theorem.

Theorem 32. $\sigma_F(A) = \{d \in F : C_A(d) = 0\}$.
Thus, the zeros of C
A
X in F are precisely the
eigenvalues of A. In Example 13, C
A
XX
2
 1.
The zeros of X
2
 1 are i; iC. Hence, 
R
A
 and 
C
Ai.
Although Theorem 32 is simple, it is only useful
when C
A
X (an n  n determinant) can be computed
58 Brown
Copyright © 2000 Marcel Dekker, Inc.

and the roots of $C_A(X)$ in $F$ can be computed. For large $n$ or ``bad'' matrices $A$, more sophisticated methods (such as the power method or inverse power method when $F = R$ or $C$) must be employed. One of the central problems of numerical linear algebra is to devise iterative methods for computing eigenvalues. A good elementary reference for these techniques is Cullen [4].
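As a taste of such iterative techniques, here is a minimal sketch of the power method for estimating a dominant eigenvalue (our own illustrative code; see Cullen [4] for the convergence hypotheses, e.g., a unique eigenvalue of largest modulus):

```python
import numpy as np

def power_method(A, iters=100):
    """Estimate a dominant eigenvalue/eigenvector of A by iteration."""
    x = np.random.default_rng(0).standard_normal(A.shape[0])
    for _ in range(iters):
        x = A @ x
        x /= np.linalg.norm(x)         # renormalize at each step
    return x @ A @ x, x                # Rayleigh quotient and eigenvector

A = np.array([[2.0, 1.0], [1.0, 3.0]])
lam, v = power_method(A)
print(lam)                 # ~3.618, the larger root of X^2 - 5X + 5
```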
Notice that Theorem 32 implies $A$ has at most $n$ distinct eigenvalues in $F$. Also, Theorem 32 implies if $A$ is similar to $B$, then $\sigma_F(A) = \sigma_F(B)$.
Definition 34. Let $d \in \sigma_F(A)$. The subspace $\operatorname{NS}(dI_n - A)$ is called the eigenspace of $A$ associated with $d$. The nonzero vectors in $\operatorname{NS}(dI_n - A)$ are called eigenvectors (or characteristic vectors) of $A$ associated with $d$.
We will let $E_A(d)$ denote the eigenspace of $A$ associated with the eigenvalue $d$. If $d_1, \ldots, d_r$ are distinct eigenvalues in $\sigma_F(A)$ and $\xi_i$ is a nonzero vector in $E_A(d_i)$, then $\xi_1, \ldots, \xi_r$ are linearly independent over $F$. This leads immediately to our next theorem.
Theorem 33. Let $A \in M_{n\times n}(F)$. $A$ is similar to a diagonal matrix if and only if $F^n$ has a basis consisting of eigenvectors of $A$.
There are many applications of Theorem 33.
Example 14. Suppose $A \in M_{n\times n}(F)$. How do we compute $A^k$ for all $k \geq 2$? If $F^n$ has a basis $\{\xi_1, \ldots, \xi_n\}$ consisting of eigenvectors of $A$, then the problem is easily solved. Suppose $A\xi_i = d_i\xi_i$ for $i = 1, \ldots, n$. Set $P = [\xi_1 \mid \cdots \mid \xi_n]$. Since $\operatorname{rk}(P) = n$, $P$ is invertible.

$$AP = A[\xi_1 \mid \cdots \mid \xi_n] = [A\xi_1 \mid \cdots \mid A\xi_n] = [d_1\xi_1 \mid \cdots \mid d_n\xi_n] = PD \tag{43}$$
Here

$$D = \begin{pmatrix} d_1 & & 0 \\ & \ddots & \\ 0 & & d_n \end{pmatrix}$$

Thus, $A = PDP^{-1}$ and

$$A^k = (PDP^{-1})^k = PD^kP^{-1} = P\begin{pmatrix} d_1^k & & 0 \\ & \ddots & \\ 0 & & d_n^k \end{pmatrix}P^{-1} \tag{44}$$
There are many iteration-type problems in which $A^k$ must be computed for all $k \geq 1$. These problems are easily solved if $A$ has enough eigenvectors to span the space. The reader is urged to consult Brown [1] for other applications of eigenvalues.
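A numerical sketch of Example 14 and Eq. (44) (our own illustrative code, assuming $A$ has $n$ independent eigenvectors):

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
d, P = np.linalg.eig(A)                    # columns of P are eigenvectors
k = 5
Ak = P @ np.diag(d**k) @ np.linalg.inv(P)  # Eq. (44)
print(np.allclose(Ak, np.linalg.matrix_power(A, k)))   # True
```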
3.4 INNER-PRODUCT SPACES
3.4.1 Real and Complex Definitions

Inner products are defined on vector spaces defined over $R$ or $C$. A vector space $V$ whose field of scalars $F = R$ ($F = C$) is called a real (complex) vector space. The definition of an inner product is slightly different for the two cases.
Definition 35. Let $V$ be a real vector space, i.e., $F = R$. An inner product on $V$ is a function $(\,,) : V \times V \to R$ which satisfies the following conditions:

1. $(\alpha, \alpha)$ is positive for every nonzero $\alpha \in V$.
2. $(x\alpha + y\beta, \gamma) = x(\alpha, \gamma) + y(\beta, \gamma)$ for all $\alpha, \beta, \gamma \in V$ and $x, y \in R$.
3. $(\alpha, \beta) = (\beta, \alpha)$ for all $\alpha, \beta \in V$.

A real vector space $V$ together with an inner product $(\,,)$ on $V$ is called an inner-product space. We will denote an inner-product space by the ordered pair $(V, (\,,))$.

If $(V, (\,,))$ is an inner-product space, then $(0, \alpha) = 0$ for all $\alpha \in V$. Also, $(\alpha, x\beta + y\gamma) = x(\alpha, \beta) + y(\alpha, \gamma)$ by 2 and 3. Hence, $(\,,)$ is a bilinear function on $V \times V$.
Example 15

1. Let $V = R^n$. Define $(\alpha, \beta) = \alpha^t\beta$. Here we identify the $1 \times 1$ matrix $\alpha^t\beta$ with its single entry in $R$. Then $(R^n, \alpha^t\beta)$ is an inner-product space. The function $(\alpha, \beta) = \alpha^t\beta$ is called the standard inner product (in calculus, the dot product) on $R^n$.
2. Let $V = R^n$. Let $c_1, \ldots, c_n$ be any positive numbers in $R$. If $\alpha = (x_1, \ldots, x_n)^t$ and $\beta = (y_1, \ldots, y_n)^t$, set $(\alpha, \beta)_c = \sum_{i=1}^{n} c_ix_iy_i$. Then $(R^n, (\,,)_c)$ is an inner-product space. Thus, a given real vector space $V$ can have many different inner products on it.
3. Let $V = C([a, b])$, the continuous real-valued functions on a closed interval $[a, b] \subseteq R$. If we define $(f, g) = \int_a^b f(x)g(x)\,dx$, then $(V, (\,,))$ is an inner-product space.
Definition 36. Let $V$ be a complex vector space, i.e., $F = C$. An inner product on $V$ is a function $(\,,) : V \times V \to C$ which satisfies the following conditions:

1. $(\alpha, \alpha)$ is a positive real number for every nonzero $\alpha \in V$.
2. $(x\alpha + y\beta, \gamma) = x(\alpha, \gamma) + y(\beta, \gamma)$ for all $\alpha, \beta, \gamma \in V$ and $x, y \in C$.
3. $(\alpha, \beta) = \overline{(\beta, \alpha)}$ for all $\alpha, \beta \in V$.

In 3, $\overline{(\beta, \alpha)}$ denotes the conjugate of the complex number $(\beta, \alpha)$. A complex vector space $V$ together with an inner product $(\,,)$ on $V$ will be called a complex inner-product space $(V, (\,,))$.

If $(V, (\,,))$ is a complex inner-product space, then $(0, \alpha) = (\alpha, 0) = 0$ for all $\alpha \in V$ and $(\alpha, x\beta + y\gamma) = \bar{x}(\alpha, \beta) + \bar{y}(\alpha, \gamma)$. Thus $(\,,)$ is linear in its first variable and conjugate linear in its second variable. The inner product given on $C^n$ by $(\alpha, \beta) = \alpha^t\bar{\beta}$ is called the standard inner product on $C^n$.
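In NumPy terms, the standard inner product on $C^n$ can be sketched as follows (illustrative code; note that the convention $(\alpha, \beta) = \alpha^t\bar{\beta}$ used here conjugates the second argument, whereas numpy.vdot conjugates its first):

```python
import numpy as np

a = np.array([1.0 + 1.0j, 2.0 + 0.0j])
b = np.array([3.0 + 0.0j, 1.0 - 2.0j])

ip = a @ np.conj(b)                 # (a, b) = a^t conj(b): linear in a
print(ip)                           # (5+7j)
print(np.conj(b @ np.conj(a)))      # conj((b, a)) equals (a, b): Def. 36(3)
```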
The theorems for real and complex inner products
are very similar. Usually if one erases the conjugation
symbol in a complex proof, one gets the real proof. For
this reason, we will state all future results for complex
inner-product spaces and leave the corresponding
results for real vector spaces to the reader.
Let V; ;  be a (complex) inner-product space.
We can de®ne a length function X V  R on V
for by setting


; 
p
for all   V 45
By De®nition 36(1) [or 35(1)], ; 0 for any   V.

; 

denotes the nonnegative real number whose

square is ; . Thus, 
2
; . One of the most
important inequalities in mathematics is the Cauchy±
Schwarz inequality:
;    for all ;   V 46
In Eq. (46), ;  denotes the modulus of the com-
plex number ; . The function de®ned in Eq.
(45) is called the norm associated with the inner pro-
duct ; . The norm satis®es the following inequal-
ities:
 > 0if  0 47a
00 47b
xx for all   V and x  C 47c
   for all ;   V 47d
The inequality in Eq. (47d) is called the triangle
inequality. Its proof follows immediately from the
Cauchy±Schwarz inequality.
The norm associated with the inner product $(\,,)$ defines a distance function $d : V \times V \to R$ given by the following equation:

$$d(\alpha, \beta) = \|\alpha - \beta\| \qquad \text{for all } \alpha, \beta \in V \tag{48}$$

The distance function satisfies the following inequalities:

$$d(\alpha, \beta) \geq 0 \quad \text{for all } \alpha, \beta \in V \tag{49a}$$
$$d(\alpha, \beta) = 0 \quad \text{if and only if } \alpha = \beta \tag{49b}$$
$$d(\alpha, \beta) = d(\beta, \alpha) \quad \text{for all } \alpha, \beta \in V \tag{49c}$$
$$d(\alpha, \gamma) \leq d(\alpha, \beta) + d(\beta, \gamma) \quad \text{for all } \alpha, \beta, \gamma \in V \tag{49d}$$

Thus, any inner-product space $(V, (\,,))$ is a normed vector space $(V, \|\cdot\|)$ with norm given by Eq. (45) and a metric space $(V, d)$ with metric (i.e., distance function) $d$ given by Eq. (48). Since we have a distance function on $(V, (\,,))$, we can extend many results from the calculus to $(V, (\,,))$. For more details, the reader is referred to Brown [2].
3.4.2 Orthogonality

Definition 37. Let $(V, (\,,))$ be an inner-product space.

1. Two vectors $\alpha, \beta \in V$ are said to be orthogonal if $(\alpha, \beta) = 0$.
2. A set of vectors $\{\xi_i : i \in \Delta\} \subseteq V$ is said to be pairwise orthogonal if $\xi_i$ and $\xi_j$ are orthogonal whenever $i \neq j$.

Notice that $(\alpha, \beta) = 0$ if and only if $(\beta, \alpha) = 0$. Thus, $\alpha$ and $\beta$ are orthogonal if and only if $\beta$ and $\alpha$ are orthogonal.
Theorem 34. Let $\xi_1, \ldots, \xi_n$ be pairwise orthogonal, nonzero vectors in $(V, (\,,))$. Then

1. $\xi_1, \ldots, \xi_n$ are linearly independent.
2. If $\beta \in L(\xi_1, \ldots, \xi_n)$, then
$$\beta = \sum_{j=1}^{n} \frac{(\beta, \xi_j)}{(\xi_j, \xi_j)}\,\xi_j$$
3. If $\beta \in L(\xi_1, \ldots, \xi_n)$, then
$$\|\beta\| = \left(\sum_{j=1}^{n} \frac{|(\beta, \xi_j)|^2}{(\xi_j, \xi_j)}\right)^{1/2}$$
A set of vectors $\xi_1, \ldots, \xi_n \in (V, (\,,))$ is said to be orthonormal if $(\xi_i, \xi_j) = 0$ whenever $i \neq j$ and $\|\xi_i\| = 1$ for all $i = 1, \ldots, n$. If $\xi_1, \ldots, \xi_n$ are orthonormal, then Theorem 34 implies that $B = \{\xi_1, \ldots, \xi_n\}$ is an ordered basis of $W = L(\xi_1, \ldots, \xi_n)$. In this case, the coordinate map $[\,\cdot\,]_B : W \to C^n$ is particularly easy to compute. By 2 and 3, we have

$$[\beta]_B = \begin{pmatrix} (\beta, \xi_1) \\ \vdots \\ (\beta, \xi_n) \end{pmatrix} \qquad \text{for any } \beta \in W = L(\xi_1, \ldots, \xi_n) \tag{50a}$$

$$\|\beta\| = \left(\sum_{j=1}^{n} |(\beta, \xi_j)|^2\right)^{1/2} \qquad \text{for any } \beta \in L(\xi_1, \ldots, \xi_n) \tag{50b}$$
The Gram–Schmidt process allows us to construct an orthonormal basis of any finite-dimensional subspace $W$ of $(V, (\,,))$:

Theorem 35. Gram–Schmidt: Let $\alpha_1, \ldots, \alpha_n$ be linearly independent vectors in $(V, (\,,))$. Then there exist pairwise orthogonal vectors $\xi_1, \ldots, \xi_n$ such that $L(\alpha_1, \ldots, \alpha_j) = L(\xi_1, \ldots, \xi_j)$ for $j = 1, \ldots, n$.
The vectors $\xi_1, \ldots, \xi_n$ in Theorem 35 are defined inductively as follows: $\xi_1 = \alpha_1$. Having defined $\xi_1, \ldots, \xi_r$, $\xi_{r+1}$ is defined by the following equation:

$$\xi_{r+1} = \alpha_{r+1} - \sum_{j=1}^{r} \frac{(\alpha_{r+1}, \xi_j)}{(\xi_j, \xi_j)}\,\xi_j \tag{51}$$

To produce an orthonormal basis for $L(\alpha_1, \ldots, \alpha_n)$, replace $\xi_1, \ldots, \xi_n$ by $\xi_1/\|\xi_1\|, \ldots, \xi_n/\|\xi_n\|$.
Theorem 35 can be used to construct the orthogonal complement of a subspace $W$.

Theorem 36. Let $(V, (\,,))$ be a finite-dimensional inner-product space. Let $W$ be a subspace of $V$. Then there exists a unique subspace $W^\perp \subseteq V$ such that

1. $W + W^\perp = V$.
2. $W \cap W^\perp = (0)$.
3. Every vector in $W$ is orthogonal to every vector in $W^\perp$.

The unique subspace $W^\perp$ given in Theorem 36 is called the orthogonal complement of $W$. Clearly, $\dim(W) + \dim(W^\perp) = \dim V$.
3.4.3 Least-Squares Problems

There are three main problems in numerical linear algebra:

1. Find effective methods for solving linear systems of equations $AX = B$.
2. Find methods for computing eigenvalues of a square matrix $A$.
3. Find effective methods for solving least-squares problems.

We have already talked about the first two problems. We will now consider the third problem.
Suppose $W$ is a subspace of some inner-product space $(V, (\,,))$. Let $\beta \in V$. Is there a vector $P(\beta) \in W$ which is closest to $\beta$? In other words, is there a vector $P(\beta) \in W$ such that $\|\beta - P(\beta)\| = \min\{\|\beta - \gamma\| : \gamma \in W\}$? If $V = R^n$ and $(\alpha, \beta) = \alpha^t\beta$, then $\|\beta - \gamma\|^2 = \sum_{i=1}^{n}(a_i - x_i)^2$. Here $\beta = (a_1, \ldots, a_n)^t$ and $\gamma = (x_1, \ldots, x_n)^t \in W$. Finding a vector $P(\beta)$ in $W$ which is closest to $\beta$ is equivalent to finding $(x_1, \ldots, x_n)^t \in W$ such that $(a_1 - x_1)^2 + \cdots + (a_n - x_n)^2$ is as small as possible. Thus, we are trying to minimize a sum of squares. This is where the name ``least-squares problem'' originates.

If $\dim(W) = \infty$, there may be no vector in $W$ which is closest to $\beta$. For a concrete example, see Brown [2, p. 212]. If $W$ is finite dimensional, then there is a unique vector $P(\beta)$ in $W$ closest to $\beta$.
Theorem 37. Let $(V, (\,,))$ be an inner-product space and let $W$ be a finite-dimensional subspace of $V$. Let $\beta \in V$. Then there exists a unique vector $P(\beta) \in W$ such that $\|\beta - P(\beta)\| = \min\{\|\beta - \gamma\| : \gamma \in W\}$. Furthermore, if $\{\xi_1, \ldots, \xi_n\}$ is any pairwise orthogonal basis of $W$, then

$$P(\beta) = \sum_{j=1}^{n} \frac{(\beta, \xi_j)}{(\xi_j, \xi_j)}\,\xi_j$$
The unique vector $P(\beta)$ satisfying Theorem 37 is called the orthogonal projection of $\beta$ onto $W$. The map $P_W : V \to V$ given by $P_W(\beta) = P(\beta)$ for all $\beta \in V$ is called the orthogonal projection of $V$ onto $W$. This map satisfies the following properties:

1. $P_W \in \operatorname{Hom}(V, V)$.
2. $\beta - P_W(\beta)$ is orthogonal to $W$ for every $\beta \in V$.
3. $\operatorname{Im}(P_W) = W$, $\operatorname{Ker}(P_W) = W^\perp$.
4. $P_W^2 = P_W$. (52)
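When $W$ is the column space of a matrix $Q$ with orthonormal columns, the matrix of $P_W$ is $QQ^*$, and the properties above are easy to check numerically (a sketch with our own illustrative data):

```python
import numpy as np

M = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
Q, _ = np.linalg.qr(M)             # orthonormal basis for W = CS(M)
P = Q @ Q.T                        # matrix of the projection P_W

beta = np.array([1.0, 2.0, 3.0])
print(P @ beta)                             # P_W(beta)
print(np.allclose(P @ P, P))                # True: P_W^2 = P_W, Eq. (52)
print(np.round((beta - P @ beta) @ Q, 12))  # [0 0]: residual orthogonal to W
```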
Theorem 37 has important applications in the theory of linear equations. Suppose $A \in M_{m\times n}(C)$ and $B \in C^m$.
Definition 38. A vector $\xi \in C^n$ is called a least-squares solution to $AX = B$ if $\|A\xi - B\| \leq \|A\gamma - B\|$ for all $\gamma \in C^n$.

Here $\|\cdot\|$ is the induced norm from the standard inner product $(\alpha, \beta) = \alpha^t\bar{\beta}$ on $C^m$. Thus, $\xi$ is a least-squares solution to $AX = B$ if and only if $A\xi = P_{CS(A)}(B)$. In particular, Theorem 37 guarantees least-squares solutions always exist. If $B \in CS(A)$, then $AX = B$ is consistent, i.e., there exists a vector $\xi \in C^n$ such that $A\xi = B$. In this case, any least-squares solution to $AX = B$ is an ordinary solution to the system.
Theorem 38. Let $A \in M_{m\times n}(C)$ and $B \in C^m$. A vector $\xi \in C^n$ is a least-squares solution to $AX = B$ if and only if $\bar{\xi}$ is a solution to $A^t\bar{A}X = A^t\bar{B}$. The least-squares solution is unique if $\operatorname{rk}(A) = n$.
The equations A
t
"
AX  A
t
"
B are called the normal
equations of A. Theorem 38 implies the solutions of
the normal equations determine the least-squares solu-
tions to AX  B. Solutions to the normal equations
when rkA < n have an extensive literature. For appli-
cations to curve ®tting, see Brown [1].
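For real data, the normal equations read $A^tAX = A^tB$. Here is a sketch that fits a straight line by least squares (our own illustrative data; in practice numpy.linalg.lstsq, which uses orthogonal factorizations, is numerically safer than forming $A^tA$ explicitly):

```python
import numpy as np

# Fit y = c0 + c1*t to four data points.
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 1.9, 3.2, 3.9])
A = np.column_stack([np.ones_like(t), t])   # design matrix; rk(A) = 2

coef = np.linalg.solve(A.T @ A, A.T @ y)    # normal equations A^t A X = A^t y
print(coef)                                 # ~[1.07 0.97]
print(np.linalg.lstsq(A, y, rcond=None)[0]) # same answer
```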
3.4.4 Normal Matrices

In this section, $F = R$ or $C$. $(\,,)$ will always denote the standard inner product on $F^n$. Thus, $(\alpha, \beta) = \alpha^t\beta$ if $F = R$ and $(\alpha, \beta) = \alpha^t\bar{\beta}$ if $F = C$. If $A \in M_{n\times n}(F)$, then $A^*$ will denote the Hermitian conjugate of $A$. Thus, $A^* = A^t$ if the entries of $A$ are all real numbers and, in general, $A^* = \bar{A}^t$. There is an important relationship between the standard inner product and $A$ and $A^*$.
Theorem 39. Let $A \in M_{n\times n}(F)$. Then $(A\alpha, \beta) = (\alpha, A^*\beta)$ for all $\alpha, \beta \in F^n$.
Definition 39. Let $A \in M_{n\times n}(C)$. $A$ is unitary if $AA^* = A^*A = I_n$.

If the entries of $A$ in Definition 39 are all real [i.e., $A \in M_{n\times n}(R)$] and $A$ is unitary, then $AA^t = A^tA = I_n$. In this case, $A$ is called an orthogonal matrix. The following theorem characterizes unitary and orthogonal matrices.
Theorem 40. Suppose $A = [\xi_1 \mid \cdots \mid \xi_n] \in M_{n\times n}(C)$. Then the following statements are equivalent:

1. $A$ is unitary.
2. $(A\alpha, A\beta) = (\alpha, \beta)$ for all $\alpha, \beta \in C^n$.
3. $\{\xi_1, \ldots, \xi_n\}$ is an orthonormal basis of $C^n$.
4. $A = M(B, C)$, a change-of-basis matrix between two orthonormal bases $B$ and $C$ of $C^n$.
The same theorem is true with $C$ replaced by $R$ and unitary replaced by orthogonal. An important corollary to Theorem 40 is the following observation. If $A$ is unitary, then

$$\sigma_C(A) \subseteq \{z \in C : |z| = 1\} \tag{53}$$
Definition 40. Let $A \in M_{n\times n}(C)$. $A$ is Hermitian if $A = A^*$.
Theorem 41. Let $A \in M_{n\times n}(C)$. If $A$ is Hermitian, then $\sigma_C(A) \subseteq R$.
If the entries in $A$ are real, then $A = A^*$ if and only if $A$ is symmetric. Theorem 41 implies any real, symmetric matrix has all of its eigenvalues in $R$. Here is a handy chart of the complex and real names of some important types of matrices:

$M_{n\times n}(C)$:                               $M_{n\times n}(R)$:
$A$ unitary: $A^*A = I_n$                         $A$ orthogonal: $A^tA = I_n$
$A$ Hermitian: $A = A^*$                          $A$ symmetric: $A = A^t$
$A$ skew-Hermitian: $A^* = -A$                    $A$ skew-symmetric: $A^t = -A$
These are all special cases of normal matrices.
Definition 41. Let $A \in M_{n\times n}(C)$. $A$ is normal if $AA^* = A^*A$.
Theorem 42. Schur:

1. Let $A \in M_{n\times n}(R)$ such that $\sigma_C(A) \subseteq R$. Then there exists an orthogonal matrix $P$ such that $P^tAP$ is upper triangular.
2. Let $A \in M_{n\times n}(C)$. There exists a unitary matrix $P$ such that $P^*AP$ is upper triangular.
Notice the difference between the two theorems. If $F = C$, there are no hypotheses on $A$. Any (square) matrix is unitarily similar to an upper-triangular matrix. The corresponding theorem for real matrices cannot be true. The matrix

$$A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \in M_{2\times 2}(R)$$
is not similar to any upper-triangular matrix since $\sigma_R(A) = \emptyset$. However, if all eigenvalues of $A$ (in $C$) are in fact real numbers, then 1 implies $A$ is orthogonally similar to an upper-triangular matrix. For example, Theorems 41 and 42(1) imply that any symmetric matrix $A \in M_{n\times n}(R)$ is orthogonally similar to a diagonal matrix. In fact, more is true.
Theorem 43. Let $A \in M_{n\times n}(C)$. $A$ is normal if and only if there exists a unitary matrix $P$ such that $P^*AP$ is diagonal.

In particular, Hermitian and skew-Hermitian matrices are unitarily similar to diagonal matrices.
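A numerical sketch of Theorem 43 for a Hermitian (hence normal) matrix, using numpy.linalg.eigh, which returns a unitary matrix of eigenvectors (illustrative code with our own test matrix):

```python
import numpy as np

A = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])          # A = A*: Hermitian, hence normal
print(np.allclose(A, A.conj().T))           # True

d, P = np.linalg.eigh(A)                    # P unitary, d real (Theorem 41)
print(d)                                    # real eigenvalues
print(np.allclose(P.conj().T @ A @ P, np.diag(d)))  # True: P*AP is diagonal
```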
We conclude this section with an easy application of
Theorem 43.
Theorem 44. Let $A \in M_{n\times n}(C)$ be a normal matrix.

1. $A$ is Hermitian if and only if $\sigma_C(A) \subseteq R$.
2. $A$ is unitary if and only if $\sigma_C(A) \subseteq \{z \in C : |z| = 1\}$.
3.5 FURTHER READING
This chapter consists of definitions and theorems that would normally be found in a junior level course in linear algebra. For more advanced courses the reader could try Brown [2] or Greub [5]. For an introduction to the theory of matrices over arbitrary commutative rings, see Brown [3]. For a basic treatment of numerical results, see Cullen [4]. For a more advanced level treatment of numerical results, see Demmel [6].
REFERENCES
1. WC Brown. Matrices and Vector Spaces. Pure and Applied Mathematics, vol 145. New York: Marcel Dekker, 1991.
2. WC Brown. A Second Course in Linear Algebra. New York: John Wiley & Sons, 1988.
3. WC Brown. Matrices Over Commutative Rings. Pure and Applied Mathematics, vol 169. New York: Marcel Dekker, 1993.
4. CG Cullen. An Introduction to Numerical Linear
Algebra. Boston: PWS Publishing, 1994.
5. W Greub. Linear Algebra. Graduate Texts in
Mathematics, vol 23, 4th ed. New York: Springer-
Verlag, 1981.
6. J Demmel. Numerical Linear Algebra. Berkeley
Mathematics Lecture Notes, vol 1, University of
California, Berkeley, CA, 1993.
Chapter 1.4
A Review of Calculus
Angelo B. Mingarelli
School of Mathematics and Statistics, Carleton University, Ottawa, Ontario, Canada
4.1 FUNCTIONS, LIMITS, AND CONTINUITY
4.1.1 Functions and Their Properties
A function is a rule which associates with each object of one set, called the domain [denoted by the symbol $\operatorname{Dom}(f)$], a single object $f(x)$ from a second set called the range [denoted by the symbol $\operatorname{Ran}(f)$]. All functions will be real valued in this chapter. This means that their range is always a subset of the set of all real numbers, while their domain is always some interval.
We recall the notation for intervals; the symbol $(a, b)$ denotes the set of points $\{x : a < x < b\}$, and this is called an open interval, while $[a, b]$ represents the set $\{x : a \leq x \leq b\}$, which is called a closed interval. On the other hand, the symbols $(a, b]$, $[a, b)$ each denote the sets $\{x : a < x \leq b\}$ and $\{x : a \leq x < b\}$, respectively (either one of these is called a semiopen interval). The rules $f(x) = x^3$, $g(x) = \cos x$, $h(x) = \sqrt{x}$ are various examples of functions, with $h(x)$ being defined only when $x \geq 0$. The sum of two functions $f$, $g$, say, is defined by the rule $(f + g)(x) = f(x) + g(x)$, with a similar definition being applied to the difference. The operation known as the product of two functions $f$, $g$, say, is now defined by the rule $(fg)(x) = f(x)g(x)$. For example, with $f$, $g$ as above, their sum $(f + g)(x) = x^3 + \cos x$, whereas their product $(fg)(x) = x^3\cos x$. The quotient of two functions is only defined when the denominator is nonzero. In general, $(f/g)(x) = f(x)/g(x)$ represents the quotient of $f$, $g$, while in our case, $(f/g)(x) = x^3\sec x$, which is only defined when $\cos x \neq 0$. When $c$ is a constant (a real number), the symbol $cf$ is defined by $(cf)(x) = cf(x)$. In particular, the identity function, denoted by the symbol ``1,'' is defined by the rule $1(x) = x$. An important function in calculus is the so-called absolute value function; it is defined by the rule: $|x| = x$ if $x \geq 0$, while, if $x < 0$, $|x| = -x$. In either case, the absolute value of a number is that same number (if it is positive) or the original unsigned number (with its minus sign changed to a plus sign). Thus, $|-5| = -(-5) = 5$, while $|3.45| = 3.45$. When using square roots we will always take it that $\sqrt{x^2} = |x|$, for any $x$.
Another operation which is available on two specified functions is that of composition. We recall this notion here: given two functions $f$, $g$ where the range of $g$ is contained in the domain of $f$, we define the composition of $f$ and $g$, denoted by the symbol $f \circ g$, whose values are given by $(f \circ g)(x) = f(g(x))$. As an example, let $f(x) = x^2 + 1$, $g(x) = x - 1$. Then $(f \circ g)(x) = f(g(x)) = g(x)^2 + 1 = (x - 1)^2 + 1$. On the other hand, $(g \circ f)(x) = g(f(x)) = f(x) - 1 = x^2$ and this shows that the operation of composition is not commutative, that is, $(g \circ f)(x) \neq (f \circ g)(x)$, in general.
Let $f$, $F$ be two given functions with domains $\operatorname{Dom}(f)$, $\operatorname{Dom}(F)$, and ranges $\operatorname{Ran}(f)$, $\operatorname{Ran}(F)$. We say that $f$ (resp. $F$) is the inverse function of $F$ (resp. $f$) if both their compositions give the identity function, that is, if $(f \circ F)(x) = (F \circ f)(x) = x$ [and, as is usual, $\operatorname{Dom}(f) = \operatorname{Ran}(F)$ and $\operatorname{Dom}(F) = \operatorname{Ran}(f)$].
Sometimes this relation is written as $(f \circ f^{-1})(x) = (f^{-1} \circ f)(x) = x$. For instance, the functions $f$, $F$ defined by the rules $f(x) = x^2$ (restricted to $x \geq 0$) and $F(x) = \sqrt{x}$ are inverses of one another because their composition is the identity function. In order that two functions $f$, $F$ be inverses of one another it is necessary that each function be one-to-one on their respective domains. This means that the only solution of the equation $f(x) = f(y)$ [resp. $F(x) = F(y)$] is the solution $x = y$, whenever $x$, $y$ are in $\operatorname{Dom}(f)$ [resp. $\operatorname{Dom}(F)$]. The simplest geometrical test for deciding whether a given function is one-to-one is the so-called horizontal line test. Basically, one looks at the graph of the given function on the $xy$-plane, and if every horizontal line through the range of the function intersects the graph at only one point, then the function is one-to-one and so it has an inverse function. The graph of the inverse function is obtained by reflecting the graph of the original function in the $xy$-plane about the line $y = x$.
At this point we introduce the notion of the inverse of a trigonometric function. The graphical properties of the sine function indicate that it has an inverse when $\operatorname{Dom}(\sin) = [-\pi/2, \pi/2]$. Its inverse is called the arcsine function and it is defined for $-1 \leq x \leq 1$ by the rule that $y = \arcsin x$ means that $y$ is an angle whose sine is $x$. Thus $\arcsin(1) = \pi/2$, since $\sin(\pi/2) = 1$. The cosine function with $\operatorname{Dom}(\cos) = [0, \pi]$ has an inverse called the arccosine function, also defined for $-1 \leq x \leq 1$, whose rule is given by $y = \arccos x$, which means that $y$ is an angle whose cosine is $x$. Thus, $\arccos(1) = 0$, since $\cos(0) = 1$. Finally, the tangent function defined on $(-\pi/2, \pi/2)$ has an inverse called the arctangent function defined on the interval $(-\infty, \infty)$ by the statement that $y = \arctan x$ only when $y$ is an angle in $(-\pi/2, \pi/2)$ whose tangent is $x$. In particular, $\arctan(1) = \pi/4$, since $\tan(\pi/4) = 1$. The remaining inverse trigonometric functions can be defined by the relations $y = \operatorname{arccot} x$, the arccotangent function, only when $y$ is an angle in $(0, \pi)$ whose cotangent is $x$ (and $x$ is in $(-\infty, \infty)$). In particular, $\operatorname{arccot}(0) = \pi/2$, since $\cot(\pi/2) = 0$. Furthermore, $y = \operatorname{arcsec} x$, the arcsecant function, only when $y$ is an angle in $[0, \pi]$, different from $\pi/2$, whose secant is $x$ (and $x$ is outside the open interval $(-1, 1)$). In particular, $\operatorname{arcsec}(1) = 0$, since $\sec 0 = 1$. Finally, $y = \operatorname{arccsc} x$, the arccosecant function, only when $y$ is an angle in $[-\pi/2, \pi/2]$, different from 0, whose cosecant is $x$ (and $x$ is outside the open interval $(-1, 1)$). In particular, $\operatorname{arccsc}(1) = \pi/2$, since $\csc(\pi/2) = 1$. Moreover,

$$\sin(\arcsin x) = x,\ -1 \leq x \leq 1 \qquad \arcsin(\sin x) = x,\ -\pi/2 \leq x \leq \pi/2 \tag{1}$$
$$\cos(\arccos x) = x,\ -1 \leq x \leq 1 \qquad \arccos(\cos x) = x,\ 0 \leq x \leq \pi \tag{2}$$
$$\tan(\arctan x) = x,\ -\infty < x < \infty \qquad \arctan(\tan x) = x,\ -\pi/2 < x < \pi/2 \tag{3}$$
$$\cot(\operatorname{arccot} x) = x,\ -\infty < x < \infty \qquad \operatorname{arccot}(\cot x) = x,\ 0 < x < \pi \tag{4}$$
$$\sec(\operatorname{arcsec} x) = x,\ |x| \geq 1 \qquad \operatorname{arcsec}(\sec x) = x,\ 0 \leq x \leq \pi,\ x \neq \pi/2 \tag{5}$$
$$\csc(\operatorname{arccsc} x) = x,\ |x| \geq 1 \qquad \operatorname{arccsc}(\csc x) = x,\ -\pi/2 \leq x \leq \pi/2,\ x \neq 0 \tag{6}$$
$$\arccos x + \arcsin x = \pi/2,\ -1 \leq x \leq 1 \tag{7}$$
$$\operatorname{arccot} x + \arctan x = \pi/2,\ -\infty < x < \infty \tag{8}$$
$$\operatorname{arcsec} x + \operatorname{arccsc} x = \pi/2,\ |x| \geq 1 \tag{9}$$
$$\sin(\arccos x) = \cos(\arcsin x) = \sqrt{1 - x^2},\ -1 \leq x \leq 1 \tag{10}$$
Note that other notations for an inverse function include the symbol $f^{-1}$ for the inverse function of $f$, whenever it exists. This is not to be confused with the reciprocal function. We used $F$ in this section, and $\arcsin x$ instead of $\sin^{-1} x$, in order to avoid this possible confusion.
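The identities above are easy to spot-check numerically. A small sketch using Python's math module (where the inverse functions are named asin, acos, atan):

```python
import math

x = 0.3
print(math.asin(x) + math.acos(x))   # 1.5707963...: pi/2, as in Eq. (7)
print(math.sin(math.acos(x)))        # 0.95393920...
print(math.sqrt(1 - x**2))           # same value, as in Eq. (10)
print(math.atan(math.tan(0.4)))      # 0.4: valid since 0.4 is in (-pi/2, pi/2)
```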
A relation between two variables, say $x$, $y$, is said to be an implicit relation if there is an equation connecting the two variables which forms the locus of a set of points on the $xy$-plane which may be a self-intersecting curve. For example, the locus of points defined by the implicit relation $x^2 + y^2 - 9 = 0$ forms a circle of radius equal to 3. We can then isolate one of the variables $x$ or $y$, say $x$, call it an independent variable and then have, in some cases, $y$ being a function of $x$ ($y$ then is called a dependent variable, because the value of $y$ depends on the actual value of $x$ chosen). When this happens we say that $y$ is defined implicitly as a function of $x$ or $y$ is an implicit function of $x$. In Sec. 4.2.1 we will use the chain rule for derivatives to find the derivative of an implicit function.