Chapter A
Preliminaries of Real Analysis
A principal objective of this largely rudimentary chapter is to introduce the basic
set-theoretical nomenclature that we adopt throughout the text. We start with an
intuitive discussion of the notion of “set,” and then introduce the basic operations
on sets, Cartesian products, and binary relations. After a quick excursion to order
theory (in which the only relatively advanced topic that we cover is the completion
of a partial order), functions are introduced as special cases of binary relations, and
sequences as special cases of functions. Our coverage of abstract set theory concludes
with a brief discussion of the Axiom of Choice, and the proof of Szpilrajn’s Theorem
on the completion of a partial order.
We assume here that the reader is familiar with the elementary properties of
the real numbers, and thus provide only a heuristic discussion of the basic number
systems. No construction for the integers is given, in particular. After a short
elaboration on ordered fields and the Completeness Axiom, we note without proof that the
rational numbers form an ordered field and the real numbers a complete ordered field.
The related discussion is intended to be read more quickly than anywhere else in the
text.
We next turn to real sequences. These we discuss relatively thoroughly because
of the important role they play in real analysis. In particular, even though our
coverage will serve only as a review for most readers, we study here the monotonic
sequences and subsequential limits with some care, and prove a few useful results like
the Bolzano-Weierstrass Theorem and Dirichlet’s Rearrangement Theorem. These
results will be used freely in the remainder of the text.
The final section of the chapter is nothing more than a swift refresher on the
analysis of real functions. First we recall some basic definitions, and then, very
quickly, go over the concepts of limits and continuity of real functions defined on
the real line. We then review the elementary theory of differentiation for
single-variable functions, though mostly through exercises. The primer we present on
Riemann integration is a bit more leisurely. In particular, we give a complete proof
of the Fundamental Theorem of Calculus, which is used freely in the remainder of the
text. We invoke our calculus review to outline a basic analysis of exponential and
logarithmic real functions. These maps are used in many examples throughout the
text. The chapter concludes with a brief discussion of the theory of concave functions
on the real line.
1 Elements of Set Theory
1.1 Sets
Intuitively speaking, a “set” is a collection of objects.¹ The distinguishing feature of
a “set” is that while it may contain numerous objects, it is nevertheless conceived
as a single entity. In the words of Georg Cantor, the great founder of abstract set
theory, “a set is a Many which allows itself to be thought of as a One.” It is amazing
how much follows from this simple idea.
The objects that a set S contains are called the “elements” (or “members”) of
S. Clearly, to know S, it is necessary and sufficient to know all elements of S. The
principal concept of set theory is, then, the relation of “being an element/member
of.” The universally accepted symbol for this relation is ∈, that is, x ∈ S (or S ∋ x)
means that “x is an element of S” (also read “x is a member of S,” or “x is contained
in S,” or “x belongs to S,” or “x is in S,” or “S includes x,” etc.). We often write
x, y ∈ S to denote that both x ∈ S and y ∈ S hold. For any natural number m,
a statement like x_1, …, x_m ∈ S (or equivalently, x_i ∈ S, i = 1, …, m) is understood
analogously. If x ∈ S is a false statement, then we write x ∉ S, and read “x is not
an element of S.”
If the sets A and B have exactly the same elements, that is, x ∈ A iff x ∈ B, then
we say that A and B are identical, and write A = B; otherwise we write A ≠ B.²
(So, for instance, {x, y} = {y, x}, {x, x} = {x}, and {{x}} ≠ {x}.) If every member
of A is also a member of B, then we say that A is a subset of B (also read “A is a
set in B,” or “A is contained in B”) and write A ⊆ B (or B ⊇ A). Clearly, A = B
holds iff both A ⊆ B and B ⊆ A hold. If A ⊆ B but A ≠ B, then A is said to be a
proper subset of B, and we denote this situation by writing A ⊂ B (or B ⊃ A).
For any set S that contains finitely many elements (in which case we say S is
finite), we denote by |S| the total number of elements that S contains, and refer to
this number as the cardinality of S. We say that S is a singleton if |S| = 1. If S
contains infinitely many elements (in which case we say S is infinite), then we write
|S| = ∞. Obviously, we have |A| ≤ |B| whenever A ⊆ B, and if A ⊂ B and |A| < ∞,
then |A| < |B|.
We sometimes specify a set by enumerating its elements. For instance, {x, y, z}
is the set that consists of the objects x, y and z. The contents of the sets {x_1, …, x_m}
and {x_1, x_2, …} are similarly described. For example, the set N of positive integers
can be written as {1, 2, …}. Alternatively, one may describe a set S as a collection of
all objects x that satisfy a given property P. If P(x) stands for the (logical) statement
“x satisfies the property P,” then we can write S = {x : P(x) is a true statement}
or simply S = {x : P(x)}. If A is a set and B is the set that contains all elements x
of A such that P(x) is true, we write B = {x ∈ A : P(x)}. For instance, where R is
the set of all real numbers, the collection of all real numbers greater than or equal to
3 can be written as {x ∈ R : x ≥ 3}.

¹ The notion of an “object” is left undefined, that is, it can be given any meaning. All we demand
of our “objects” is that they be logically distinguishable. That is, if x and y are two objects, x = y
and x ≠ y cannot hold simultaneously, and the statement “either x = y or x ≠ y” is a tautology.
² Reminder. iff = if and only if.
The symbol ∅ denotes the empty set, that is, the set that contains no elements
(i.e. |∅| = 0). Formally speaking, we can define ∅ as the set {x : x ≠ x}; for this
description entails that x ∈ ∅ is a false statement for any object x. Consequently, we
write
∅ := {x : x ≠ x},
meaning that the symbol on the left hand side is defined by that on the right hand
side.³ Clearly, we have ∅ ⊆ S for any set S, which, in particular, implies that ∅ is
unique. (Why?) If S ≠ ∅, we say that S is nonempty. For instance, {∅} is a
nonempty set. Indeed, {∅} ≠ ∅; the former, after all, is a set of sets that contains
the empty set, while ∅ contains nothing. (An empty box is not the same thing as
nothing!)
We define the class of all subsets of a given set S as
2^S := {T : T ⊆ S},
which is called the power set of S. (The choice of notation is motivated by the fact
that the power set of a set that contains m elements has exactly 2^m elements.) For
instance, 2^∅ = {∅}, 2^{2^∅} = {∅, {∅}}, and 2^{2^{2^∅}} = {∅, {∅}, {{∅}}, {∅, {∅}}}, and so on.
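For finite sets, the iterated power sets above can be computed directly. The following is a minimal sketch in Python; the helper name power_set is ours (not part of the text), and sets are modeled with frozenset so that they can themselves be elements of other sets.

```python
from itertools import chain, combinations

def power_set(s):
    """Return the power set of a finite set as a set of frozensets."""
    items = list(s)
    # All tuples of length 0, 1, ..., |S| drawn from the elements of S.
    tuples = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(t) for t in tuples}

empty = frozenset()
p1 = power_set(empty)   # the power set of the empty set
p2 = power_set(p1)      # the power set of p1
p3 = power_set(p2)      # the power set of p2
```

Here len(p1), len(p2) and len(p3) equal 1, 2 and 4, in line with the count 2^m for a set with m elements.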
Notation. Throughout this text, the class of all nonempty finite subsets of any
given set S is denoted by 𝒫(S), that is,
𝒫(S) := {T : T ⊆ S and 0 < |T| < ∞}.
Of course, if S is finite, then 𝒫(S) = 2^S \ {∅}.
Given any two sets A and B, by A ∪ B we mean the set {x : x ∈ A or x ∈ B},
which is called the union of A and B. The intersection of A and B, denoted as
A ∩ B, is defined as the set {x : x ∈ A and x ∈ B}. If A ∩ B = ∅, we say that A and
B are disjoint. Obviously, if A ⊆ B, then A ∪ B = B and A ∩ B = A. In particular,
∅ ∪ S = S and ∅ ∩ S = ∅ for any set S.
Taking unions and intersections are commutative operations in the sense that
A ∩ B = B ∩ A and A ∪ B = B ∪ A
for any sets A and B. They are also associative, that is,
A ∩ (B ∩ C) = (A ∩ B) ∩ C and A ∪ (B ∪ C) = (A ∪ B) ∪ C,
and distributive, that is,
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) and A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C),
for any sets A, B and C.

³ Recall my notational convention: For any symbols ♣ and ♥, either one of the expressions ♣
:= ♥ and ♥ =: ♣ means that ♣ is defined by ♥.
Exercise 1. Prove the commutative, associative and distributive laws of set theory
stated above.
Exercise 2. Given any two sets A and B, by A\B (the difference between A and
B) we mean the set {x : x ∈ A and x ∉ B}.
(a) Show that S\∅ = S, S\S = ∅, and ∅\S = ∅ for any set S.
(b) Show that A\B = B\A iff A = B for any sets A and B.
(c) (De Morgan Laws) Prove: For any sets A, B and C,
A\(B ∪ C) = (A\B) ∩ (A\C) and A\(B ∩ C) = (A\B) ∪ (A\C).
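Before attempting the proofs, it may help to see the laws of Exercises 1 and 2 hold on a small example. The sketch below checks them by brute force over all subsets of {0, 1, 2}; the names subsets, laws_hold and all_ok are ours. Such a finite check is only a sanity test, of course, and not a substitute for a proof.

```python
from itertools import product

# All eight subsets of the three-element set {0, 1, 2}.
subsets = [set(), {0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}, {0, 1, 2}]

def laws_hold(A, B, C):
    """Check the commutative, associative, distributive and De Morgan laws."""
    return all([
        A & B == B & A,
        A | B == B | A,
        A & (B & C) == (A & B) & C,
        A | (B | C) == (A | B) | C,
        A & (B | C) == (A & B) | (A & C),
        A | (B & C) == (A | B) & (A | C),
        A - (B | C) == (A - B) & (A - C),   # De Morgan
        A - (B & C) == (A - B) | (A - C),   # De Morgan
    ])

all_ok = all(laws_hold(A, B, C) for A, B, C in product(subsets, repeat=3))
```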
Throughout this text we use the terms “class” or “family” only to refer to a
nonempty collection of sets. So if 𝒜 is a class, we understand that 𝒜 ≠ ∅ and that
any member A ∈ 𝒜 is a set (which may itself be a collection of sets). The union of
all members of this class, denoted as ⋃𝒜, or ⋃{A : A ∈ 𝒜}, or ⋃_{A∈𝒜} A, is defined
as the set {x : x ∈ A for some A ∈ 𝒜}. Similarly, the intersection of all sets in 𝒜,
denoted as ⋂𝒜, or ⋂{A : A ∈ 𝒜}, or ⋂_{A∈𝒜} A, is defined as the set {x : x ∈ A for
each A ∈ 𝒜}.
A common way of specifying a class 𝒜 of sets is by designating a set I as a set
of indices, and by defining 𝒜 := {A_i : i ∈ I}. In this case, ⋃𝒜 may be denoted as
⋃_{i∈I} A_i. If I = {k, k+1, …, K} for some integers k and K with k < K, then we often
write ⋃_{i=k}^{K} A_i (or A_k ∪ ··· ∪ A_K) for ⋃_{i∈I} A_i. Similarly, if I = {k, k+1, …} for some
integer k, then we may write ⋃_{i=k}^{∞} A_i (or A_k ∪ A_{k+1} ∪ ···) for ⋃_{i∈I} A_i. Furthermore, for
brevity, we frequently denote ⋃_{i=1}^{K} A_i as ⋃^{K} A_i, and ⋃_{i=1}^{∞} A_i as ⋃^{∞} A_i, throughout
the text. Similar notational conventions apply to intersections of sets as well.
Warning. The symbols ⋃∅ and ⋂∅ are left undefined (much the same way the
symbol 0/0 is undefined in number theory).
Exercise 3. Let A be a set and ℬ a class of sets. Prove that
A ∩ ⋃ℬ = ⋃{A ∩ B : B ∈ ℬ} and A ∪ ⋂ℬ = ⋂{A ∪ B : B ∈ ℬ},
while
A\⋃ℬ = ⋂{A\B : B ∈ ℬ} and A\⋂ℬ = ⋃{A\B : B ∈ ℬ}.
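The union and intersection of a finite class of finite sets can be sketched in Python as follows. The names big_union and big_intersection are ours, and a class is modeled as a list of frozensets. In keeping with the Warning above, both helpers fail (with a TypeError) on an empty class instead of returning a value.

```python
from functools import reduce

def big_union(family):
    """Union of all members of a nonempty class of frozensets."""
    return reduce(frozenset.union, family)

def big_intersection(family):
    """Intersection of all members of a nonempty class of frozensets."""
    return reduce(frozenset.intersection, family)

# A small instance of the four identities of Exercise 3.
A = frozenset({1, 2, 3})
B_class = [frozenset({1, 2}), frozenset({2, 3}), frozenset({2, 4})]

identity_1 = A & big_union(B_class) == big_union([A & B for B in B_class])
identity_2 = A | big_intersection(B_class) == big_intersection([A | B for B in B_class])
identity_3 = A - big_union(B_class) == big_intersection([A - B for B in B_class])
identity_4 = A - big_intersection(B_class) == big_union([A - B for B in B_class])
```

All four flags evaluate to True for this particular choice of A and B_class, which illustrates the identities but does not prove them.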
A word of caution may be in order before we proceed further. While duly intuitive,
the “set theory” we outlined so far provides us with no demarcation criterion for
identifying what exactly constitutes a “set.” This may suggest that one is completely
free in deeming any given collection of objects as a “set.” But, in fact, this would be
a pretty bad idea that would entail serious foundational difficulties. The best known
example of such difficulties was given by Bertrand Russell in 1902 when he asked if
the set of all objects that are not members of themselves is a set: Is S := {x : x ∉ x}
a set?⁴ There is nothing in our intuitive discussion above that forces us to conclude
that S is not a set; it is a collection of objects (sets in this case) which is considered
as a single entity. But we cannot accept S as a “set,” for if we do, we have to be able
to answer the question: Is S ∈ S? If the answer is yes, then S ∈ S, but this implies
S ∉ S by definition of S. If the answer is no, then S ∉ S, but this implies S ∈ S
by definition of S. That is, we have a contradictory state of affairs no matter what!
This is the so-called Russell’s paradox which started a severe foundational crisis for
mathematics that eventually led to a complete axiomatization of set theory in the
early twentieth century.⁵
Roughly speaking, this paradox would arise only if we allowed “unduly large”
collections to be qualified as “sets.” In particular, it will not cause any harm for
the mathematical analysis that will concern us here, precisely because in all of our
discussions, we will fix a universal set of objects, say X, and consider sets like {x ∈
X : P(x)}, where P(x) is an unambiguous logical statement in terms of x. (We will
also have occasion to work with sets of such sets, and sets of sets of such sets, and so
on.) Once such a domain X is fixed, Russell’s paradox cannot arise. Why, you may
ask, can’t we have the same problem with the set S := {x ∈ X : x ∉ x}? No, because
now we can answer the question “Is S ∈ S?”. The answer is no! The statement S ∈ S
is false, simply because S ∉ X. (For, if S ∈ X was the case, then we would end up
with the contradiction S ∈ S iff S ∉ S.)
So when the context is clear (that is, when a universe of objects is fixed), and when
we define our sets as just explained, Russell’s paradox will not be a threat against
the resulting set theory. But can there be any other paradoxes? Well, there is really
not an easy answer to this. To even discuss the matter unambiguously, we must leave
our intuitive understanding of the notion of “set,” and address the problem through
a completely axiomatic approach (in which we would leave the expression “x ∈ S”
as undefined, and give meaning to it only through axioms). This is, of course, not
at all the place to do this. Moreover, the “intuitive” set theory that we covered
here is more than enough for the mathematical analysis to come. We thus leave this
topic by referring the reader who wishes to get a broader introduction to abstract
set theory to Chapter 1 of Schechter (1997) or Marek and Mycielski (2001); both of
these expositions provide nice introductory overviews of axiomatic set theory. If you
want to dig deeper, then try the first three chapters of Enderton (1977).

⁴ While a bit unorthodox, x ∈ x may well be a statement that is true for some objects. For
instance, the collection of all sets that I have mentioned in my life, say x, is a set that I have just
mentioned, so x ∈ x. But the collection of all cheesecakes I have eaten in my life, say y, is not a
cheesecake, so y ∉ y.
⁵ Russell’s paradox is a classical example of the dangers of using self-referential statements
carelessly. Another example of this form is the ancient paradox of the liar: “Everything I say is false.”
This statement can be declared neither true nor false! To get a sense of some other kinds of
paradoxes and the way axiomatic set theory avoids them, you might want to read the popular account
of Rucker (1995).
1.2 Relations
An ordered pair is an ordered list (a, b) consisting of two objects a and b. This
list is ordered in the sense that, as a defining feature of the notion of an ordered
pair, we assume the following: For any two ordered pairs (a, b) and (a′, b′), we have
(a, b) = (a′, b′) iff a = a′ and b = b′.⁶
The (Cartesian) product of two nonempty sets A and B, denoted as A × B, is
defined as the set of all ordered pairs (a, b) where a comes from A and b comes from
B. That is,
A × B := {(a, b) : a ∈ A and b ∈ B}.
As a notational convention, we often write A^2 for A × A. It is easily seen that taking
the Cartesian product of two sets is not a commutative operation. Indeed, for any two
distinct objects a and b, we have {a} × {b} = {(a, b)} ≠ {(b, a)} = {b} × {a}. Formally
speaking, it is not associative either, for (a, (b, c)) is not the same thing as ((a, b), c).
Yet there is a natural correspondence between the elements of A × (B × C) and
(A × B) × C, so one can really think of these two sets as the same, thereby rendering
the status of the set A × B × C unambiguous.⁷ This prompts us to define an
n-vector (for any natural number n) as a list (a_1, …, a_n) with the understanding that
(a_1, …, a_n) = (a′_1, …, a′_n) iff a_i = a′_i for each i = 1, …, n. The (Cartesian) product of
n sets A_1, …, A_n is then defined as
A_1 × ··· × A_n := {(a_1, …, a_n) : a_i ∈ A_i, i = 1, …, n}.
We often write ⨉^n A_i to denote A_1 × ··· × A_n, and refer to ⨉^n A_i as the n-fold product
of A_1, …, A_n. If A_i = S for each i = 1, …, n, we then write S^n for A_1 × ··· × A_n, that is,
S^n := ⨉^n S.

⁶ This defines the notion of an ordered pair as a new “primitive” for our set theory, but in fact,
this is not really necessary. One can define an ordered pair by using only the concept of “set” as
(a, b) := {{a}, {a, b}}. With this definition, which is due to Kazimierz Kuratowski, one can prove
that, for any two ordered pairs (a, b) and (a′, b′), we have (a, b) = (a′, b′) iff a = a′ and b = b′. The
“if” part of the claim is trivial. To prove the “only if” part, observe that (a, b) = (a′, b′) entails
that either {a} = {a′} or {a} = {a′, b′}. But the latter equality may hold only if a = a′ = b′, so
we have a = a′ in all contingencies. Therefore, (a, b) = (a′, b′) entails that either {a, b} = {a} or
{a, b} = {a, b′}. The latter case is possible only if b = b′, while the former possibility arises only
if a = b. But if a = b, then we have {{a}} = (a, b) = (a, b′) = {{a}, {a, b′}} which holds only if
{a} = {a, b′}, that is, b = a = b′.
Quiz. (Wiener) Show that we would also have (a, b) = (a′, b′) iff a = a′ and b = b′, if we instead
defined (a, b) as {{∅, {a}}, {{b}}}.
⁷ What is this “natural” correspondence?
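The failure of commutativity for Cartesian products, and Kuratowski's set-theoretic encoding of an ordered pair, can both be illustrated in a few lines. This is only a sketch: the function name kuratowski_pair and the sample sets are ours, and frozenset stands in for the notion of “set.”

```python
from itertools import product

def kuratowski_pair(a, b):
    """Encode the ordered pair (a, b) as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

A, B = {1, 2}, {"x", "y"}
AxB = set(product(A, B))   # all pairs (a, b) with a in A and b in B
BxA = set(product(B, A))   # all pairs (b, a) with b in B and a in A

# Characteristic property of ordered pairs, checked on a small domain:
# two encoded pairs coincide iff their coordinates coincide.
domain = range(3)
pairs_behave = all(
    (kuratowski_pair(a, b) == kuratowski_pair(c, d)) == ((a, b) == (c, d))
    for a in domain for b in domain for c in domain for d in domain
)
```

Here AxB and BxA are different sets, and kuratowski_pair(1, 1) collapses to {{1}}, which is the case a = b treated at the end of the Kuratowski argument.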
Exercise 4. For any sets A, B, and C, prove that
A × (B ∩ C) = (A × B) ∩ (A × C) and A × (B ∪ C) = (A × B) ∪ (A × C).
Let X and Y be two nonempty sets. A subset R of X × Y is called a (binary)
relation from X to Y. If X = Y, that is, if R is a relation from X to X, we
simply say that it is a relation on X. Put differently, R is a relation on X iff
R ⊆ X^2. If (x, y) ∈ R, then we think of R as associating the object x with y, and
if {(x, y), (y, x)} ∩ R = ∅, we understand that there is no connection between x and
y as envisaged by R. In concert with this interpretation, we adopt the convention of
writing xRy instead of (x, y) ∈ R throughout this text.
Definition. A relation R on a nonempty set X is said to be reflexive if xRx for
each x ∈ X, and complete if either xRy or yRx holds for each x, y ∈ X. It is said
to be symmetric if, for any x, y ∈ X, xRy implies yRx, and antisymmetric if, for
any x, y ∈ X, xRy and yRx imply x = y. Finally, we say that R is transitive if xRy
and yRz imply xRz for any x, y, z ∈ X.
The interpretations of these properties are straightforward, so we do not elaborate
on them here. But note: While every complete relation is reflexive, there are no other
logical implications between these properties.
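For a finite set X, each of these properties can be tested mechanically when a relation is stored as a set of ordered pairs. The following sketch uses function names of our own choosing, and the two sample relations (the usual ≥ and the diagonal on {1, 2, 3}) are for illustration only.

```python
def is_reflexive(X, R):
    return all((x, x) in R for x in X)

def is_complete(X, R):
    return all((x, y) in R or (y, x) in R for x in X for y in X)

def is_symmetric(X, R):
    return all((y, x) in R for (x, y) in R)

def is_antisymmetric(X, R):
    return all(x == y for (x, y) in R if (y, x) in R)

def is_transitive(X, R):
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)

X = {1, 2, 3}
geq = {(x, y) for x in X for y in X if x >= y}   # the usual order on X
diagonal = {(x, x) for x in X}                   # the diagonal relation on X
```

On this example, geq is reflexive, complete, antisymmetric and transitive but not symmetric, while diagonal is reflexive, symmetric, antisymmetric and transitive but not complete.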
Exercise 5. Let X be a nonempty set, and R a relation on X. The inverse of R is
defined as the relation
R^{-1} := {(x, y) ∈ X^2 : yRx}.
(a) If R is symmetric, does R^{-1} have to be also symmetric? Antisymmetric? Transitive?
(b) Show that R is symmetric iff R = R^{-1}.
(c) If R_1 and R_2 are two relations on X, the composition of R_1 and R_2 is the
relation R_2 ◦ R_1 := {(x, y) ∈ X^2 : x R_1 z and z R_2 y for some z ∈ X}. Show that R
is transitive iff R ◦ R ⊆ R.
Exercise 6. A relation R on a nonempty set X is called circular if xRz and zRy
imply yRx for any x, y, z ∈ X. Prove that R is reflexive and circular iff it is reflexive,
symmetric and transitive.
Exercise 7.ᴴ Let R be a reflexive relation on a nonempty set X. The asymmetric
part of R is defined as the relation P_R on X as x P_R y iff xRy but not yRx. The
relation I_R := R\P_R on X is then called the symmetric part of R.
(a) Show that I_R is reflexive and symmetric.
(b) Show that P_R is neither reflexive nor symmetric.
(c) Show that if R is transitive, so are P_R and I_R.
Exercise 8. Let R be a relation on a nonempty set X. Let R_0 = R, and for each
positive integer m, define the relation R_m on X by x R_m y iff there exist z_1, …, z_m ∈ X
such that x R z_1, z_1 R z_2, …, z_{m−1} R z_m and z_m R y. The relation tr(R) := R_0 ∪ R_1 ∪ ···
is called the transitive closure of R. Show that tr(R) is transitive, and if R′ is a
transitive relation with R ⊆ R′, then tr(R) ⊆ R′.
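For a finite relation, the transitive closure of Exercise 8 can be computed by repeatedly composing the relation with itself until nothing new is added. The sketch below is one simple (not the most efficient) way to do this; the function names compose and transitive_closure are ours, and the composition follows the convention of Exercise 5(c).

```python
def compose(R2, R1):
    """Pairs (x, y) such that x R1 z and z R2 y for some z."""
    return {(x, y) for (x, z) in R1 for (w, y) in R2 if z == w}

def transitive_closure(R):
    """Smallest transitive relation containing the finite relation R."""
    closure = set(R)
    while True:
        bigger = closure | compose(closure, closure)
        if bigger == closure:      # nothing new: closure is transitive
            return closure
        closure = bigger

R = {(1, 2), (2, 3), (3, 4)}
tr_R = transitive_closure(R)
```

For this chain, tr_R consists of the three original pairs together with (1, 3), (2, 4) and (1, 4). The loop stops exactly when composing the current relation with itself adds no pairs, which is the criterion for transitivity stated in Exercise 5(c).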
1.3 Equivalence Relations
In mathematical analysis, one often needs to “identify” two distinct objects when
they possess a particular property of interest. Naturally, such an identification scheme
should satisfy certain consistency conditions. For instance, if x is identified with y,
then y must be identified with x. Similarly, if x and y are deemed identical, and so
are y and z, then x and z should be identified. Such considerations lead us to the
notion of equivalence relation.
Definition. A relation ∼ on a nonempty set X is called an equivalence relation
if it is reflexive, symmetric and transitive. For any x ∈ X, the equivalence class of
x relative to ∼ is defined as the set
[x]_∼ := {y ∈ X : y ∼ x}.
The class of all equivalence classes relative to ∼, denoted as X/∼, is called the
quotient set of X relative to ∼, that is,
X/∼ := {[x]_∼ : x ∈ X}.
Let X denote the set of all people in the world. “Being a sibling of” is an
equivalence relation on X (provided that we adopt the convention of saying that any person
is a sibling of himself or herself). The equivalence class of a person relative to this relation
is the set of all of his/her siblings. On the other hand, you would probably agree
that “being in love with” is not an equivalence relation on X. Here are some more
examples (that fit better with the “serious” tone of this course).
Example 1. [1] For any nonempty set X, the diagonal relation D_X := {(x, x) : x ∈
X} is the smallest equivalence relation that can be defined on X (in the sense that
if R is any other equivalence relation on X, we have D_X ⊆ R). Clearly, [x]_{D_X} = {x}
for each x ∈ X.⁸ At the other extreme is X^2 which is the largest equivalence relation
that can be defined on X. We have [x]_{X^2} = X for each x ∈ X.
[2] By Exercise 7, the symmetric part of any reflexive and transitive relation on a
nonempty set is an equivalence relation.
[3] Let X := {(a, b) : a, b ∈ {1, 2, …}}, and define the relation ∼ on X by
(a, b) ∼ (c, d) iff ad = bc. It is readily verified that ∼ is an equivalence relation
on X, and that [(a, b)]_∼ = {(c, d) ∈ X : c/d = a/b} for each (a, b) ∈ X.
[4] Let X := {…, −1, 0, 1, …} and define the relation ∼ on X by x ∼ y iff (1/2)(x − y) ∈
X. It is easily checked that ∼ is an equivalence relation on X. Moreover, for any
integer x, we have x ∼ y iff y = x − 2m for some m ∈ X, and hence [x]_∼ equals the
set of all even integers if x is even, and that of all odd integers if x is odd.

⁸ I say an equally suiting name for D_X is the “equality relation.” What do you think?
One typically uses an equivalence relation to simplify a situation in a way that
all things that are indistinguishable from a particular perspective are put together in
a set and treated as if they are a single entity. For instance, suppose that for some
reason we are interested in the (zodiac) signs of people. Then, any two individuals who are
of the same sign can be thought of as “identical,” so instead of the set of all people
in the world, we would rather work with the set of all Capricorns, all Virgos and so
on. But the set of all Capricorns is of course none other than the equivalence class of
any given Capricorn person relative to the equivalence relation of “being of the same
sign.” So when someone says “a Capricorn is …,” then one is really referring to a
whole class of people. The equivalence relation of “being of the same sign” divides
the world into twelve equivalence classes, and we can then talk “as if” there are only
twelve individuals in our context of reference.
To take another example, ask yourself how you would define the set of positive
rational numbers, given the set of natural numbers N := {1, 2, …} and the operation
of “multiplication.” Well, you may say, a positive rational number is the ratio of two
natural numbers. But wait, what is a “ratio”? Let us be a bit more careful about
this. A better way of looking at things is to say that a positive rational number
is an ordered pair (a, b) ∈ N^2, although, in daily practice, we write a/b instead of
(a, b). Yet, we don’t want to say that each ordered pair in N^2 is a distinct rational
number. (We would like to think of 1/2 and 2/4 as the same number, for instance.) So
we “identify” all those ordered pairs that we wish to associate with a single rational
number by using the equivalence relation ∼ introduced in Example 1.[3], and then
define a rational number simply as an equivalence class [(a, b)]_∼. Of course, when we
talk about rational numbers in daily practice, we simply talk of a fraction like 1/2,
not [(1, 2)]_∼, even though, formally speaking, what we really mean is [(1, 2)]_∼. The
equality 1/2 = 2/4 is obvious, precisely because the rational numbers are constructed as
equivalence classes such that (2, 4) ∈ [(1, 2)]_∼.
This discussion suggests that an equivalence relation can be used to decompose a
grand set of interest into subsets such that the members of the same subset are thought
of as “identical” while the members of distinct subsets are viewed as “distinct.” Let
us now formalize this intuition. By a partition of a nonempty set X, we mean a
class of pairwise disjoint, nonempty subsets of X whose union is X. That is, 𝒜 is a
partition of X iff 𝒜 ⊆ 2^X \ {∅}, ⋃𝒜 = X and A ∩ B = ∅ for every distinct A and B in
𝒜. The next result says that the set of equivalence classes induced by any equivalence
relation on a set is a partition of that set.
Proposition 1. For any equivalence relation ∼ on a nonempty set X, the quotient
set X/∼ is a partition of X.
Proof. Take any nonempty set X and an equivalence relation ∼ on X. Since ∼ is
reflexive, we have x ∈ [x]_∼ for each x ∈ X. Thus any member of X/∼ is nonempty,
and ⋃{[x]_∼ : x ∈ X} = X. Now suppose that [x]_∼ ∩ [y]_∼ ≠ ∅ for some x, y ∈ X.
We wish to show that [x]_∼ = [y]_∼. Observe first that [x]_∼ ∩ [y]_∼ ≠ ∅ implies x ∼ y.
(Indeed, if z ∈ [x]_∼ ∩ [y]_∼, then x ∼ z and z ∼ y by symmetry of ∼, so we get x ∼ y
by transitivity of ∼.) This implies that [x]_∼ ⊆ [y]_∼, because if w ∈ [x]_∼, then w ∼ x
(by definition of [x]_∼), and hence w ∼ y by transitivity of ∼. The converse containment
is proved analogously.
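Proposition 1 can be observed concretely on finite sets. The sketch below computes quotient sets for finite truncations of Example 1.[3] and Example 1.[4] and checks that each is a partition. The function names, the bounds on the ground sets, and the predicates same_parity and same_ratio are our own choices for illustration.

```python
def equivalence_class(X, related, x):
    """The class of x: all y in X with y related to x."""
    return frozenset(y for y in X if related(y, x))

def quotient_set(X, related):
    """The set of all equivalence classes of X under `related`."""
    return {equivalence_class(X, related, x) for x in X}

def is_partition(X, family):
    """Nonempty, pairwise disjoint members whose union is X."""
    family = list(family)
    nonempty = all(len(A) > 0 for A in family)
    covers = set().union(*family) == set(X)
    disjoint = all(A.isdisjoint(B) for A in family for B in family if A != B)
    return nonempty and covers and disjoint

# Example 1.[4], restricted to the integers -4, ..., 4.
integers = range(-4, 5)
same_parity = lambda x, y: (x - y) % 2 == 0
parity_classes = quotient_set(integers, same_parity)

# Example 1.[3], restricted to pairs (a, b) with 1 <= a, b <= 4.
pairs = [(a, b) for a in range(1, 5) for b in range(1, 5)]
same_ratio = lambda p, q: p[0] * q[1] == p[1] * q[0]   # ad = bc
ratio_classes = quotient_set(pairs, same_ratio)
```

The first quotient set has two members (the even and the odd integers in the range), and in the second, the pair (2, 4) lies in the class of (1, 2), as in the discussion of rational numbers above.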
The following exercise shows that the converse of Proposition 1 also holds. Thus
the notions of equivalence relation and partition are really two different ways of
looking at the same thing.
Exercise 9. Let 𝒜 be a partition of a nonempty set X, and consider the relation ∼ on
X defined by x ∼ y iff {x, y} ⊆ A for some A ∈ 𝒜. Prove that ∼ is an equivalence
relation on X.
1.4 Order Relations
The transitivity property is the defining feature of any order relation. Such relations
are given various names depending on the properties they possess in addition to
transitivity.
Definition. A relation ≽ on a nonempty set X is called a preorder on X if it is
transitive and reflexive. It is said to be a partial order on X if it is an antisymmetric
preorder on X. Finally, ≽ is called a linear order on X if it is a partial order on X
which is complete.
By a preordered set we mean a list (X, ≽) where X is a nonempty set and ≽
is a preorder on X. If ≽ is a partial order on X, then (X, ≽) is called a poset (short
for partially ordered set), and if ≽ is a linear order on X, then (X, ≽) is called either
a chain or a loset (short for linearly ordered set).
While a preordered set (X, ≽) is not a set, it is convenient to talk as if it is
a set when referring to properties that apply only to X. For instance, by a “finite
preordered set,” we understand a preordered set (X, ≽) with |X| < ∞. Or, when we
say that Y is a subset of the preordered set (X, ≽), we mean simply that Y ⊆ X. A
similar convention applies to posets and losets as well.
Notation. Let (X, ≽) be a preordered set. Unless otherwise stated explicitly, we
denote by ≻ the asymmetric part of ≽, and by ∼ the symmetric part of ≽ (Exercise
7).
The main distinction between a preorder and a partial order is that the former
may have a large symmetric part, while the symmetric part of the latter must equal
the diagonal relation. As we shall see, however, in most applications this distinction
is immaterial.
Example 2. [1] For any nonempty set X, the diagonal relation D_X := {(x, x) : x ∈
X} is a partial order on X. In fact, this relation is the only partial order on X which
is also an equivalence relation. (Why?) The relation X^2 is, on the other hand, a
complete preorder, which is not antisymmetric unless X is a singleton.
[2] For any nonempty set X, the equality relation = and the subsethood relation
⊇ are partial orders on 2^X. The equality relation is not linear, and ⊇ is not linear
unless X is a singleton.
[3] (R^n, ≥) is a poset for any positive integer n, where ≥ is defined coordinatewise,
that is, (x_1, …, x_n) ≥ (y_1, …, y_n) iff x_i ≥ y_i for each i = 1, …, n. When we talk of R^n
without specifying explicitly an alternative order, we always have in mind this partial
order (which is sometimes called the natural (or canonical) order of R^n). Of
course, (R, ≥) is a loset.
[4] Take any positive integer n, and preordered sets (X_i, ≽_i), i = 1, …, n. The
product of the preordered sets (X_i, ≽_i), denoted as ⊠^n (X_i, ≽_i), is the preordered set
(X, ≽) with X := ⨉^n X_i and
(x_1, …, x_n) ≽ (y_1, …, y_n) iff x_i ≽_i y_i for all i = 1, …, n.
In particular, (R^n, ≥) = ⊠^n (R, ≥).
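The coordinatewise order of Example 2.[3] is easy to experiment with. In the sketch below, vectors are modeled as Python tuples of equal length and the function name geq is ours.

```python
def geq(x, y):
    """Coordinatewise comparison: x_i >= y_i for every coordinate i."""
    return len(x) == len(y) and all(xi >= yi for xi, yi in zip(x, y))

comparable = geq((2, 3), (1, 3))
incomparable = not geq((1, 2), (2, 1)) and not geq((2, 1), (1, 2))
```

The vectors (1, 2) and (2, 1) cannot be ranked in either direction, which shows that the natural order of R^n fails to be complete (and hence is not linear) as soon as n ≥ 2.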
Example 3. In individual choice theory, a preference relation ≽ on a nonempty
alternative set X is defined as a preorder on X. Here the reflexivity is a trivial
condition to require, and transitivity is viewed as a fundamental rationality postulate.
(More on this in Section B.4.) The strict preference relation ≻ is defined as the
asymmetric part of ≽ (Exercise 7). This relation is transitive but not reflexive. The
indifference relation ∼ is then defined as the symmetric part of ≽, and is easily
checked to be an equivalence relation on X. For any x ∈ X, the equivalence class [x]_∼
is called in this context the indifference class of x, and is simply a generalization of
the familiar concept of “the indifference curve that passes through x.” In particular,
Proposition 1 says that no two distinct indifference sets can have a point in common.
(This is the gist of the fact that “distinct indifference curves cannot cross!”)
In social choice theory, one often works with multiple (complete) preference
relations on a given alternative set X. For instance, suppose that there are n individuals
in the population, and ≽_i stands for the preference relation of the ith individual.
The Pareto dominance relation ≽ on X is defined as x ≽ y iff x ≽_i y for each
i = 1, …, n. This relation is a preorder on X in general, and a partial order on X if
each ≽_i is antisymmetric.
Let (X, ≽) be a preordered set. By an extension of ≽ we understand a preorder
⊵ on X such that ≽ ⊆ ⊵ and ≻ ⊆ ▷, where ▷ is the asymmetric part of ⊵. Intuitively
speaking, an extension of a preorder is “more complete” than the original relation in
the sense that it allows one to compare more elements, but it certainly agrees exactly
with the original relation when the latter applies. If ⊵ is a partial order, then it is
an extension of ≽ iff ≽ ⊆ ⊵. (Why?)
A fundamental result of order theory says that every partial order can be extended
to a linear order, that is, for every poset (X, ≽) there is a loset (X, ⊵) with ≽ ⊆
⊵. While it is possible to prove this by mathematical induction when X is finite, the
proof in the general case is built on a relatively advanced method which we will cover
later in the course. Relegating its proof to Section 1.7, we only state here the
result for future reference.⁹
Szpilrajn’s Theorem. Every partial order on a nonempty set X can be extended
to a linear order on X.
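In the finite case, a linear extension can be built constructively by repeatedly removing a maximal element from what remains. The sketch below is one such construction and is not the argument used for the general theorem; the function name linear_extension is ours, a partial order is stored as a set of pairs (x, y) read as “x is at least as large as y,” and the divisibility order on {1, 2, 3, 6} is an illustrative choice.

```python
def linear_extension(X, R):
    """Extend the partial order R on the finite set X to a linear order."""
    remaining = set(X)
    ranking = []
    while remaining:
        # Pick an element that no other remaining element strictly dominates.
        # Sorting by repr only makes the choice deterministic.
        top = next(
            x for x in sorted(remaining, key=repr)
            if not any((z, x) in R and z != x for z in remaining)
        )
        ranking.append(top)
        remaining.remove(top)
    position = {x: i for i, x in enumerate(ranking)}
    # x is ranked weakly above y iff x was removed no later than y.
    return {(x, y) for x in X for y in X if position[x] <= position[y]}

X = {1, 2, 3, 6}
divides = {(x, y) for x in X for y in X if x % y == 0}   # x is a multiple of y
L = linear_extension(X, divides)
```

Since an element is removed only when nothing left strictly dominates it, any x that strictly dominates y is removed before y, so the original order is contained in L. The result depends on which maximal element is picked at each step; a poset typically has several linear extensions.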
A natural question is if the same result holds for preorders as well. The answer
is yes, and the proof follows easily from Szpilrajn’s Theorem by means of a standard
method.
Corollary 1. Let (X, ≽) be a preordered set. There exists a complete preorder on
X that extends ≽.
Proof. Let ∼ denote the symmetric part of ≽, which is an equivalence relation.
Then (X/∼, ≽*) is a poset where ≽* is defined on X/∼ by
[x]_∼ ≽* [y]_∼ if and only if x ≽ y.
By Szpilrajn’s Theorem, there exists a linear order ⊵* on X/∼ such that ≽* ⊆ ⊵*.
We define ⊵ on X by
x ⊵ y if and only if [x]_∼ ⊵* [y]_∼.
It is easily checked that ⊵ is a complete preorder on X with ≽ ⊆ ⊵ and ≻ ⊆ ▷,
where ≻ and ▷ are the asymmetric parts of ≽ and ⊵, respectively.

⁹ For an extensive introduction to the theory of linear extensions of posets, see Bonnet and Pouzet
(1982).
Exercise 10. Let (X, ≿) be a preordered set, and define L(≿) as the set of all complete
preorders that extend ≿. Prove that ≿ = ⋂L(≿). (Where do you use Szpilrajn’s
Theorem in the argument?)
Exercise 11. Let (X, ≿) be a finite preordered set. Taking L(≿) as in the previous
exercise, we define dim(X, ≿) as the smallest positive integer k such that
≿ = R_1 ∩ ··· ∩ R_k for some R_i ∈ L(≿), i = 1, ..., k.
(a) Show that dim(X, ≿) ≤ |X^2|.
(b) What is dim(X, D_X)? What is dim(X, X^2)?
(c) For any positive integer n, show that dim(A^n(X_i, ≿_i)) = n, where (X_i, ≿_i) is a
loset with |X_i| ≥ 2 for each i = 1, ..., n.
(d) Prove or disprove: dim(2^X, ⊇) = |X|.
Definition. Let (X, ≿) be a preordered set and ∅ ≠ Y ⊆ X. An element x of Y is
said to be ≿-maximal in Y if there is no y ∈ Y with y ≻ x, and ≿-minimal in Y if
there is no y ∈ Y with x ≻ y. If x ≿ y for all y ∈ Y, then x is called the ≿-maximum
of Y, and if y ≿ x for all y ∈ Y, then x is called the ≿-minimum of Y.
Obviously, for any preordered set (X, ≿), every ≿-maximum of a nonempty subset
of X is ≿-maximal in that set. Also note that if (X, ≿) is a poset, then there can be
at most one ≿-maximum of any Y ∈ 2^X\{∅}.
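The difference between maximal and maximum elements is easy to explore on finite sets. The sketch below represents a preorder as a set of ordered pairs; the helper names and the three-element set are assumptions made only for illustration.

```python
def strictly_above(order, y, x):
    """True when y is strictly above x, i.e. (y, x) lies in the asymmetric part."""
    return (y, x) in order and (x, y) not in order


def maximal_elements(order, subset):
    """Elements of `subset` that no member of `subset` is strictly above."""
    return {x for x in subset
            if not any(strictly_above(order, y, x) for y in subset)}


def maximum_elements(order, subset):
    """Elements of `subset` that are weakly above every member of `subset`."""
    return {x for x in subset if all((x, y) in order for y in subset)}


# Hypothetical three-element set with its two extreme preorders.
X = {"a", "b", "c"}
diagonal = {(x, x) for x in X}                # only x related to itself
everything = {(x, y) for x in X for y in X}   # all pairs related
```

Under `diagonal` every element of X is maximal but none is a maximum, whereas under `everything` every element is a maximum (and hence maximal).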
Example 4. [1] Let X be any nonempty set, and ∅ ≠ Y ⊆ X. Every element of Y
is both D_X-maximal and D_X-minimal in Y. Unless it is a singleton, Y has neither a
D_X-maximum nor a D_X-minimum element. On the other hand, every element of Y
is both X^2-maximum and X^2-minimum of Y.
[2] Given any nonempty set X, consider the poset (2^X, ⊇), and take any nonempty
𝒜 ⊆ 2^X. The class 𝒜 has a ⊇-maximum iff ⋃𝒜 ∈ 𝒜, and it has a ⊇-minimum iff
⋂𝒜 ∈ 𝒜. In particular, the ⊇-maximum of 2^X is X and the ⊇-minimum of 2^X is ∅.
[3] (Choice Correspondences) Given a preference relation ≿ on an alternative set
X (Example 3) and a nonempty subset S of X, we define the “set of choices from S”
for an individual whose preference relation is ≿ as the set of all ≿-maximal elements
in S. That is, denoting this set as C_≿(S), we have
    C_≿(S) := {x ∈ S : y ≻ x for no y ∈ S}.
Evidently, if S is a finite set, then C_≿(S) is nonempty. (Proof?) Moreover, if S is finite
and ≿ is complete, then there exists at least one ≿-maximum element in S. The finiteness
requirement cannot be omitted in this statement, but as we shall see throughout this
course, there are various ways in which it can be substantially weakened.
Exercise 12. (a) Which subsets of the set of positive integers have a ≥-minimum?
Which ones have a ≥-maximum?
(b) If a set in a poset (X, ≿) has a unique ≿-maximal element, does that element
have to be a ≿-maximum of the set?
(c) Which subsets of a poset (X, ≿) possess an element which is both ≿-maximum
and ≿-minimum?
(d) Give an example of an infinite set in R^2 which contains a unique ≥-maximal
element that is also the unique ≥-minimal element of the set.
Exercise 13.ᴴ Let ≿ be a complete relation on a nonempty set X, and S a nonempty
finite subset of X. Define
    c_≿(S) := {x ∈ S : x ≿ y for all y ∈ S}.
(a) Show that c_≿(S) ≠ ∅ if ≿ is transitive.
(b) We say that ≿ is acyclic if there does not exist a positive integer k such that
x_1, ..., x_k ∈ X and x_1 ≻ x_2 ≻ ··· ≻ x_k ≻ x_1. Show that every transitive relation is
acyclic, but not conversely.
(c) Show that c_≿(S) ≠ ∅ if ≿ is acyclic.
(d) Show that if c_≿(T) ≠ ∅ for every finite T ∈ 2^X\{∅}, then ≿ must be acyclic.
Exercise 14.ᴴ Let (X, ≿) be a poset, and take any Y ∈ 2^X\{∅} which has a
≿-maximal element, say x*. Prove that ≿ can be extended to a linear order ⊵ on X
such that x* is ⊵-maximal in Y.
Exercise 15. Let (X, ≿) be a poset. For any Y ⊆ X, an element x in X is said
to be an ≿-upper bound for Y if x ≿ y for all y ∈ Y; a ≿-lower bound for
Y is defined similarly. The ≿-supremum of Y, denoted sup_≿ Y, is defined as the
≿-minimum of the set of all ≿-upper bounds for Y, that is, sup_≿ Y is an ≿-upper
bound for Y and has the property that z ≿ sup_≿ Y for any ≿-upper bound z for Y.
The ≿-infimum of Y, denoted as inf_≿ Y, is defined analogously.
(a) Prove that there can be only one ≿-supremum and only one ≿-infimum of any
subset of X.
(b) Show that x ≿ y iff sup_≿{x, y} = x and inf_≿{x, y} = y, for any x, y ∈ X.
(c) Show that if sup_≿ X ∈ X (that is, if sup_≿ X exists), then inf_≿ ∅ = sup_≿ X.
(d) If ≿ is the diagonal relation on X, and x and y are any two distinct members of
X, does sup_≿{x, y} exist?
(e) If X := {x, y, z, w} and ≿ := {(z, x), (z, y), (w, x), (w, y)}, does sup_≿{x, y}
exist?
Exercise 16.ᴴ Let (X, ≿) be a poset. If sup_≿{x, y} and inf_≿{x, y} exist for all
x, y ∈ X, then we say that (X, ≿) is a lattice. If sup_≿ Y and inf_≿ Y exist for all
Y ∈ 2^X, then (X, ≿) is called a complete lattice.
(a) Show that every complete lattice has an upper and a lower bound.
(b) Show that if X is finite and (X, ≿) is a lattice, then (X, ≿) is a complete lattice.
(c) Give an example of a lattice which is not complete.
(d) Prove that (2^X, ⊇) is a complete lattice.
(e) Let 𝒳 be a nonempty subset of 2^X such that X ∈ 𝒳 and ⋂𝒜 ∈ 𝒳 for any
(nonempty) class 𝒜 ⊆ 𝒳. Prove that (𝒳, ⊇) is a complete lattice.
1.5 Functions
Intuitively, we think of a function as a rule that transforms the objects in a given
set to those of another. While this is not a formal definition (what is a “rule”?),
we may now use the notion of a binary relation to formalize the idea. Let X and
Y be any two nonempty sets. By a function f that maps X into Y, denoted as
f : X → Y, we mean a relation f ⊆ X × Y such that
(i) for every x ∈ X, there exists a y ∈ Y such that x f y,
(ii) for every x ∈ X and y, z ∈ Y with x f y and x f z, we have y = z.
Here X is called the domain of f and Y the codomain of f. The range of f is, on
the other hand, defined as
    f(X) := {y ∈ Y : x f y for some x ∈ X}.
The set of all functions that map X into Y is denoted by Y^X. For instance, {0, 1}^X
is the set of all functions on X whose values are either 0 or 1, and R^[0,1] is the set of
all real-valued functions on [0, 1]. The notation f ∈ Y^X will be used interchangeably
with the expression f : X → Y throughout this course. Similarly, the term “map” is
used interchangeably with the term “function.”
While our definition of a function may look at first a bit strange, it is hardly any-
thing other than a set-theoretic formulation of the concept we use in daily discourse.
After all, we want a function f that maps X into Y to assign each member of X to
a member of Y, right? Our definition says simply that one can think of f as a
set of ordered pairs, so “(x, y) ∈ f” means “x is mapped to y by f.” Put differently,
all that f “does” is completely identified by the set {(x, f(x)) ∈ X × Y : x ∈ X},
which is what f “is.” The familiar notation f(x) = y (which we shall also adopt in
the rest of the exposition) is then nothing but an alternative way of expressing x f y.
When f(x) = y, we refer to y as the image (or value) of x under f. Condition (i)
says that every element in the domain X of f has an image under f in the codomain
Y. In turn, condition (ii) states that no element in the domain of f can have more
than one image under f.
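Conditions (i) and (ii) can be checked mechanically when X and Y are finite and a relation is stored as a set of ordered pairs. The following is a minimal sketch under that assumption; the name `is_function` and the sample relations are hypothetical.

```python
def is_function(f, domain, codomain):
    """Check whether the relation f (a set of pairs) is a function from domain into codomain."""
    # f must be a subset of the Cartesian product of domain and codomain.
    if not all(x in domain and y in codomain for (x, y) in f):
        return False
    for x in domain:
        images = {y for (a, y) in f if a == x}
        # Condition (i) fails if x has no image; (ii) fails if it has several.
        if len(images) != 1:
            return False
    return True


X = {1, 2, 3}
Y = {"a", "b"}
f = {(1, "a"), (2, "a"), (3, "b")}   # a function from X into Y
partial = {(1, "a"), (2, "a")}       # violates (i): 3 has no image
multi = f | {(1, "b")}               # violates (ii): 1 has two images
```

Only `f` passes the check; `partial` and `multi` are relations from X into Y that are not functions.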
Some authors adhere to the intuitive definition of a function as a “rule” that
transforms one set into another, and refer to the set of all ordered pairs (x, f(x)) as
the graph of the function. Denoting this set by Gr(f), then, we can write
    Gr(f) := {(x, f(x)) ∈ X × Y : x ∈ X}.
According to the formal definition of a function, f and Gr(f) are the same thing. So
long as we keep this connection in mind, there is no danger in thinking of a function
as a “rule” in the intuitive way. In particular, we say that two functions f and g are
equal if they have the same graph, or equivalently, if they have the same domain and
codomain, and f(x) = g(x) for all x ∈ X. In this case, we simply write f = g.
If its range equals its codomain, that is, if f(X) = Y, then one says that f maps
X onto Y, and refers to it as a surjection (or as a surjective function/map). If
f maps distinct points in its domain to distinct points in its codomain, that is, if
x ≠ y implies f(x) ≠ f(y) for all x, y ∈ X, then we say that f is an injection
(or a one-to-one, or an injective function/map). Finally, if f is both injective and
surjective, then it is called a bijection (or a bijective function/map). For instance,
if X := {1, ..., 10}, then f := {(1, 2), (2, 3), ..., (10, 1)} is a bijection in X^X, while
g ∈ X^X, defined as g(x) := 3 for all x ∈ X, is neither an injection nor a surjection.
When considered as a map in ({0} ∪ X)^X, f is an injection but not a surjection.
Warning. Every injective function can be viewed as a bijection, provided that one
views the codomain of the function as its range. Indeed, if f : X → Y is an injection,
then the map f : X → Z is a bijection, where Z := f(X). This is usually expressed
by saying that f : X → f(X) is a bijection.
Before we consider some examples, let us note that a common way of defining
a particular function in a given context is to describe the domain and codomain of
that function, and the image of a generic point in the domain. So one would say
something like “let f : X → Y be defined by f(x) := ···” or “consider the function
f ∈ Y^X defined by f(x) := ···”. For example, by the function f : R → R_+ defined
by f(x) := x^2, we mean the surjection that transforms every real number x to the
nonnegative real number x^2. Since the domain of the function is understood from
the expression f : X → Y (or f ∈ Y^X), it is redundant to add the phrase “for all
x ∈ X” after the expression “f(x) := ···,” although sometimes we may do so for
clarity. Alternatively, when the codomain of the function is clear, a phrase like “the
map x ↦ f(x) on X” is commonly used. For instance, one may refer to the quadratic
function mentioned above unambiguously as “the map t ↦ t^2 on R.”
Example 5. In the following examples X and Y stand for arbitrary nonempty sets.
[1] A constant function is one that assigns the same value to every element of
its domain, that is, f ∈ Y^X is constant iff there exists a y ∈ Y such that f(x) = y for
all x ∈ X. (Formally speaking, this constant function is the set X × {y}.) Obviously,
f(X) = {y} in this case, so a constant function is not surjective unless its codomain
is a singleton, and it is not injective unless its domain is a singleton.
[2] A function whose domain and codomain are identical, that is, a function in
X^X, is called a self-map on X. An important example of a self-map is the identity
function on X. This function is denoted as id_X, and it is defined as id_X(x) := x for
all x ∈ X. Obviously, id_X is a bijection, and formally speaking, it is none other than
the diagonal relation D_X.
[3] Let S ⊆ X. The function that maps X into {0, 1} such that every member of
S is assigned to 1 and all the other elements of X are assigned to zero is called the
indicator function of S in X. This function is denoted as 1_S (assuming that the
domain X is understood from the context). By definition, we have
    1_S(x) := 1, if x ∈ S,
              0, if x ∈ X\S.
You can check that, for every A, B ⊆ X, we have 1_{A∪B} + 1_{A∩B} = 1_A + 1_B and
1_{A∩B} = 1_A 1_B.
The following set of examples points to some commonly used methods of obtaining
new functions from a given set of functions.
Example 6. In the following examples X, Y, Z, and W stand for arbitrary nonempty
sets.
[1] Let Z ⊆ X ⊆ W, and f ∈ Y^X. By the restriction of f to Z, denoted as
f|_Z, we mean the function f|_Z ∈ Y^Z defined by f|_Z(z) := f(z). By an extension
of f to W, on the other hand, we mean a function f* ∈ Y^W with f*|_X = f, that is,
f*(x) = f(x) for all x ∈ X. If f is injective, so must be f|_Z, but surjectivity of f does
not entail that of f|_Z. Of course, if f is not injective, f|_Z may still turn out to be
injective (e.g. x ↦ x^2 is not injective on R, but it is so on R_+).
[2] Sometimes it is possible to extend a given function by combining it with
another function. For instance, we can combine any f ∈ Y^X and g ∈ W^Z to obtain
the function h : X ∪ Z → Y ∪ W defined by
    h(t) := f(t), if t ∈ X,
            g(t), if t ∈ Z,
provided that X ∩ Z = ∅, or X ∩ Z ≠ ∅ and f|_{X∩Z} = g|_{X∩Z}. Note that this method
of combining functions does not work if f(t) ≠ g(t) for some t ∈ X ∩ Z. For, in that
case h would not be well-defined as a function. (What would be the image of t under
h?)
[3] A function f ∈ X^{X×Y} defined by f(x, y) := x is called the projection from
X × Y onto X.¹⁰ (The projection from X × Y onto Y is similarly defined.) Obviously,
f(X × Y) = X, that is, f is necessarily surjective. It is not injective unless Y is a
singleton.
¹⁰ Strictly speaking, I should write f((x, y)) instead of f(x, y), but that’s just splitting hairs.
[4] Given functions f : X → Z and g : Z → Y, we define the composition
of f and g as the function g ◦ f : X → Y by g ◦ f(x) := g(f(x)). (For easier
reading, we often write (g ◦ f)(x) instead of g ◦ f(x).) This definition accords with
the way we defined the composition of two relations (Exercise 5). Indeed, we have
    g ◦ f = {(x, y) ∈ X × Y : x f z and z g y for some z ∈ Z}.
Obviously, id_Z ◦ f = f = f ◦ id_X. Even when X = Y = Z, the operation of taking
compositions is not commutative. For instance, if the self-maps f and g on R are
defined by f(x) := 2 and g(x) := x^2, respectively, then (g ◦ f)(x) = 4 and (f ◦ g)(x) = 2
for any real number x. The composition operation is, however, associative, that is,
h ◦ (g ◦ f) = (h ◦ g) ◦ f for all f ∈ Y^X, g ∈ Z^Y and h ∈ W^Z.
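The non-commutativity example above, with f(x) := 2 and g(x) := x^2, can be replayed directly. This is only a small sketch; the helper `compose` and the third map `h` are names introduced here for illustration.

```python
def compose(g, f):
    """Return the composition g ∘ f, i.e. the map x -> g(f(x))."""
    return lambda x: g(f(x))


f = lambda x: 2        # the constant self-map f(x) := 2
g = lambda x: x ** 2   # the self-map g(x) := x^2
h = lambda x: x + 1    # an extra (hypothetical) map to test associativity

g_after_f = compose(g, f)   # always 4
f_after_g = compose(f, g)   # always 2
left = compose(h, compose(g, f))
right = compose(compose(h, g), f)
```

Evaluating `g_after_f` and `f_after_g` at any real number returns 4 and 2 respectively, while `left` and `right` agree at every point, as associativity requires.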
Exercise 17. Let ∼ be an equivalence relation on a nonempty set X. Show that the
map x ↦ [x]_∼ on X (called the quotient map) is a surjection on X which is
injective iff ∼ = D_X.
Exercise 18.ᴴ (A Factorization Theorem) Let X and Y be two nonempty sets. Prove:
For any function f : X → Y, there exists a nonempty set Z, a surjection g : X → Z
and an injection h : Z → Y such that f = h ◦ g.
Exercise 19. Let X, Y and Z be nonempty sets, and consider any f, g ∈ Y^X and
u, v ∈ Z^Y. Prove:
(a) If f is surjective and u ◦ f = v ◦ f, then u = v;
(b) If u is injective and u ◦ f = u ◦ g, then f = g;
(c) If f and u are injective (respectively, surjective), then so is u ◦ f.
Exercise 20.ᴴ Show that there is no surjection of the form f : X → 2^X for any
nonempty set X.
For any given nonempty sets X and Y, the (direct) image of a set A ⊆ X under
f ∈ Y^X, denoted f(A), is defined as the collection of all elements y in Y with y = f(x)
for some x ∈ A. That is,
    f(A) := {f(x) : x ∈ A}.
The range of f is thus the image of its entire domain: f(X) = {f(x) : x ∈ X}. (Note.
If f(A) = B, then one says that “f maps A onto B.”)
The inverse image of a set B in Y, denoted as f^{-1}(B), is defined as the set of
all x in X whose images under f belong to B, that is,
    f^{-1}(B) := {x ∈ X : f(x) ∈ B}.
By convention, we write f^{-1}(y) for f^{-1}({y}), that is,
    f^{-1}(y) := {x ∈ X : f(x) = y} for any y ∈ Y.
Obviously, f^{-1}(y) is a singleton for each y ∈ f(X) iff f is an injection. For instance, if
f stands for the map t ↦ t^2 on R, then f^{-1}(1) = {−1, 1} whereas (f|_{R_+})^{-1}(1) = {1}.
The issue of whether or not one can express the image (or the inverse image) of a
union/intersection of a collection of sets as the union/intersection of the images (in-
verse images) of each set in the collection arises quite often in mathematical analysis.
The following exercise summarizes the situation in this regard.
Exercise 21. Let X and Y be nonempty sets and f ∈ Y^X. Prove that, for any
(nonempty) classes 𝒜 ⊆ 2^X and ℬ ⊆ 2^Y, we have
    f(⋃𝒜) = ⋃{f(A) : A ∈ 𝒜} and f(⋂𝒜) ⊆ ⋂{f(A) : A ∈ 𝒜},
whereas
    f^{-1}(⋃ℬ) = ⋃{f^{-1}(B) : B ∈ ℬ} and f^{-1}(⋂ℬ) = ⋂{f^{-1}(B) : B ∈ ℬ}.
A general rule that surfaces from this exercise is that inverse images are quite
well-behaved with respect to the operations of taking unions and intersections, while
the same cannot be said for direct images in the case of taking intersections. Indeed,
for any f ∈ Y^X, we have f(A ∩ B) ⊇ f(A) ∩ f(B) for all A, B ⊆ X if, and only if, f is
injective.¹¹ The “if” part of this assertion is trivial. The “only if” part follows from
the observation that, if f were not injective, then there would exist distinct x, y ∈ X with
f(x) = f(y), and we would find ∅ = f(∅) = f({x} ∩ {y}) = f({x}) ∩ f({y}) = {f(x)},
which is absurd.
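The contrast between direct and inverse images can be seen on a small finite example. The sketch below uses the squaring map on a hypothetical five-point domain; the helper names `image` and `preimage` are assumptions of this illustration.

```python
def image(f, A):
    """Direct image f(A) = {f(x) : x in A}."""
    return {f(x) for x in A}


def preimage(f, domain, B):
    """Inverse image of B: all x in the domain with f(x) in B."""
    return {x for x in domain if f(x) in B}


square = lambda t: t * t
domain = {-2, -1, 0, 1, 2}

# Direct images need not commute with intersections when f is not injective.
A, B = {-1, 0}, {0, 1}
lhs = image(square, A & B)                  # image of the intersection
rhs = image(square, A) & image(square, B)   # intersection of the images

# Inverse images commute with both unions and intersections.
P, Q = {0, 1}, {1, 4}
pre_union = preimage(square, domain, P | Q)
pre_inter = preimage(square, domain, P & Q)
```

Here `lhs` is a proper subset of `rhs`, because −1 and 1 are distinct points with the same square, while the inverse images of P ∪ Q and P ∩ Q coincide with the union and intersection of the individual inverse images.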
Finally, we turn to the problem of inverting a function. For any function f ∈ Y^X,
let us define the set
    f^{-1} := {(y, x) ∈ Y × X : x f y}
which is none other than the inverse of f viewed as a relation (Exercise 5). This
relation simply reverses the map f in the sense that if x is mapped to y by f, then
f^{-1} maps y back to x. Now f^{-1} may or may not be a function. If it is, we say that
f is invertible and f^{-1} is the inverse of f. For instance, f : R → R_+ defined by
f(t) := t^2 is not invertible (since (1, 1) ∈ f^{-1} and (1, −1) ∈ f^{-1}, that is, 1 does not
have a unique image under f^{-1}), whereas f|_{R_+} is invertible and (f|_{R_+})^{-1}(t) = √t for all
t ∈ R_+.
The following result gives a simple characterization of invertible functions.
Proposition 2. Let X and Y be two nonempty sets. A function f ∈ Y^X is invertible
if, and only if, it is a bijection.
Exercise 22. Prove Proposition 2.
¹¹ Of course, this does not mean that f(A ∩ B) = f(A) ∩ f(B) can never hold for a function that
is not one-to-one. It only means that, for any such function f, we can always find nonempty sets A
and B in the domain of f such that f(A ∩ B) ⊇ f(A) ∩ f(B) is false.
By using the composition operation defined in Example 6.[4], we can give another
useful characterization of invertible functions.
Proposition 3. Let X and Y be two nonempty sets. A function f ∈ Y^X is invertible
if, and only if, there exists a function g ∈ X^Y such that g ◦ f = id_X and f ◦ g = id_Y.
Proof. The “only if” part is readily obtained upon choosing g := f^{-1}. To prove
the “if” part, suppose there exists a g ∈ X^Y with g ◦ f = id_X and f ◦ g = id_Y, and
note that, by Proposition 2, it is enough to show that f is a bijection. To verify the
injectivity of f, pick any x, y ∈ X with f(x) = f(y), and observe that
    x = id_X(x) = (g ◦ f)(x) = g(f(x)) = g(f(y)) = (g ◦ f)(y) = id_X(y) = y.
To see the surjectivity of f, take any y ∈ Y and define x := g(y). Then we have
    f(x) = f(g(y)) = (f ◦ g)(y) = id_Y(y) = y,
which proves Y ⊆ f(X). Since the converse containment is trivial, we are done.
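Propositions 2 and 3 can be illustrated with finite functions stored as dictionaries. The following is a hedged sketch: the function `inverse` and the sample maps are hypothetical, and a dictionary is used here merely as a convenient encoding of a set of ordered pairs.

```python
def inverse(f, codomain):
    """Return the inverse of f (a dict) as a dict, or None if f is not invertible."""
    flipped = {}
    for x, y in f.items():
        if y in flipped:
            return None   # two points share an image, so f is not injective
        flipped[y] = x
    if set(flipped) != set(codomain):
        return None       # some point of the codomain has no preimage
    return flipped


f = {1: "a", 2: "b", 3: "c"}
g = inverse(f, {"a", "b", "c"})

square = {-1: 1, 0: 0, 1: 1}          # not injective on this domain
no_inverse = inverse(square, {0, 1})

shift = {1: "a"}                      # injective but not onto {"a", "b"}
not_onto = inverse(shift, {"a", "b"})
```

For the bijection `f`, the returned `g` satisfies g(f(x)) = x and f(g(y)) = y throughout, which is the condition in Proposition 3; the other two maps have no inverse.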
1.6 Sequences, Vectors and Matrices
By a sequence in a given nonempty set X, we intuitively mean an ordered array of the
form (x_1, x_2, ...) where each term x_i of the sequence is a member of X. (Throughout
this text we denote such a sequence by (x_m), but note that some books prefer instead
the notation (x_m)_{m=1}^∞.) As in the case of ordered pairs, one could introduce the notion
of a sequence as a new object to our set theory, but again there is really no need to
do so. Intuitively, we understand from the notation (x_1, x_2, ...) that the ith term in
the array is x_i. But then we can think of this array as a function that maps the set
N of positive integers into X in the sense that it tells us that “the ith term in the
array is x_i” by mapping i to x_i. With this definition, our intuitive understanding of the
ordered array (x_1, x_2, ...) is formally captured by the function {(i, x_i) : i = 1, 2, ...} = f.
Thus, we define a sequence in a nonempty set X as any function f : N → X, and
represent this function as (x_1, x_2, ...) where x_i := f(i) for each i ∈ N. Consequently,
the set of all sequences in X is equal to X^N. As is common, however, we denote this
set as X^∞ throughout the text.
By a subsequence of a sequence (x_m) ∈ X^∞, we mean a sequence that is made
up of the terms of (x_m) which appear in the subsequence in the same order they
appear in (x_m). That is, a subsequence of (x_m) is of the form (x_{m_1}, x_{m_2}, ...) where
(m_k) is a sequence in N such that m_1 < m_2 < ···. (We denote this subsequence as
(x_{m_k}).) Once again, we use the notion of function to formalize this definition. Strictly
speaking, a subsequence of a sequence f ∈ X^N is a function of the form f ◦ σ where
σ : N → N is strictly increasing (that is, σ(k) < σ(l) for any k, l ∈ N with k < l).
We represent this function as the array (x_{m_1}, x_{m_2}, ...) with the understanding that
m_k = σ(k) and x_{m_k} = f(m_k) for each k = 1, 2, .... For instance, (x_{m_k}) := (1, 1/3, 1/5, ...)
is a subsequence of (x_m) := (1/m) ∈ R^∞. Here (x_m) is a representation for the function
f ∈ R^N which is defined by f(i) := 1/i, and (x_{m_k}) is a representation of the map f ◦ σ,
where σ(k) := 2k − 1 for each k ∈ N.
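The example just given, with f(i) := 1/i and σ(k) := 2k − 1, translates directly into code when a sequence is treated as a function on the positive integers. This is a small sketch; exact fractions are used only to avoid rounding, and the helper name `subsequence` is an assumption.

```python
from fractions import Fraction


def f(i):
    """The sequence (1/m), viewed as a function on the positive integers."""
    return Fraction(1, i)


def sigma(k):
    """A strictly increasing self-map of the positive integers."""
    return 2 * k - 1


def subsequence(f, sigma):
    """The subsequence of f determined by sigma, i.e. the composition f ∘ sigma."""
    return lambda k: f(sigma(k))


sub = subsequence(f, sigma)
first_terms = [sub(k) for k in range(1, 4)]   # the terms 1, 1/3, 1/5
```

Only finitely many terms can be listed, of course; the point is that the subsequence is nothing more than the composition of f with σ.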
By a double sequence in X, we mean an infinite matrix each term of which is
a member of X. Formally, a double sequence is a function f ∈ X^{N×N}. As in the
case of sequences, we represent this function as (x_{kl}) with the understanding that
x_{kl} := f(k, l). The set of all double sequences in X equals X^{N×N}, but it is customary
to denote this set as X^{∞×∞}. We note that one can always view (in more than one
way) a double sequence in X as a sequence of sequences in X, that is, as a sequence
in X^∞. For instance, we can think of (x_{kl}) as ((x_{1l}), (x_{2l}), ...) or as ((x_{k1}), (x_{k2}), ...).
The basic idea of viewing a string of objects as a particular function also applies to
finite strings, of course. For instance, how about X^{1, ..., n} where X is a nonempty set
and n some positive integer? The preceding discussion shows that this function space
is none other than the set {(x_1, ..., x_n) : x_i ∈ X, i = 1, ..., n}. Thus we may define
an n-vector in X as a function f : {1, ..., n} → X, and represent this function as
(x_1, ..., x_n) where x_i := f(i) for each i = 1, ..., n. (Check that (x_1, ..., x_n) = (x'_1, ..., x'_n)
iff x_i = x'_i for each i = 1, ..., n, so everything is in concert with the way we defined
n-vectors in Section 1.2.) The n-fold product of X is then defined as X^{1, ..., n}, but
is denoted as X^n. (So R^n = R^{1, ..., n}. This makes sense, no?) The main lesson is that
everything that is said about arbitrary functions applies also to sequences and vectors.
Finally, for any positive integers m and n, by an m × n matrix (read “m by n
matrix”) in a nonempty set X, we mean a function f : {1, ..., m} × {1, ..., n} → X.
We represent this function as [a_{ij}]_{m×n} with the understanding that a_{ij} := f(i, j) for
each i = 1, ..., m and j = 1, ..., n. (As you know, one often views a matrix like [a_{ij}]_{m×n}
as a rectangular array with m rows and n columns, where a_{ij} appears in the ith row
and jth column of this array.) Often we denote [a_{ij}]_{m×n} simply by A, and for any
x ∈ R^n, write Ax for the “product” of A and x, that is, Ax is the m-vector defined
as
    Ax := (a_{11}x_1 + ··· + a_{1n}x_n, ..., a_{m1}x_1 + ··· + a_{mn}x_n).
The set of all m × n matrices in X is X^({1, ..., m}×{1, ..., n}), but it is much better to
denote this set as X^{m×n}. Needless to say, both X^{1×n} and X^{n×1} can be identified with
X^n. (Wait, what does this mean?)
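Taking the definition literally, a matrix can be stored as a function on index pairs and a vector as a function on indices. The sketch below does this with dictionaries and computes Ax from the formula above; the name `mat_vec` and the 2 × 2 example are assumptions of this illustration.

```python
def mat_vec(a, m, n, x):
    """Compute Ax, where a maps (i, j) to entries and x maps j to entries.

    Indices run over 1..m and 1..n, as in the text.
    """
    return tuple(
        sum(a[(i, j)] * x[j] for j in range(1, n + 1))
        for i in range(1, m + 1)
    )


# A hypothetical 2 x 2 real matrix and a 2-vector, both stored as functions.
A = {(1, 1): 1, (1, 2): 2,
     (2, 1): 3, (2, 2): 4}
x = {1: 5, 2: 6}
Ax = mat_vec(A, 2, 2, x)
```

The ith component of `Ax` is a_{i1}x_1 + ··· + a_{in}x_n, exactly as in the displayed definition.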
1.7 A Glimpse of Advanced Set Theory: The Axiom of Choice
We now turn to a problem that we have so far conveniently avoided: How do we
define the Cartesian product of infinitely many nonempty sets? Intuitively speaking,
the Cartesian product of all members of a class 𝒜 of sets is the set of all collections
each of which contains one and only one element of each member of 𝒜. That is, a
member of this product is really a function on 𝒜 that selects a single element from
each set in 𝒜. The question is simple to state: Does there exist such a function?
If |𝒜| < ∞, then the answer would obviously be yes, because we can construct
such a function by choosing an element from each set in 𝒜 one by one. But when 𝒜
contains infinitely many sets, then this method does not readily work, so we need to
prove that such a function exists.
To get a sense of this, suppose 𝒜 := {A_1, A_2, ...}, where ∅ ≠ A_i ⊆ N for each
i = 1, 2, .... Then we’re okay. We can define f : 𝒜 → ⋃𝒜 by f(A) := the smallest
element of A; this well-defines f as a map that selects one element from each member
of 𝒜 simultaneously. Or, if each A_i is a bounded interval in R, then again we’re fine.
This time we can define f, say, as follows: f(A) := the midpoint of A. But what if all
we knew was that each A_i consists of real numbers? Or worse, what if we were not
told anything about the contents of 𝒜? You see, in general, we can’t write down a
formula, or an algorithm, the application of which yields such a function. Then how
do you know that such a thing exists in the first place?¹²
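When an explicit rule such as “take the smallest element” is available, the choice function can be written down directly. The sketch below does so for a finite, hypothetical family of nonempty sets of positive integers; it illustrates only the easy case in which no appeal to an axiom is needed.

```python
def smallest_element_choice(sets):
    """Map each nonempty set of positive integers to its smallest element."""
    return {A: min(A) for A in sets}


# A hypothetical family of nonempty subsets of the positive integers.
family = [frozenset({3, 5, 8}), frozenset({2}), frozenset({7, 4})]
choice = smallest_element_choice(family)
```

The rule works uniformly for every member of the family because each set comes with a definable distinguished element; this is what fails for an arbitrary class of sets.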
In fact, it turns out that the problem of “finding an f : 𝒜 → ⋃𝒜 for any given
class 𝒜 of sets” cannot be settled in one way or another by means of the standard
axioms of set theory.¹³ The status of our question is thus a bit odd: it is undecidable.
To make things a bit more precise, let us state formally the property that we are
after.
The Axiom of Choice. For any (nonempty) class 𝒜 of sets, there exists a function
f : 𝒜 → ⋃𝒜 such that f(A) ∈ A for each A ∈ 𝒜.
One can reword this in a few other ways.
¹² But, how about the following algorithm? Start with A_1, and pick any a_1 in A_1. Now move to
A_2 and pick any a_2 ∈ A_2. Continue this way, and define g : 𝒜 → ⋃𝒜 by g(A_i) = a_i, i = 1, 2, ....
Aren’t we done? No, we are not! The function at hand is not well-defined: its definition does not
tell me exactly which member of A_27 is assigned to g(A_27). This is very much unlike how I defined
f above in the case where each A_i was contained in N (or was a bounded interval).
Perhaps you are still not quite comfortable about this. You might think that the function above is
well-defined; it’s just that it is defined recursively. Let me try to illustrate the problem by means of
a concrete example. Take any infinite set S, and ask yourself if you can define an injection f from N
into S. Sure, you might say, “recursion” is again the name of the game. Let f(1) be any member a_1 of S.
Then let f(2) be any member a_2 of S\{a_1}, f(3) any member of S\{a_1, a_2}, and so on. Since S\T ≠ ∅
for any finite T ⊂ S, this well-defines f, recursively, as an injection from N into S. Wrong! If this
was the case, on the basis of the knowledge of f(1), ..., f(26), I would know the value of f at 27.
The “definition” of f doesn’t do that; it just points to some arbitrary member of S\{a_1, ..., a_26}, so
it is not a proper definition at all.
(Note. As “obvious” as it might seem, the proposition “for any infinite set S, there is an injection
in S^N,” cannot be proved within the standard realm of set theory.)
¹³ For brevity, I am again being imprecise about this standard set of axioms (which is called the
Zermelo-Fraenkel-Skolem axioms). For the present discussion, nothing will be lost if you just think
of these as the formal properties needed to “construct” the set theory we outlined intuitively earlier.
It is fair to say that these axioms have an unproblematic standing in mathematics.
Exercise 23. Prove that the Axiom of Choice is equivalent to the following statements.
(a) For any nonempty set S, there exists a function f : 2^S\{∅} → S such that
f(A) ∈ A for each ∅ ≠ A ⊆ S.
(b) (Zermelo’s Postulate) If 𝒜 is a (nonempty) class of sets such that A ∩ B = ∅ for
each distinct A, B ∈ 𝒜, then there exists a set S such that |S ∩ A| = 1 for every
A ∈ 𝒜.
(c) For any nonempty sets X and Y, and any relation R from X into Y, there is a
function f : Z → Y with ∅ ≠ Z ⊆ X and f ⊆ R. (That is: Every relation contains
a function.)
The first thing to note about the Axiom of Choice is that it cannot be disproved
by using the standard axioms of set theory. That is, provided that these axioms are
consistent (that is, no contradiction may be logically deduced from them), adjoining
the Axiom of Choice to these axioms yields again a consistent set of axioms. This
raises the possibility that perhaps the Axiom of Choice can be deduced as a “theorem”
from the standard axioms. The second thing to know about the Axiom of Choice
is that this is false, that is, the Axiom of Choice is not provable from the standard
axioms of set theory.¹⁴
We are then at a crossroads. We must either reject the validity of the Axiom of
Choice and confine ourselves to the conclusions that can be reached only on the basis
of the standard axioms of set theory, or alternatively, adjoin the Axiom of Choice to
the standard axioms to obtain a richer set theory that is able to yield certain results
that could not have been proved within the confines of the standard axioms. Most
analysts follow the second route. However, it is fair to say that the status of the
Axiom of Choice is in general viewed as less appealing than that of the standard axioms,
so one often makes it explicit if this axiom is a prerequisite for a particular theorem to be
proved. Given our applied interests, we will be even more relaxed about this matter.
As an immediate application of the Axiom of Choice, we now define the Cartesian
product of an arbitrary (nonempty) class 𝒜 of sets as the set of all f : 𝒜 → ⋃𝒜
with f(A) ∈ A for each A ∈ 𝒜. We denote this set by ⨉𝒜, and note that ⨉𝒜 ≠ ∅
because of the Axiom of Choice. If 𝒜 = {A_i : i ∈ I}, where I is an index set, then
we write ⨉_{i∈I} A_i for ⨉𝒜. Clearly, ⨉_{i∈I} A_i is the set of all maps f : I → ⋃{A_i : i ∈ I}
with f(i) ∈ A_i for each i ∈ I. It is easily checked that this definition is consistent
with the definition of the Cartesian product of finitely many sets given earlier.
There are a few equivalent versions of the Axiom of Choice that are oftentimes
more convenient to use in applications than the original statement of the axiom. To
state the most widely used version, let us first agree on some terminology. For any
poset (X, ≿), by a “poset in (X, ≿),” we mean a poset like (Y, ≿ ∩ Y^2) with Y ⊆ X,
but we denote this poset more succinctly as (Y, ≿). Recall that an upper bound for
such a poset is an element x of X with x ≿ y for all y ∈ Y (Exercise 15).
¹⁴ These results are of extreme importance for the foundations of the entire field of mathematics.
The first one was proved by Kurt Gödel in 1939, and the second one by Paul Cohen in 1963.
Zorn’s Lemma. If every loset in a given poset has an upper bound, then that poset
must have a maximal element.
While this is a less intuitive statement than the Axiom of Choice (no?), it can in
fact be shown to be equivalent to the Axiom of Choice.¹⁵ (That is, we can deduce
Zorn’s Lemma from the standard axioms and the Axiom of Choice, and we can prove
the Axiom of Choice by using the standard axioms and Zorn’s Lemma.) Since we
take the Axiom of Choice as “true” in this text, therefore, we must also accept the
validity of Zorn’s Lemma.
We conclude this discussion by means of two quick applications that illustrate
how Zorn’s Lemma is used in practice. We will see some other applications in later
chapters.
Let us first prove the following fact:
The Hausdorff Maximal Principle. There exists a ⊇-maximal loset in every
poset.
Proof. Let (X, ≿) be a poset, and
    L(X, ≿) := {Z ⊆ X : (Z, ≿) is a loset}.
(Observe that L(X, ≿) ≠ ∅ by reflexivity of ≿.) We wish to show that there is
a ⊇-maximal element of L(X, ≿). This will follow from Zorn’s Lemma, if we can
show that every loset in the poset (L(X, ≿), ⊇) has an upper bound, that is, for any
𝒜 ⊆ L(X, ≿) such that (𝒜, ⊇) is a loset, there is a member of L(X, ≿) that contains
⋃𝒜. To establish that this is indeed the case, take any such 𝒜, and let Y := ⋃𝒜. Then
≿ is a complete relation on Y, because, since ⊇ linearly orders 𝒜, for any x, y ∈ Y we
must have x, y ∈ A for some A ∈ 𝒜 (why?), and hence, given that (A, ≿) is a loset,
we have either x ≿ y or y ≿ x. Therefore, (Y, ≿) is a loset, that is, Y ∈ L(X, ≿). But
it is obvious that Y ⊇ A for any A ∈ 𝒜.
In fact, the Hausdorff Maximal Principle is equivalent to the Axiom of Choice.
Exercise 24. Prove Zorn’s Lemma assuming the validity of the Hausdorff Maximal
Principle.
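For a finite poset, the maximal losets whose existence the Hausdorff Maximal Principle asserts can be found by brute force, with no appeal to the Axiom of Choice. The following is only an illustrative sketch: the example set, the divisibility order, and the names divides, is_chain and maximal_chains are our own assumptions and do not appear elsewhere in the text.

```python
from itertools import combinations

def divides(a, b):
    """Return True if a divides b; this plays the role of the partial order."""
    return b % a == 0

def is_chain(subset, leq):
    """A subset is a loset (chain) if any two of its elements are comparable."""
    return all(leq(a, b) or leq(b, a) for a, b in combinations(subset, 2))

def maximal_chains(elements, leq):
    """Enumerate all nonempty chains, then keep the ⊇-maximal ones."""
    chains = [
        frozenset(c)
        for r in range(1, len(elements) + 1)
        for c in combinations(elements, r)
        if is_chain(c, leq)
    ]
    # c < d tests proper inclusion of frozensets.
    return [c for c in chains if not any(c < d for d in chains)]

# Hypothetical example: the divisors of 12 ordered by divisibility.
X = [1, 2, 3, 4, 6, 12]
result = sorted(sorted(c) for c in maximal_chains(X, divides))
print(result)
```

In this example each maximal loset runs from 1 up to 12, which is what one would expect from the Hasse diagram of the divisors of 12. The exhaustive search is exponential in the size of X, so it is meant only to make the statement concrete.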
As another application of Zorn’s Lemma, we prove Sziplrajn’s Theorem.
16
Our proof uses the Hausdorff Maximal Principle, but you now know that this is equivalent
to invoking Zorn’s Lemma or the Axiom of Choice.
Proof of Sziplrajn’s Theorem. Let ≽ be a partial order on a nonempty set X. Let
T_X be the set of all partial orders on X that extend ≽. Clearly, (T_X, ⊇) is a poset,
15
For a proof, see Enderton (1977), pp. 151-153, or Kelley (1955), pp. 32-35.
16
In case you are wondering, Sziplrajn’s Theorem is not equivalent to the Axiom of Choice.
so by the Hausdorff Maximal Principle, it has a maximal loset, say, (𝒜, ⊇). Define
≽* := ⋃𝒜. Since (𝒜, ⊇) is a loset, ≽* is a partial order on X that extends ≽. (Why?)
≽* is in fact complete. To see this, suppose we can find some x, y ∈ X with neither
x ≽* y nor y ≽* x. Then the transitive closure of ≽* ∪ {(x, y)} is a member of T_X
that contains ≽* as a proper subset (Exercise 8). (Why exactly?) This contradicts
the fact that (𝒜, ⊇) is a maximal loset within (T_X, ⊇). (Why?) Thus ≽* is a linear
order, and we are done.
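When X is finite, the extension step used in the proof can be carried out directly: as long as some pair x, y is incomparable, add (x, y) and pass to the transitive closure. The sketch below does this under our own assumptions: a relation is stored as a set of pairs, with (x, y) in the set meaning x ≽ y, and the names transitive_closure and linear_extension, as well as the four-element example, are hypothetical. It illustrates the idea only and is not the argument of the proof, which needs the Hausdorff Maximal Principle to handle arbitrary X.

```python
from itertools import product

def transitive_closure(rel):
    """Return the smallest transitive relation containing rel (a set of pairs)."""
    closure = set(rel)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so that closure can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

def linear_extension(X, order):
    """Extend a partial order on the finite set X to a linear order.

    While some pair x, y is incomparable, add (x, y) and take the
    transitive closure, as in the extension step of the proof.
    """
    ext = set(order)
    for x, y in product(X, repeat=2):
        if (x, y) not in ext and (y, x) not in ext:
            ext = transitive_closure(ext | {(x, y)})
    return ext

# Hypothetical example: a ≽ b and a ≽ c, while d is comparable only to itself.
X = ["a", "b", "c", "d"]
order = {(x, x) for x in X} | {("a", "b"), ("a", "c")}
ext = linear_extension(X, order)
```

Which linear order comes out depends on the order in which incomparable pairs are visited, so the extension is in general far from unique.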
2 Real Numbers
This course assumes that the reader has a basic understanding of the real numbers,
so our discussion here will be brief and duly heuristic. In particular, we will not even
attempt to give a construction of the set R of real numbers. Instead we will mention
some axioms that R satisfies, and focus on certain properties that R possesses thereof.
Some books on real analysis give a fuller view of the construction of R, some talk
about it even less than we do. If you are really curious about this, it’s best if you
consult a book that specializes in this sort of thing. (Try, for instance, Chapters
4 and 5 of Enderton (1977).)
2.1 Ordered Fields
In this subsection we talk briefly about a few topics in abstract algebra that will
facilitate our discussion of real numbers.
Definition. Let X be any nonempty set. We refer to a function of the form • :
X × X → X as a binary operation on X, and write x • y instead of •(x, y) for any
x, y ∈ X.
For instance, the usual addition and multiplication operations + and · are binary
operations on the set N of natural numbers. The subtraction operation is, on the
other hand, not a binary operation on N (e.g. 1 + (−2) ∉ N), but it is a binary
operation on the set of all integers.
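The requirement that a binary operation map X × X back into X can be probed on a finite sample. The sketch below is a rough check under our own assumptions: N is taken to be {1, 2, ...}, and the helper names is_closed_on_sample, in_N and in_Z are hypothetical. A finite sample can expose a failure of closure, as with subtraction on N, but it cannot prove closure on an infinite set.

```python
from itertools import product

def is_closed_on_sample(op, sample, membership):
    """Check that op(x, y) lands back in the set for every pair in a finite sample."""
    return all(membership(op(x, y)) for x, y in product(sample, repeat=2))

def in_N(n):
    """Membership test for N = {1, 2, ...} (assumed convention: N excludes 0)."""
    return isinstance(n, int) and n >= 1

def in_Z(n):
    """Membership test for the set of all integers."""
    return isinstance(n, int)

sample_N = range(1, 11)    # a finite sample of natural numbers
sample_Z = range(-10, 11)  # a finite sample of integers

add_ok = is_closed_on_sample(lambda x, y: x + y, sample_N, in_N)
mul_ok = is_closed_on_sample(lambda x, y: x * y, sample_N, in_N)
sub_on_N = is_closed_on_sample(lambda x, y: x - y, sample_N, in_N)
sub_on_Z = is_closed_on_sample(lambda x, y: x - y, sample_Z, in_Z)

print(add_ok, mul_ok, sub_on_N, sub_on_Z)
```

Here sub_on_N fails already at the pair (1, 2), since 1 − 2 is not a natural number, which matches the counterexample given above.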
Definition. Let X be any nonempty set, let + and · be two binary operations on
X, and let us agree to write xy for x · y for simplicity. The list (X, +, ·) is called a
field if the following properties are satisfied:
(i) (Commutativity) x + y = y + x and xy = yx for all x, y ∈ X;
(ii) (Associativity) (x + y) + z = x + (y + z) and (xy)z = x(yz) for all x, y, z ∈ X;
17
17
Throughout this exposition, (w) is the same thing as w, for any w ∈ X. For instance, (x + y)
corresponds to x + y, and (−x) corresponds to −x. The brackets are used at times only for clarity.