
A Practical Approach to

LINEAR ALGEBRA

Prabhat Choudhary

Oxford Book Company
Jaipur, India



ISBN: 978-81-89473-95-2

First Edition 2009


Oxford Book Company
267, 10-B-Scheme, Opp. Narayan Niwas,
Gopalpura By Pass Road, Jaipur-302018
Phone: 0141-2594705, Fax: 0141-2597527
e-mail:
website: www.oxfordbookcompany.com

© Reserved

Typeset by:
Shivangi Computers
267, 10-B-Scheme, Opp. Narayan Niwas,
Gopalpura By Pass Road, Jaipur-302018

Printed at:
Rajdhani Printers, Delhi

All Rights are Reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, without the prior written permission of the copyright owner. Responsibility for the facts stated, opinions expressed, conclusions reached and plagiarism, if any, in this volume is entirely that of the Author, according to whom the matter encompassed in this book has been originally created/edited, and any resemblance with such publications may be incidental. The Publisher bears no responsibility for them, whatsoever.



Preface
Linear Algebra occupies a crucial place in Mathematics. Linear Algebra is a continuation of the classical course in the light of modern developments in Science and Mathematics. We must emphasize that mathematics is not a spectator sport: in order to understand and appreciate mathematics it is necessary to do a great deal of personal cogitation and problem solving.
Scientific and engineering research is becoming increasingly dependent upon the development and implementation of efficient parallel algorithms. Linear algebra is an indispensable tool in such research, and this book attempts to collect and describe a selection of some of its more important parallel algorithms. The purpose is to review the current status and to provide an overall perspective of parallel algorithms for solving dense, banded, or block-structured problems arising in the major areas of direct solution of linear systems, least squares computations, eigenvalue and singular value computations, and rapid elliptic solvers. There is a widespread feeling that the non-linear world is very different, and it is usually studied as a sophisticated phenomenon of interpolation between different approximately linear regimes.
Prabhat Choudhary



Contents

Preface  v
1. Basic Notions  1
2. Systems of Linear Equations  26
3. Matrices  50
4. Determinants  101
5. Introduction to Spectral Theory  139
6. Inner Product Spaces  162
7. Structure of Operators in Inner Product Spaces  198
8. Bilinear and Quadratic Forms  221
9. Advanced Spectral Theory  234
10. Linear Transformations  252



Chapter 1

Basic Notions

VECTOR SPACES
A vector space V is a collection of objects, called vectors, along with two operations,
addition of vectors and multiplication by a number (scalar), such that the following
properties (the so-called axioms of a vector space) hold:
The first four properties deal with the addition of vectors:
1. Commutativity: v + w = w + v for all v, w ∈ V.
2. Associativity: (u + v) + w = u + (v + w) for all u, v, w ∈ V.
3. Zero vector: there exists a special vector, denoted by 0, such that v + 0 = v for all v ∈ V.
4. Additive inverse: for every vector v ∈ V there exists a vector w ∈ V such that v + w = 0. Such an additive inverse is usually denoted by -v.
The next two properties concern multiplication:
5. Multiplicative identity: 1v = v for all v ∈ V.
6. Multiplicative associativity: (αβ)v = α(βv) for all v ∈ V and all scalars α, β.
And finally, two distributive properties, which connect multiplication and addition:
7. α(u + v) = αu + αv for all u, v ∈ V and all scalars α.
8. (α + β)v = αv + βv for all v ∈ V and all scalars α, β.
Remark: The above properties seem hard to memorize, but that is not necessary. They are simply the familiar rules of algebraic manipulations with numbers.
The only new twist here is that you have to understand what operations you can apply to what objects. You can add vectors, and you can multiply a vector by a number (scalar). Of course, with numbers you can do all the manipulations that you have learned before. But you cannot multiply two vectors, or add a number to a vector.
Remark: It is not hard to show that the zero vector 0 is unique. It is also easy to show that, given v ∈ V, the additive inverse -v is unique. In fact, both facts can be deduced from the other properties: they imply that 0 = 0v for any v ∈ V, and that -v = (-1)v.
If the scalars are the usual real numbers, we call the space V a real vector space. If the scalars are the complex numbers, i.e., if we can multiply vectors by complex numbers, we call the space V a complex vector space.
Note that any complex vector space is a real vector space as well (if we can multiply by complex numbers, we can multiply by real numbers), but not the other way around.
It is also possible to consider a situation when the scalars are elements of an arbitrary field 𝔽.
In this case we say that V is a vector space over the field 𝔽. Although many of the constructions in the book work for general fields, in this text we consider only real and complex vector spaces, i.e., 𝔽 is always either ℝ or ℂ.
Example: The space ℝⁿ consists of all columns of size n,

v = (v_1, v_2, ..., v_n)^T,

whose entries are real numbers. Addition and multiplication by a scalar are defined entrywise, i.e.,

α(v_1, v_2, ..., v_n)^T = (αv_1, αv_2, ..., αv_n)^T,
(v_1, ..., v_n)^T + (w_1, ..., w_n)^T = (v_1 + w_1, ..., v_n + w_n)^T.
Example: The space ℂⁿ also consists of columns of size n, only the entries now are complex numbers. Addition and multiplication by a scalar are defined exactly as in the case of ℝⁿ; the only difference is that we can now multiply vectors by complex numbers, i.e., ℂⁿ is a complex vector space.
Example: The space M_{m×n} (also denoted as M_{m,n}) of m × n matrices: the addition and multiplication by a scalar are defined entrywise. If we allow only real entries (and so multiplication only by reals), then we have a real vector space; if we allow complex entries and multiplication by complex numbers, we then have a complex vector space.
Example: The space ℙₙ of polynomials of degree at most n consists of all polynomials p of the form

p(t) = a_0 + a_1t + a_2t² + ... + a_ntⁿ,

where t is the independent variable. Note that some, or even all, coefficients a_k can be 0.
In the case of real coefficients a_k we have a real vector space; complex coefficients give us a complex vector space.


Question: What are zero vectors in each of the above examples?

Matrix notation
An m x n matrix is a rectangular array with m rows and n columns. Elements of the
array are called entries of the matrix.
It is often convenient to denote matrix entries by indexed letters a_{j,k}: the first index denotes the number of the row where the entry is situated, and the second one is the number of the column. For example,

A = (a_{j,k})_{j=1, k=1}^{m, n} = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{pmatrix}

is a general way to write an m × n matrix.
Very often for a matrix A the entry in row number j and column number k is denoted by A_{j,k} or (A)_{j,k}, and sometimes, as in the example above, the same letter but in lowercase is used for the matrix entries.
Given a matrix A, its transpose (or transposed matrix) A^T is defined by transforming the rows of A into the columns. For example,

\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}^T = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}.

So, the columns of A^T are the rows of A and vice versa, the rows of A^T are the columns of A.
The formal definition is as follows: (A^T)_{j,k} = (A)_{k,j}, meaning that the entry of A^T in row number j and column number k equals the entry of A in row number k and column number j.
The transpose of a matrix has a very nice interpretation in terms of linear transformations, namely it gives the so-called adjoint transformation. We will study this in detail later, but for now transposition will be just a useful formal operation.
One of the first uses of the transpose is that we can write a column vector x ∈ ℝⁿ as x = (x_1, x_2, ..., x_n)^T. If we put the column vertically, it will use significantly more space.
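As a quick check of the rule (A^T)_{j,k} = (A)_{k,j}, the following sketch (my own illustration, assuming Python with numpy) transposes the 2 × 3 matrix from the example above.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # a 2 x 3 matrix
print(A.T)                  # its 3 x 2 transpose: rows become columns
# [[1 4]
#  [2 5]
#  [3 6]]

# Entrywise check of the defining identity (A^T)_{j,k} = A_{k,j}.
print(all(A.T[j, k] == A[k, j] for j in range(3) for k in range(2)))  # True
```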

LINEAR COMBINATIONS, BASES.
Let V be a vector space, and let v_1, v_2, ..., v_p ∈ V be a collection of vectors. A linear combination of vectors v_1, v_2, ..., v_p is a sum of the form

α_1v_1 + α_2v_2 + ... + α_pv_p = ∑_{k=1}^{p} α_kv_k.

Definition: A system of vectors v_1, v_2, ..., v_n ∈ V is called a basis (for the vector space V) if any vector v ∈ V admits a unique representation as a linear combination

v = α_1v_1 + α_2v_2 + ... + α_nv_n = ∑_{k=1}^{n} α_kv_k.

The coefficients α_1, α_2, ..., α_n are called coordinates of the vector v (in the basis, or with respect to the basis v_1, v_2, ..., v_n).
Another way to say that v_1, v_2, ..., v_n is a basis is to say that the equation x_1v_1 + x_2v_2 + ... + x_nv_n = v (with unknowns x_k) has a unique solution for arbitrary right side v.
Before discussing any properties of bases, let us give a few examples, showing that such objects exist, and that it makes sense to study them.
Example: The space V is ℝⁿ. Consider the vectors

e_1 = (1, 0, 0, ..., 0)^T, e_2 = (0, 1, 0, ..., 0)^T, e_3 = (0, 0, 1, ..., 0)^T, ..., e_n = (0, 0, ..., 0, 1)^T

(the vector e_k has all entries 0 except the entry number k, which is 1). The system of vectors e_1, e_2, ..., e_n is a basis in ℝⁿ. Indeed, any vector

v = (x_1, x_2, ..., x_n)^T

can be represented as the linear combination

v = x_1e_1 + x_2e_2 + ... + x_ne_n = ∑_{k=1}^{n} x_ke_k

and this representation is unique. The system e_1, e_2, ..., e_n ∈ ℝⁿ is called the standard basis in ℝⁿ.
Example: In this example the space is the space ℙₙ of the polynomials of degree at most n. Consider the vectors (polynomials) e_0, e_1, e_2, ..., e_n ∈ ℙₙ defined by

e_0 = 1, e_1 = t, e_2 = t², e_3 = t³, ..., e_n = tⁿ.


Clearly, any polynomial p, p(t) = a_0 + a_1t + a_2t² + ... + a_ntⁿ, admits a unique representation

p = a_0e_0 + a_1e_1 + ... + a_ne_n.

So the system e_0, e_1, e_2, ..., e_n ∈ ℙₙ is a basis in ℙₙ. We will call it the standard basis in ℙₙ.
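A small numerical sketch of the standard basis of ℝⁿ (an illustration added here, assuming numpy): the columns of the identity matrix are e_1, ..., e_n, and stacking the coordinates x_k reproduces v.

```python
import numpy as np

n = 4
E = np.eye(n)                         # column k of E is the standard basis vector e_{k+1}
v = np.array([3.0, -1.0, 0.0, 2.5])

# v = sum_k x_k e_k, where the coordinates x_k are simply the entries of v.
reconstructed = sum(v[k] * E[:, k] for k in range(n))
print(np.allclose(reconstructed, v))  # True: the representation recovers v
```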
Remark: If a vector space V has a basis v_1, v_2, ..., v_n, then any vector v is uniquely defined by its coefficients in the decomposition v = ∑_{k=1}^{n} α_kv_k.
So, if we stack the coefficients α_k in a column, we can operate with them as if they were column vectors, i.e., as with elements of ℝⁿ.
Namely, if v = ∑_{k=1}^{n} α_kv_k and w = ∑_{k=1}^{n} β_kv_k, then

v + w = ∑_{k=1}^{n} α_kv_k + ∑_{k=1}^{n} β_kv_k = ∑_{k=1}^{n} (α_k + β_k)v_k,

i.e., to get the column of coordinates of the sum one just needs to add the columns of coordinates of the summands.
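The remark can be verified numerically. In ℝⁿ, if a basis is stored as the columns of an invertible matrix B, the coordinates of a vector x are the solution of Bα = x, and the coordinates of a sum are the sums of the coordinates. A sketch (my own illustration, assuming numpy):

```python
import numpy as np

# A (non-standard) basis of R^2, stored as the columns of B.
B = np.array([[1.0, 1.0],
              [0.0, 2.0]])
v = np.array([3.0, 4.0])
w = np.array([-1.0, 1.0])

def coords(x):
    # Coordinates of x in the basis b_1, b_2: the unique solution of B @ alpha = x.
    return np.linalg.solve(B, x)

# Coordinates of a sum are the sums of the coordinates.
print(np.allclose(coords(v + w), coords(v) + coords(w)))   # True
```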
Generating and Linearly Independent Systems. The definition of a basis says that any
vector admits a unique representation as a linear combination. This statement is in fact
two statements, namely that the representation exists and that it is unique. Let us analyse
these two statements separately.
Definition: A system of vectors v_1, v_2, ..., v_p ∈ V is called a generating system (also a spanning system, or a complete system) in V if any vector v ∈ V admits a representation as a linear combination

v = α_1v_1 + α_2v_2 + ... + α_pv_p = ∑_{k=1}^{p} α_kv_k.

The only difference with the definition of a basis is that we do not assume that the representation above is unique. The words generating, spanning and complete here are synonyms; the term complete is used because of the author's operator theory background.
Clearly, any basis is a generating (complete) system. Also, if we have a basis, say v_1, v_2, ..., v_n, and we add to it several vectors, say v_{n+1}, ..., v_p, then the new system will be a generating (complete) system. Indeed, we can represent any vector as a linear combination of the vectors v_1, v_2, ..., v_n, and just ignore the new ones (by putting the corresponding coefficients α_k = 0).
Now, let us turn our attention to the uniqueness. We do not want to worry about
existence, so let us consider the zero vector 0, which always admits a representation as a
linear combination.



Definition: A linear combination α_1v_1 + α_2v_2 + ... + α_pv_p is called trivial if α_k = 0 ∀k.
A trivial linear combination is always (for all choices of vectors v_1, v_2, ..., v_p) equal to 0, and that is probably the reason for the name.
Definition: A system of vectors v_1, v_2, ..., v_p ∈ V is called linearly independent if only the trivial linear combination (∑_{k=1}^{p} α_kv_k with α_k = 0 ∀k) of vectors v_1, v_2, ..., v_p equals 0.
In other words, the system v_1, v_2, ..., v_p is linearly independent if and only if the equation x_1v_1 + x_2v_2 + ... + x_pv_p = 0 (with unknowns x_k) has only the trivial solution x_1 = x_2 = ... = x_p = 0.
If a system is not linearly independent, it is called linearly dependent. By negating the definition of linear independence, we get the following
Definition: A system of vectors v_1, v_2, ..., v_p is called linearly dependent if 0 can be represented as a nontrivial linear combination, 0 = ∑_{k=1}^{p} α_kv_k.
Non-trivial here means that at least one of the coefficients α_k is non-zero. This can be (and usually is) written as ∑_{k=1}^{p} |α_k| ≠ 0.
So, restating the definition, we can say that a system is linearly dependent if and only if there exist scalars α_1, α_2, ..., α_p, ∑_{k=1}^{p} |α_k| ≠ 0, such that

∑_{k=1}^{p} α_kv_k = 0.

An alternative definition (in terms of equations) is that a system v_1, v_2, ..., v_p is linearly dependent if and only if the equation

x_1v_1 + x_2v_2 + ... + x_pv_p = 0

(with unknowns x_k) has a non-trivial solution. Non-trivial, once again, means that at least one of the x_k is different from 0, and it can be written as ∑_{k=1}^{p} |x_k| ≠ 0.
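For vectors in ℝⁿ, the definition in terms of equations suggests a numerical test: stack the vectors as columns of a matrix and compare its rank with the number of vectors. The sketch below (an illustration added here, assuming numpy) is one way to do it; it is not part of the original text.

```python
import numpy as np

# Columns of V are the vectors v_1, v_2, v_3 in R^3; here v_3 = v_1 + v_2.
V = np.array([[1.0, 0.0, 1.0],
              [2.0, 1.0, 3.0],
              [0.0, 1.0, 1.0]])

p = V.shape[1]
print(np.linalg.matrix_rank(V) < p)   # True: the system is linearly dependent

# An explicit non-trivial solution of x_1 v_1 + x_2 v_2 + x_3 v_3 = 0.
x = np.array([1.0, 1.0, -1.0])
print(np.allclose(V @ x, 0))          # True
```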
The following proposition gives an alternative description of linearly dependent systems.
Proposition: A system of vectors v_1, v_2, ..., v_p ∈ V is linearly dependent if and only if one of the vectors v_k can be represented as a linear combination of the other vectors,

v_k = ∑_{j=1, j≠k}^{p} β_jv_j.

Proof: Suppose the system v_1, v_2, ..., v_p is linearly dependent. Then there exist scalars α_k, ∑_{k=1}^{p} |α_k| ≠ 0, such that

α_1v_1 + α_2v_2 + ... + α_pv_p = 0.

Let k be an index such that α_k ≠ 0. Then, moving all terms except α_kv_k to the right side, we get

α_kv_k = -∑_{j=1, j≠k}^{p} α_jv_j.

Dividing both sides by α_k, we get the desired representation with β_j = -α_j/α_k.
On the other hand, if the representation above holds, 0 can be represented as the non-trivial linear combination

v_k - ∑_{j=1, j≠k}^{p} β_jv_j = 0.

Obviously, any basis is a linearly independent system. Indeed, if a system v_1, v_2, ..., v_n is a basis, 0 admits a unique representation

0 = α_1v_1 + α_2v_2 + ... + α_nv_n = ∑_{k=1}^{n} α_kv_k.

Since the trivial linear combination always gives 0, the trivial linear combination must be the only one giving 0.
So, as we already discussed, if a system is a basis it is a complete (generating) and linearly independent system. The following proposition shows that the converse implication is also true.
Proposition: A system of vectors v_1, v_2, ..., v_n ∈ V is a basis if and only if it is linearly independent and complete (generating).
Proof: We already know that a basis is always linearly independent and complete, so in one direction the proposition is already proved.
Let us prove the other direction. Suppose a system v_1, v_2, ..., v_n is linearly independent and complete. Take an arbitrary vector v ∈ V. Since the system v_1, v_2, ..., v_n is complete (generating), v can be represented as

v = α_1v_1 + α_2v_2 + ... + α_nv_n = ∑_{k=1}^{n} α_kv_k.

We only need to show that this representation is unique.
Suppose v admits another representation

v = α'_1v_1 + α'_2v_2 + ... + α'_nv_n = ∑_{k=1}^{n} α'_kv_k.

Then

∑_{k=1}^{n} (α_k - α'_k)v_k = ∑_{k=1}^{n} α_kv_k - ∑_{k=1}^{n} α'_kv_k = v - v = 0.

Since the system is linearly independent, α_k - α'_k = 0 ∀k, and thus the representation v = α_1v_1 + α_2v_2 + ... + α_nv_n is unique.
Remark: In many textbooks a basis is defined as a complete and linearly independent system. Although this definition is more common than the one presented in this text, the definition given here emphasizes the main property of a basis, namely that any vector admits a unique representation as a linear combination.
Proposition: Any (finite) generating system contains a basis.
Proof: Suppose v_1, v_2, ..., v_p ∈ V is a generating (complete) set. If it is linearly independent, it is a basis, and we are done.
Suppose it is not linearly independent, i.e., it is linearly dependent. Then there exists a vector v_k which can be represented as a linear combination of the vectors v_j, j ≠ k.
Since v_k can be represented as a linear combination of the vectors v_j, j ≠ k, any linear combination of the vectors v_1, v_2, ..., v_p can be represented as a linear combination of the same vectors without v_k (i.e., the vectors v_j, 1 ≤ j ≤ p, j ≠ k). So, if we delete the vector v_k, the new system will still be a complete one.
If the new system is linearly independent, we are done. If not, we repeat the procedure.
Repeating this procedure finitely many times we arrive at a linearly independent and complete system, because otherwise we would delete all vectors and end up with the empty set.
So, any finite complete (generating) set contains a complete linearly independent subset, i.e., a basis.

LINEAR TRANSFORMATIONS. MATRIX-VECTOR MULTIPLICATION
A transformation T from a set X to a set Y is a rule that to each argument (input) x ∈ X assigns a value (output) y = T(x) ∈ Y. The set X is called the domain of T, and the set Y is called the target space or codomain of T. We write T: X → Y to say that T is a transformation with the domain X and the target space Y.
Definition: Let V, W be vector spaces. A transformation T: V → W is called linear if
1. T(u + v) = T(u) + T(v) ∀u, v ∈ V;
2. T(αv) = αT(v) for all v ∈ V and for all scalars α.
Properties 1 and 2 together are equivalent to the following one:
T(αu + βv) = αT(u) + βT(v) for all u, v ∈ V and for all scalars α, β.
Examples: You have dealt with linear transformations before, maybe without even suspecting it, as the examples below show.
Example: Differentiation: Let V = ℙₙ (the set of polynomials of degree at most n), W = ℙ_{n-1}, and let T: ℙₙ → ℙ_{n-1} be the differentiation operator,



T(p) := p' ∀p ∈ ℙₙ.

Since (f + g)' = f' + g' and (αf)' = αf', this is a linear transformation.
Example: Rotation: in this example V = W = ℝ² (the usual coordinate plane), and the transformation T_γ: ℝ² → ℝ² takes a vector in ℝ² and rotates it counterclockwise by γ radians. Since T_γ rotates the plane as a whole, it rotates as a whole the parallelogram used to define the sum of two vectors (parallelogram law). Therefore property 1 of a linear transformation holds. It is also easy to see that property 2 is also true.
Example: Reflection: in this example again V = W = ℝ², and the transformation T: ℝ² → ℝ² is the reflection in the first coordinate axis. It can also be shown geometrically that this transformation is linear, but we will use another way to show that.

Fig. Rotation

Namely, it is easy to write a formula for T,

T((x_1, x_2)^T) = (x_1, -x_2)^T,

and from this formula it is easy to check that the transformation is linear.
Example: Let us investigate linear transformations T: ℝ → ℝ. Any such transformation is given by the formula

T(x) = ax, where a = T(1).

Indeed,

T(x) = T(x × 1) = xT(1) = xa = ax.

So, any linear transformation of ℝ is just multiplication by a constant.
Linear transformations ℝⁿ → ℝᵐ. Matrix-column multiplication: It turns out that a linear transformation T: ℝⁿ → ℝᵐ can also be represented as a multiplication, not by a number, but by a matrix.



Let us see how. Let T: ℝⁿ → ℝᵐ be a linear transformation. What information do we need to compute T(x) for all vectors x ∈ ℝⁿ? My claim is that it is sufficient to know how T acts on the standard basis e_1, e_2, ..., e_n of ℝⁿ. Namely, it is sufficient to know the n vectors in ℝᵐ (i.e., vectors of size m)

a_k = T(e_k), k = 1, 2, ..., n.

Indeed, let

x = (x_1, x_2, ..., x_n)^T.

Then x = x_1e_1 + x_2e_2 + ... + x_ne_n = ∑_{k=1}^{n} x_ke_k and

T(x) = T(∑_{k=1}^{n} x_ke_k) = ∑_{k=1}^{n} T(x_ke_k) = ∑_{k=1}^{n} x_kT(e_k) = ∑_{k=1}^{n} x_ka_k.
So, if we join the vectors (columns) a_1, a_2, ..., a_n together in a matrix

A = [a_1, a_2, ..., a_n]

(a_k being the kth column of A, k = 1, 2, ..., n), this matrix contains all the information about T. Let us show how one should define the product of a matrix and a vector (column) to represent the transformation T as a product, T(x) = Ax. Let

A = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{pmatrix}.

Recall that the column number k of A is the vector a_k, i.e.,

a_k = (a_{1,k}, a_{2,k}, ..., a_{m,k})^T.

Then if we want Ax = T(x) we get

Ax = ∑_{k=1}^{n} x_ka_k = x_1 (a_{1,1}, a_{2,1}, ..., a_{m,1})^T + x_2 (a_{1,2}, a_{2,2}, ..., a_{m,2})^T + ... + x_n (a_{1,n}, a_{2,n}, ..., a_{m,n})^T.

So, the matrix-vector multiplication should be performed by the following column by coordinate rule: multiply each column of the matrix by the corresponding coordinate of the vector.
Example:

\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = 1\begin{pmatrix} 1 \\ 4 \end{pmatrix} + 2\begin{pmatrix} 2 \\ 5 \end{pmatrix} + 3\begin{pmatrix} 3 \\ 6 \end{pmatrix} = \begin{pmatrix} 14 \\ 32 \end{pmatrix}.

The "column by coordinate" rule is very well adapted for parallel computing. It will
be also very important in different theoretical constructions later.
However, when doing computations manually, it is more convenient to compute the
result one entry at a time. This can be expressed as the following row by column rule:
To get the entry number k of the result, one need to multiply row number k of the
matrix by the vector, that is, if Ax = y, then
yk =

I

n

a ·x·
= 1,2, ... m,.

j=lk,}},k

here Xj and Yk are coordinates ofthe vectors x and y respectively, and aj'k are the entries of
the matrix A.
Example:

\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 1·1 + 2·2 + 3·3 \\ 4·1 + 5·2 + 6·3 \end{pmatrix} = \begin{pmatrix} 14 \\ 32 \end{pmatrix}.
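Both rules compute the same vector, and both agree with the built-in product. The sketch below (my own illustration, assuming numpy) applies them to the matrix and vector from the example above.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
x = np.array([1, 2, 3])

# Column by coordinate: multiply each column of A by the matching coordinate of x.
by_columns = sum(x[k] * A[:, k] for k in range(A.shape[1]))

# Row by column: entry k of the result is (row k of A) . x.
by_rows = np.array([A[k, :] @ x for k in range(A.shape[0])])

print(by_columns, by_rows, A @ x)   # all three give [14 32]
```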

Linear transformations and generating sets: As we discussed above, a linear transformation T (acting from ℝⁿ to ℝᵐ) is completely defined by its values on the standard basis in ℝⁿ. The fact that we consider the standard basis is not essential; one can consider any basis, even any generating (spanning) set. Namely, a linear transformation T: V → W is completely defined by its values on a generating set (in particular by its values on a basis). In particular, if v_1, v_2, ..., v_n is a generating set (in particular, if it is a basis) in V, and T and T_1 are linear transformations T, T_1: V → W such that

T(v_k) = T_1(v_k), k = 1, 2, ..., n,

then T = T_1.
Conclusions
1. To get the matrix of a linear transformation T: ℝⁿ → ℝᵐ one needs to join the vectors a_k = T(e_k) (where e_1, e_2, ..., e_n is the standard basis in ℝⁿ) into a matrix: the kth column of the matrix is a_k, k = 1, 2, ..., n.
2. If the matrix A of the linear transformation T is known, then T(x) can be found by matrix-vector multiplication, T(x) = Ax. To perform matrix-vector multiplication one can use either the "column by coordinate" or the "row by column" rule.



The latter seems more appropriate for manual computations. The former is well adapted for parallel computers, and will be used in different theoretical constructions.
For a linear transformation T: ℝⁿ → ℝᵐ, its matrix is usually denoted as [T]. However, very often people do not distinguish between a linear transformation and its matrix, and use the same symbol for both. When it does not lead to confusion, we will also use the same symbol for a transformation and its matrix.
Since a linear transformation is essentially a multiplication, the notation Tv is often used instead of T(v). We will also use this notation. Note that the usual order of algebraic operations applies, i.e., Tv + u means T(v) + u, not T(v + u).
Remark: In the matrix-vector multiplication Ax the number of columns of the matrix A must coincide with the size of the vector x, i.e., a vector in ℝⁿ can only be multiplied by an m × n matrix. This makes sense, since an m × n matrix defines a linear transformation ℝⁿ → ℝᵐ, so the vector x must belong to ℝⁿ.
The easiest way to remember this is that if, when performing the multiplication, you run out of some elements faster, then the multiplication is not defined. For example, if using the "row by column" rule you run out of row entries but still have some unused entries in the vector, the multiplication is not defined. It is also not defined if you run out of the vector's entries but still have unused entries in the row.


COMPOSITION OF LINEAR TRANSFORMATIONS
AND MATRIX MULTIPLICATION
Definition of the matrix multiplication: Knowing matrix-vector multiplication, one can easily guess the natural way to define the product AB of two matrices: let us multiply by A each column of B (matrix-vector multiplication) and join the resulting column-vectors into a matrix. Formally, if b_1, b_2, ..., b_r are the columns of B, then Ab_1, Ab_2, ..., Ab_r are the columns of the matrix AB. Recalling the row by column rule for the matrix-vector multiplication, we get the following row by column rule for matrices: the entry (AB)_{j,k} (the entry in row j and column k) of the product AB is defined by

(AB)_{j,k} = (row j of A) · (column k of B).

Formally it can be rewritten as

(AB)_{j,k} = ∑_{l} a_{j,l} b_{l,k},

where a_{j,k} and b_{j,k} are the entries of the matrices A and B respectively.
I intentionally did not speak about sizes of the matrices A and B, but if we recall the row by column rule for the matrix-vector multiplication, we can see that in order for the multiplication to be defined, the size of a row of A should be equal to the size of a column of B. In other words, the product AB is defined if and only if A is an m × n matrix and B is an n × r matrix.
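The definition "the columns of AB are Ab_1, ..., Ab_r" translates directly into code, and the row by column formula produces the same entries. A sketch (my own illustration, assuming numpy):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2 x 3
B = np.array([[1, 0],
              [2, 1],
              [0, 3]])      # 3 x 2, so AB is defined and is 2 x 2

# Column k of AB is A times column k of B.
AB = np.column_stack([A @ B[:, k] for k in range(B.shape[1])])
print(np.allclose(AB, A @ B))   # True

# Row by column rule for a single entry: (AB)_{j,k} = sum_l a_{j,l} b_{l,k}.
j, k = 1, 0
print(sum(A[j, l] * B[l, k] for l in range(A.shape[1])) == (A @ B)[j, k])  # True
```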




Motivation: Composition of linear transformations. Why are we using such a complicated rule of multiplication? Why don't we just multiply matrices entrywise? The answer is that the multiplication, as it is defined above, arises naturally from the composition of linear transformations. Suppose we have two linear transformations, T_1: ℝⁿ → ℝᵐ and T_2: ℝʳ → ℝⁿ. Define the composition T = T_1 ∘ T_2 of the transformations T_1, T_2 as

T(x) = T_1(T_2(x)) ∀x ∈ ℝʳ.

Note that T_2(x) ∈ ℝⁿ. Since T_1: ℝⁿ → ℝᵐ, the expression T_1(T_2(x)) is well defined and the result belongs to ℝᵐ. So, T: ℝʳ → ℝᵐ.
It is easy to show that T is a linear transformation, so it is defined by an m × r matrix. How can one find this matrix, knowing the matrices of T_1 and T_2?
Let A be the matrix of T_1 and B be the matrix of T_2. As we discussed in the previous section, the columns of the matrix of T are the vectors T(e_1), T(e_2), ..., T(e_r), where e_1, e_2, ..., e_r is the standard basis in ℝʳ. For k = 1, 2, ..., r we have

T(e_k) = T_1(T_2(e_k)) = T_1(Be_k) = T_1(b_k) = Ab_k

(the operators T_2 and T_1 are simply multiplication by B and A respectively).
So, the columns of the matrix of T are Ab_1, Ab_2, ..., Ab_r, and that is exactly how the matrix AB was defined!
Let us return to identifying again a linear transformation with its matrix. Since the matrix multiplication agrees with the composition, we can (and will) write T_1T_2 instead of T_1 ∘ T_2 and T_1T_2x instead of T_1(T_2(x)).
Note that in the composition T_1T_2 the transformation T_2 is applied first! The way to remember this is to see that in T_1T_2x the transformation T_2 meets x first.
Remark: There is another way of checking the dimensions of matrices in a product, different from the row by column rule: for a composition T_1T_2 to be defined it is necessary that T_2x belongs to the domain of T_1. If T_2 acts from some space, say ℝʳ, to ℝⁿ, then T_1 must act from ℝⁿ to some space, say ℝᵐ. So, in order for T_1T_2 to be defined the matrices of T_1 and T_2 should be of sizes m × n and n × r respectively, which is the same condition as obtained from the row by column rule. (We will usually identify a linear transformation and its matrix, but in the next few paragraphs we will distinguish them.)
Example: Let T: ℝ² → ℝ² be the reflection in the line x_1 = 3x_2. It is a linear transformation, so let us find its matrix. To find the matrix, we need to compute T(e_1) and T(e_2). However, the direct computation of T(e_1) and T(e_2) involves significantly more trigonometry than a sane person is willing to remember.
An easier way to find the matrix of T is to represent it as a composition of simple linear transformations. Namely, let γ be the angle between the x_1-axis and the line x_1 = 3x_2, and let T_0 be the reflection in the x_1-axis. Then to get the reflection T we can first rotate the plane by the angle -γ, moving the line x_1 = 3x_2 to the x_1-axis, then reflect everything in the x_1-axis, and then rotate the plane by γ, taking everything back. Formally it can be written as

T = R_γ T_0 R_{-γ},

where R_γ is the rotation by γ. The matrix of T_0 is easy to compute,

T_0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},

and the rotation matrices are known:

R_γ = \begin{pmatrix} \cos γ & -\sin γ \\ \sin γ & \cos γ \end{pmatrix},
R_{-γ} = \begin{pmatrix} \cos(-γ) & -\sin(-γ) \\ \sin(-γ) & \cos(-γ) \end{pmatrix} = \begin{pmatrix} \cos γ & \sin γ \\ -\sin γ & \cos γ \end{pmatrix}.

To compute sin γ and cos γ, take a vector in the line x_1 = 3x_2, say the vector (3, 1)^T. Then

cos γ = first coordinate / length = 3 / √(3² + 1²) = 3/√10,

and similarly

sin γ = second coordinate / length = 1 / √(3² + 1²) = 1/√10.

Gathering everything together we get

T = R_γ T_0 R_{-γ} = (1/√10) \begin{pmatrix} 3 & -1 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} (1/√10) \begin{pmatrix} 3 & 1 \\ -1 & 3 \end{pmatrix} = (1/10) \begin{pmatrix} 3 & -1 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 3 & 1 \\ -1 & 3 \end{pmatrix}.

It remains only to perform the matrix multiplication here to get the final result.
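The remaining multiplication can be carried out numerically. The sketch below (my own addition, assuming numpy) builds R_γ, T_0 and R_{-γ} from cos γ = 3/√10 and sin γ = 1/√10 and multiplies them out; the result is (1/10)·(8 6; 6 -8).

```python
import numpy as np

c, s = 3 / np.sqrt(10), 1 / np.sqrt(10)   # cos(gamma), sin(gamma) for the line x1 = 3 x2

R     = np.array([[c, -s], [s,  c]])      # rotation by  gamma
R_inv = np.array([[c,  s], [-s, c]])      # rotation by -gamma
T0    = np.array([[1,  0], [0, -1]])      # reflection in the x1-axis

T = R @ T0 @ R_inv                        # reflection in the line x1 = 3 x2
print(np.round(T, 3))                     # [[ 0.8  0.6]
                                          #  [ 0.6 -0.8]]

# Sanity checks: a vector on the line is fixed, and a reflection squares to the identity.
print(np.allclose(T @ np.array([3.0, 1.0]), [3.0, 1.0]), np.allclose(T @ T, np.eye(2)))
```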
Properties of Matrix Multiplication.
Matrix multiplication enjoys a lot of properties, familiar to us from high school algebra:
1. Associativity: A(BC) = (AB)C, provided that either left or right side is well defined;
2. Distributivity: A(B + C) = AB + AC, (A + B)C = AC + BC, provided either left or right side of each equation is well defined;
3. One can take scalar multiples out: A(αB) = αAB.
These properties are easy to prove. One should prove the corresponding properties for linear transformations, and they almost trivially follow from the definitions. The properties of linear transformations then imply the properties for matrix multiplication.
The new twist here is that commutativity fails: matrix multiplication is non-commutative, i.e., generally for matrices AB ≠ BA.




One can see easily it would be unreasonable to expect the commutativity of matrix
multiplication. Indeed, letA and B be matrices of sizes m x nand n x r respectively. Then
the product AB is well defined, but if m = r, BA is not defined.
Even when both products are well defined, for example, when A and Bare nxn (square)
matrices, the multiplication is still non-commutative. If we just pick the matrices A and B
at random, the chances are that AB = BA: we have to be very lucky to get AB = BA.
Transposed Matrices and Multiplication.
Given a matrix A, its transpose (or transposed matrix) A^T is defined by transforming the rows of A into the columns. For example,

\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}^T = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}.

So, the columns of A^T are the rows of A and vice versa, the rows of A^T are the columns of A.
The formal definition is as follows: (A^T)_{j,k} = (A)_{k,j}, meaning that the entry of A^T in row number j and column number k equals the entry of A in row number k and column number j.
The transpose of a matrix has a very nice interpretation in terms of linear transformations, namely it gives the so-called adjoint transformation.
We will study this in detail later, but for now transposition will be just a useful formal operation.
One of the first uses of the transpose is that we can write a column vector x ∈ ℝⁿ as x = (x_1, x_2, ..., x_n)^T. If we put the column vertically, it will use significantly more space.
A simple analysis of the row by column rule shows that

(AB)^T = B^T A^T,

i.e., when you take the transpose of the product, you change the order of the terms.
Trace and Matrix Multiplication.
For a square (n × n) matrix A = (a_{j,k}) its trace (denoted by trace A) is the sum of the diagonal entries,

trace A = ∑_{k=1}^{n} a_{k,k}.

Theorem: Let A and B be matrices of size m × n and n × m respectively (so both products AB and BA are well defined). Then

trace(AB) = trace(BA).


There are essentially two ways of proving this theorem. One is to compute the diagonal entries of AB and of BA and compare their sums. This method requires some proficiency in manipulating sums in Σ notation. If you are not comfortable with such algebraic manipulations, there is another way. We can consider two linear transformations, T and T_1, acting from M_{n×m} to ℝ = ℝ¹, defined by

T(X) = trace(AX), T_1(X) = trace(XA).

To prove the theorem it is sufficient to show that T = T_1; the equality for X = B then gives the theorem. Since a linear transformation is completely defined by its values on a generating system, we need just to check the equality on some simple matrices, for example on the matrices X_{j,k} which have all entries 0 except the entry 1 in the intersection of the jth column and kth row.
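The theorem is easy to confirm numerically. The sketch below (my own addition, assuming numpy) checks trace(AB) = trace(BA) for random A and B, and also the equality trace(AX) = trace(XA) for an arbitrary X ∈ M_{n×m}, which is the identity the second proof reduces to.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 2, 3
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, m))

# trace(AB) = trace(BA), even though AB is m x m while BA is n x n.
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))   # True

# The same identity viewed as equality of the functionals X -> trace(AX) and X -> trace(XA).
X = rng.standard_normal((n, m))
print(np.isclose(np.trace(A @ X), np.trace(X @ A)))   # True
```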

INVERTIBLE TRANSFORMATIONS AND MATRICES. ISOMORPHISMS
IDENTITY TRANSFORMATION AND IDENTITY MATRIX
Among all linear transformations, there is a special one, the identity transformation (operator) I, Ix = x, ∀x. To be precise, there are infinitely many identity transformations: for any vector space V, there is the identity transformation I = I_V: V → V, I_Vx = x, ∀x ∈ V. However, when it does not lead to confusion we will use the same symbol I for all identity operators (transformations). We will use the notation I_V only when we want to emphasize in what space the transformation is acting. Clearly, if I: ℝⁿ → ℝⁿ is the identity transformation in ℝⁿ, its matrix is the n × n matrix

I = I_n = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}

(1 on the main diagonal and 0 everywhere else). When we want to emphasize the size of the matrix, we use the notation I_n; otherwise we just use I. Clearly, for an arbitrary linear transformation A, the equalities

AI = A, IA = A

hold (whenever the product is defined).

INVERTIBLE TRANSFORMATIONS
Definition: Let A: V → W be a linear transformation. We say that the transformation A is left invertible if there exists a transformation B: W → V such that BA = I (I = I_V here). The transformation A is called right invertible if there exists a linear transformation C: W → V such that
