24 INTRODUCTION CHAPTER 1
COLLADA
COLLADA, short for COLLAborative Design Activity,¹² started as an open-source project
led by Sony, but is nowadays being developed and promoted by the Khronos Group.
COLLADA is an interchange format for 3D content; it is the glue which binds together
digital content creation (DCC) tools and various intermediate processing tools to form
a production pipeline. In other words, COLLADA is a tool for content development, not
for content delivery; the final applications are better served with more compact formats
designed for their particular tasks.
COLLADA can represent pretty much everything in a 3D scene that the content authoring
tools can, including geometry, material and shading properties, physics, and animation,
just to name a few. It also has a mobile profile that corresponds to OpenGL ES 1.x and
M3G 1.x, enabling an easy mapping to the M3G binary file format. One of the latest additions
is COLLADA FX, which allows interchange of complex, multi-pass shader effects.
COLLADA FX allows encapsulation of multiple descriptions of an effect, such as different
levels of detail, or different shading for daytime and nighttime versions.
Exporters for COLLADA are currently available for all major 3D content creation tools,
such as Lightwave, Blender, Maya, Softimage, and 3ds Max. A stand-alone viewer is also
available from Feeling Software. Adobe uses COLLADA as an import format for editing
3D textures, and it has been adopted as a data format for Google Earth and Unreal Engine.
For an in-depth coverage of COLLADA, see the book by Arnaud and Barnes [AB06].
12 www.khronos.org/collada
PART I
ANATOMY OF A GRAPHICS ENGINE
CHAPTER 2
LINEAR ALGEBRA FOR 3D GRAPHICS
This chapter is about the coordinate systems and transformations that 3D objects undergo
during their travel through the graphics pipeline, as illustrated in Figure 2.1. Understanding
this subset of linear algebra is crucial for figuring out what goes on inside a 3D graphics
engine, as well as for making effective use of such an engine. If you want to rush ahead into
the graphics primitives instead, study Figure 2.1, skip to Chapter 3, and return here later.
2.1 COORDINATE SYSTEMS
To be able to define shapes and locations, we need to have a frame of reference: a coordinate
system, also known as a space. A coordinate system has an origin and a set of axes. The
origin is a point (or equivalently, a location), while the axes are directions.
As a mathematical construct, a coordinate system may have an arbitrary set of axes with
arbitrary directions, but here we are only concerned about coordinate systems that are
three-dimensional, orthonormal, and right-handed. Such coordinate systems have three
axes, usually called x, y, and z. Each axis is normalized (unit length) and orthogonal
(perpendicular) to the other two. Now, if we first place the x and y axes so that they meet
at the origin at right angles (90°), we have two possibilities to orient the z axis so that it
is perpendicular to both x and y. These choices make the coordinate system either right-handed
or left-handed; Figure 2.2 shows two formulations of the right-handed choice.
[Figure 2.1 diagram: object coordinates → modelview matrix → eye coordinates → projection matrix → clip coordinates → normalized device coordinates → viewport and depth range → window coordinates.]
Figure 2.1: Summary of the coordinate system transformations from vertex definition all the way to the frame buffer.
[Figure 2.2 diagram: two drawings of the X, Y, and Z axes.]
Figure 2.2: Two different ways to visualize a right-handed, orthogonal 3D coordinate system. Left: the thumb, index finger,
and middle finger of the right hand are assigned the axes x, y, and z, in that order. The positive direction of each axis is pointed
to by the corresponding finger. Right: we grab the z axis with the right hand so that the thumb extends toward the positive
direction; the other fingers then indicate the direction of positive rotation angles on the xy-plane.
A coordinate system is always defined with respect to some other coordinate system,
except for the global world coordinate system. For example, the coordinate system of a
room might have its origin at the southwest corner, with the x axis pointing east, y pointing
north, and z upward. A chair in the room might have its own coordinate system, its
origin at its center of mass, and its axes aligned with the chair’s axes of symmetry. When
the chair is moved in the room, its coordinate system moves and may reorient with respect
to the parent coordinate system (that of the room).
2.1.1 VECTORS AND POINTS
A 3D point is a location in space, in a 3D coordinate system. We can find a point p with
coordinates $\begin{bmatrix} p_x & p_y & p_z \end{bmatrix}$ by starting from the origin (at $\begin{bmatrix} 0 & 0 & 0 \end{bmatrix}$) and moving the
distance $p_x$ along the x axis, from there the distance $p_y$ along y, and finally the distance
$p_z$ along z.
Two points define a line segment between them, three points define a triangle with corners
at those points, and several interconnected triangles can be used to define the surface
of an object. By placing many such objects into a world coordinate system, we define a
virtual world. Then we only need to position and orient an imaginary camera to define a
viewpoint into the world, and finally let the graphics engine create an image. If we wish
to animate the world, we have to move either the camera or some of the points, or both,
before rendering the next frame.
When we use points to define geometric entities such as triangles, we often call those
points vertices. We may also expand the definition of a vertex to include any other data
that are associated with that surface point, such as a color.
Besides points, we also need vectors to represent surface normals, viewing directions, light
directions, and so on. A vector v is a displacement, a difference of two points; it has no
position, but does have a direction and a length. Similar to points, vectors can be represented
by three coordinates. The vector $\mathbf{v}_{ab}$, which is a displacement from point a to point
b, has coordinates $\begin{bmatrix} b_x - a_x & b_y - a_y & b_z - a_z \end{bmatrix}$. It is also possible to treat a point as if it
were a vector from the origin to the point itself.
The sum of two vectors is another vector: $\mathbf{a} + \mathbf{b} = \begin{bmatrix} a_x + b_x & a_y + b_y & a_z + b_z \end{bmatrix}$. If you add
a vector to a point, the result is a new point that has been displaced by the vector. Vectors
can also be multiplied by a scalar: $s\mathbf{a} = \begin{bmatrix} s a_x & s a_y & s a_z \end{bmatrix}$. Subtraction is simply an addition
where one of the vectors has been multiplied by −1.
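The point and vector arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the helper names `sub`, `add`, and `scale` and the use of plain tuples are assumptions made here, not part of any API discussed in this book.

```python
# Points and vectors stored as plain 3-tuples (an illustrative choice).

def sub(b, a):
    """Displacement from point a to point b, i.e., the vector v_ab = b - a."""
    return (b[0] - a[0], b[1] - a[1], b[2] - a[2])

def add(a, b):
    """Sum of two vectors, or a point displaced by a vector."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def scale(s, a):
    """Vector a multiplied by the scalar s."""
    return (s * a[0], s * a[1], s * a[2])

a = (1.0, 2.0, 3.0)    # point a
b = (4.0, 6.0, 3.0)    # point b
v_ab = sub(b, a)       # displacement from a to b
moved = add(a, v_ab)   # displacing a by v_ab lands on b

# Subtraction expressed as addition of a vector multiplied by -1.
same = add(b, scale(-1.0, a))
```

Here `moved` equals `b`, and `same` equals `v_ab`, matching the statement that subtraction is an addition with one operand multiplied by −1.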
2.1.2 VECTOR PRODUCTS
There are two ways to multiply two 3D vectors. The dot product or scalar product of vectors
a and b can be defined in two different but equivalent ways:
$$\mathbf{a} \cdot \mathbf{b} = a_x b_x + a_y b_y + a_z b_z \tag{2.1}$$
$$\mathbf{a} \cdot \mathbf{b} = \cos(\theta)\,\|\mathbf{a}\|\,\|\mathbf{b}\| \tag{2.2}$$
The first definition is algebraic, using the vector coordinates. The latter definition is
geometric, and is based on the lengths of the two vectors (||a|| and ||b||), and the smallest
angle between them (θ). An important property related to the angle is that when the
vectors are orthogonal, the cosine term and therefore the whole expression goes to zero.
This is illustrated in Figure 2.3.
The dot product allows us to compute the length, or norm, of a vector. We first compute
the dot product of the vector with itself using the algebraic formula: a · a. We then
note that θ = 0 and therefore cos(θ) = 1. Now, taking the square root of Equation (2.2)
yields the norm:
$$\|\mathbf{a}\| = \sqrt{\mathbf{a} \cdot \mathbf{a}}. \tag{2.3}$$
We can then normalize the vector so that it becomes unit length:
$$\hat{\mathbf{a}} = \mathbf{a} / \|\mathbf{a}\|. \tag{2.4}$$
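As a minimal sketch of Equations (2.1) through (2.4), the following Python functions compute the dot product, the norm, a normalized vector, and the angle recovered from the geometric definition. The function names are illustrative assumptions, and the zero-length check is an added safeguard that the text does not discuss.

```python
import math

def dot(a, b):
    """Algebraic dot product, Equation (2.1)."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def norm(a):
    """Length of a vector, Equation (2.3)."""
    return math.sqrt(dot(a, a))

def normalize(a):
    """Unit-length version of a, Equation (2.4)."""
    n = norm(a)
    if n == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return (a[0] / n, a[1] / n, a[2] / n)

def angle_between(a, b):
    """Smallest angle in radians, solved from Equation (2.2)."""
    c = dot(a, b) / (norm(a) * norm(b))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.acos(max(-1.0, min(1.0, c)))

x_axis = (1.0, 0.0, 0.0)
y_axis = (0.0, 1.0, 0.0)
v = (3.0, 4.0, 0.0)
```

With these inputs, `dot(x_axis, y_axis)` is zero because the axes are orthogonal, `norm(v)` is 5, and `normalize(v)` has unit length.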
The other way to multiply two vectors in 3D is called the cross product. While the dot
product can be done in any coordinate system, the cross product only exists in 3D. The
cross product creates a new vector,
$$\mathbf{a} \times \mathbf{b} = \begin{bmatrix} a_y b_z - a_z b_y & a_z b_x - a_x b_z & a_x b_y - a_y b_x \end{bmatrix}, \tag{2.5}$$
which is perpendicular to both a and b; see Figure 2.3. The new vector is also right-handed
with respect to a and b in the same way as shown in Figure 2.2. The length of the new
vector is sin(θ)||a||||b||. If a and b are parallel (θ = 0° or θ = 180°), the result is zero.
Finally, reversing the order of multiplication flips the sign of the result:
$$\mathbf{a} \times \mathbf{b} = -\mathbf{b} \times \mathbf{a}. \tag{2.6}$$
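The properties of the cross product listed above are easy to check numerically. The sketch below implements Equation (2.5) directly; the function names and the sample vectors are assumptions chosen for illustration.

```python
def dot(a, b):
    """Dot product, Equation (2.1)."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Cross product, Equation (2.5)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

x_axis = (1, 0, 0)
y_axis = (0, 1, 0)
z_from_xy = cross(x_axis, y_axis)   # right-handed: x cross y gives +z
z_flipped = cross(y_axis, x_axis)   # reversed order flips the sign, Equation (2.6)

a = (1, 2, 3)
b = (4, 5, 6)
n = cross(a, b)                     # perpendicular to both a and b
parallel = cross(a, (2, 4, 6))      # parallel inputs give the zero vector
```

Taking the dot product of `n` with either `a` or `b` gives zero, which confirms that the result is perpendicular to both inputs.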
[Figure 2.3 diagram: vectors a and b shown for the cases a · b > 0, a · b = 0, and a · b < 0, and the cross product a × b.]
Figure 2.3: The dot product produces a positive number when the vectors form an acute angle (less
than 90°), zero when they are perpendicular (exactly 90°), and negative when the angle is obtuse
(greater than 90°). The cross product defines a third vector that is in a right-hand orientation and
perpendicular to both vectors.
2.1.3 HOMOGENEOUS COORDINATES
Representing both points and direction vectors with three coordinates can be confusing.
Homogeneous coordinates are a useful tool to make the distinction explicit. We simply add
a fourth coordinate (w): if w = 0, we have a direction, otherwise a location.
If we have a homogeneous point $\begin{bmatrix} h_x & h_y & h_z & h_w \end{bmatrix}$, we get the corresponding 3D point by
dividing the components by $h_w$. If $h_w = 0$ we would get a point infinitely far away, which
we interpret as a direction toward the point $\begin{bmatrix} h_x & h_y & h_z \end{bmatrix}$. Conversely, we can homogenize
the point $\begin{bmatrix} p_x & p_y & p_z \end{bmatrix}$ by adding a fourth component: $\begin{bmatrix} p_x & p_y & p_z & 1 \end{bmatrix}$. In fact, we can
use any non-zero w, and all such $\begin{bmatrix} w p_x & w p_y & w p_z & w \end{bmatrix}$ correspond to the same 3D point.
We can also see that with normalized homogeneous coordinates (for which w is either
1 or 0), taking a difference of two points creates a direction vector (w becomes 1 − 1 = 0),
and adding a direction vector to a point displaces the point by the vector and yields a new
point (w becomes 1 + 0 = 1).
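The bookkeeping role of the w component can be illustrated with a short Python sketch. The helper names below are hypothetical, and tuples of four floats stand in for homogeneous vectors.

```python
def homogenize_point(p):
    """A location gets w = 1."""
    return (p[0], p[1], p[2], 1.0)

def homogenize_direction(d):
    """A direction gets w = 0."""
    return (d[0], d[1], d[2], 0.0)

def to_3d_point(h):
    """Divide by w to recover the 3D point; w = 0 has no finite location."""
    x, y, z, w = h
    if w == 0.0:
        raise ValueError("w = 0 denotes a direction, not a location")
    return (x / w, y / w, z / w)

def add4(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub4(a, b):
    return tuple(x - y for x, y in zip(a, b))

p = homogenize_point((1.0, 2.0, 3.0))
q = homogenize_point((4.0, 6.0, 3.0))
d = sub4(q, p)   # point minus point: w becomes 1 - 1 = 0, a direction
r = add4(p, d)   # point plus direction: w becomes 1 + 0 = 1, a point

# Any non-zero w describes the same 3D point after the division.
same_point = to_3d_point((2.0, 4.0, 6.0, 2.0))
```

Here `same_point` equals the 3D coordinates of `p`, even though its homogeneous representation was scaled by 2.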
There is another, even more important, reason for adopting homogeneous 4D coordinates
instead of the more familiar 3D coordinates. They allow us to express all linear 3D
transformations using a 4 × 4 matrix that operates on 4 × 1 homogeneous vectors. This
representation is powerful enough to express translations, rotations, scalings, shearings,
and even perspective and parallel projections.
2.2 MATRICES
A 4 × 4 matrix M has components $m_{ij}$, where i stands for the row and j stands for the
column:
$$\mathbf{M} = \begin{bmatrix} m_{00} & m_{01} & m_{02} & m_{03} \\ m_{10} & m_{11} & m_{12} & m_{13} \\ m_{20} & m_{21} & m_{22} & m_{23} \\ m_{30} & m_{31} & m_{32} & m_{33} \end{bmatrix}, \tag{2.7}$$
while a column vector v has components $v_i$:
$$\mathbf{v} = \begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} v_0 & v_1 & v_2 & v_3 \end{bmatrix}^T. \tag{2.8}$$
The transpose operation above converts a row vector to a column vector, and vice versa. We
will generally use column vectors in the rest of this book, but will write them in transposed
form: $\mathbf{v} = \begin{bmatrix} v_0 & v_1 & v_2 & v_3 \end{bmatrix}^T$. On a matrix $\mathbf{M} = [m_{ij}]$, transposition produces a matrix
that is mirrored with respect to the diagonal: $\mathbf{M}^T = [m_{ji}]$, that is, columns are switched
with rows.
2.2.1 MATRIX PRODUCTS
A matrix times a vector produces a new vector. Directions and positions are both transformed
by multiplying the corresponding homogeneous vector v with a transformation
matrix M as $\mathbf{v}' = \mathbf{M}\mathbf{v}$. Each component of this column vector $\mathbf{v}'$ is obtained by taking a
dot product of a row of M with v; the first row ($\mathbf{M}_{0\bullet}$) producing the first component, the
second row ($\mathbf{M}_{1\bullet}$) producing the second component, and so on:
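$$\mathbf{v}' = \mathbf{M}\mathbf{v} = \begin{bmatrix} \begin{bmatrix} m_{00} & m_{01} & m_{02} & m_{03} \end{bmatrix} \cdot \mathbf{v} \\ \begin{bmatrix} m_{10} & m_{11} & m_{12} & m_{13} \end{bmatrix} \cdot \mathbf{v} \\ \begin{bmatrix} m_{20} & m_{21} & m_{22} & m_{23} \end{bmatrix} \cdot \mathbf{v} \\ \begin{bmatrix} m_{30} & m_{31} & m_{32} & m_{33} \end{bmatrix} \cdot \mathbf{v} \end{bmatrix}. \tag{2.9}$$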
Note that for this to work, M needs to have as many columns as v has rows.
An alternative, and often more useful way when trying to understand the geometric meaning
of the matrix product, is to think of M being composed of four column vectors
$\mathbf{M}_{\bullet 0}, \ldots, \mathbf{M}_{\bullet 3}$, each being multiplied by the corresponding component of v, and finally
being added up:
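$$\mathbf{v}' = \mathbf{M}\mathbf{v} = v_0 \begin{bmatrix} m_{00} \\ m_{10} \\ m_{20} \\ m_{30} \end{bmatrix} + v_1 \begin{bmatrix} m_{01} \\ m_{11} \\ m_{21} \\ m_{31} \end{bmatrix} + v_2 \begin{bmatrix} m_{02} \\ m_{12} \\ m_{22} \\ m_{32} \end{bmatrix} + v_3 \begin{bmatrix} m_{03} \\ m_{13} \\ m_{23} \\ m_{33} \end{bmatrix}. \tag{2.10}$$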
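The two views of the matrix-vector product, Equations (2.9) and (2.10), must give the same result. The following Python sketch computes both; the function names, the nested-list matrix layout (`M[i][j]` with row i and column j), and the sample numbers are assumptions made for illustration.

```python
def mat_vec_rows(M, v):
    """Equation (2.9): each output component is a row of M dotted with v."""
    return [sum(M[i][j] * v[j] for j in range(4)) for i in range(4)]

def mat_vec_columns(M, v):
    """Equation (2.10): sum of the columns of M, each scaled by a component of v."""
    result = [0.0, 0.0, 0.0, 0.0]
    for j in range(4):          # for every column of M ...
        for i in range(4):      # ... add v[j] times that column
            result[i] += v[j] * M[i][j]
    return result

M = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
v = [1, 0, 2, 1]

by_rows = mat_vec_rows(M, v)
by_columns = mat_vec_columns(M, v)
```

Both calls produce the vector with components 11, 27, 43, and 59. The column view makes it easy to see that a zero component of v (here the second one) removes the corresponding column of M from the result entirely.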
The product of two matrices, on the other hand, produces another matrix, which can be
obtained from several products of a matrix and a vector. Simply break the columns of the
rightmost matrix apart into several column vectors, multiply each of them by the matrix
on the left, and join the results into columns of the resulting matrix:
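$$\mathbf{A}\mathbf{B} = \begin{bmatrix} \mathbf{A}(\mathbf{B}_{\bullet 0}) & \mathbf{A}(\mathbf{B}_{\bullet 1}) & \mathbf{A}(\mathbf{B}_{\bullet 2}) & \mathbf{A}(\mathbf{B}_{\bullet 3}) \end{bmatrix}. \tag{2.11}$$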
Note that in general matrix multiplication does not commute, that is, the order of
multiplication is important (AB ≠ BA). The transpose of a product is the product of
transposes, but in the reverse order:
$$(\mathbf{A}\mathbf{B})^T = \mathbf{B}^T \mathbf{A}^T. \tag{2.12}$$
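A small Python sketch can confirm the column-by-column construction of Equation (2.11), the lack of commutativity, and the transpose rule of Equation (2.12). The function names are illustrative assumptions, and 2 × 2 matrices are used only to keep the numbers short; the same code works for 4 × 4 matrices.

```python
def transpose(M):
    """Switch rows with columns."""
    return [list(column) for column in zip(*M)]

def mat_vec(A, v):
    """Matrix times column vector: one dot product per row of A."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def mat_mul(A, B):
    """Equation (2.11): multiply A by each column of B, then join the columns."""
    product_columns = [mat_vec(A, column) for column in zip(*B)]
    return transpose(product_columns)   # list of columns -> list of rows

A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]

AB = mat_mul(A, B)
BA = mat_mul(B, A)
lhs = transpose(AB)                          # (AB)^T
rhs = mat_mul(transpose(B), transpose(A))    # B^T A^T
```

In this example `AB` swaps the columns of A while `BA` swaps its rows, so the two products differ, and `lhs` equals `rhs` as Equation (2.12) states.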
Now we are ready to express the dot product as a matrix multiplication:
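$$\mathbf{a} \cdot \mathbf{b} = \mathbf{a}^T \mathbf{b} = \begin{bmatrix} a_0 & a_1 & a_2 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix}, \tag{2.13}$$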
that is, transpose a into a row vector and multiply it with a column vector b.
2.2.2 IDENTITY AND INVERSE
The number one is special in the sense that when any number is multiplied with it, that
number remains unchanged (1 · a = a), and for any number other than zero there is an
inverse that produces one ($a \cdot \frac{1}{a} = a a^{-1} = 1$). For matrices, we have an identity matrix:
$$\mathbf{I} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{2.14}$$
A matrix multiplied by the identity matrix remains unchanged (M = IM = MI). If a
matrix M has an inverse we denote it by $\mathbf{M}^{-1}$, and the matrix multiplied with its inverse
yields identity: $\mathbf{M}\mathbf{M}^{-1} = \mathbf{M}^{-1}\mathbf{M} = \mathbf{I}$. Only square matrices, for which the number of
rows equals the number of columns, can have an inverse, and only the matrices where all
columns are linearly independent have inverses.
The inverse of a product of matrices is the product of inverses, in reverse order:
$$(\mathbf{A}\mathbf{B})^{-1} = \mathbf{B}^{-1} \mathbf{A}^{-1}. \tag{2.15}$$
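Let us check: $\mathbf{A}\mathbf{B}(\mathbf{A}\mathbf{B})^{-1} = \mathbf{A}\mathbf{B}\mathbf{B}^{-1}\mathbf{A}^{-1} = \mathbf{A}\mathbf{I}\mathbf{A}^{-1} = \mathbf{A}\mathbf{A}^{-1} = \mathbf{I}$. We will give the inverses of
most transformations that we introduce, but in a general case you may need to use a
numerical method such as Gauss-Jordan elimination to calculate the inverse [Str03].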
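For readers who want to see what such a numerical method might look like, here is a minimal Gauss-Jordan sketch in Python. It is one possible formulation rather than a reference implementation: the partial pivoting step, the `eps` tolerance, and all function names are assumptions made here, and production code would normally rely on a tested linear algebra library.

```python
def mat_mul(A, B):
    """Plain matrix product, used here only to verify the inverse."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def invert(M, eps=1e-12):
    """Invert a square matrix with Gauss-Jordan elimination and partial pivoting."""
    n = len(M)
    # Build the augmented matrix [M | I].
    aug = [[float(x) for x in M[i]] + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < eps:
            raise ValueError("matrix is singular: columns are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot element becomes 1.
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in aug]

M = [[2, 0, 0, 1],
     [0, 1, 0, 2],
     [0, 0, 4, 3],
     [0, 0, 0, 1]]
M_inv = invert(M)
check = mat_mul(M, M_inv)   # should be the identity matrix
```

A matrix whose columns are linearly dependent, such as one with two identical columns, makes `invert` raise `ValueError`, in line with the condition for invertibility given above.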
As discussed earlier, we can use 4 × 4 matrices to represent various transformations. In
particular, you can interpret every matrix as transforming a vertex to a new coordinate
system. If $\mathbf{M}_{ow}$ transforms a vertex from its local coordinate system, the object coordinates,
to world coordinates ($\mathbf{v}' = \mathbf{M}_{ow}\mathbf{v}$), its inverse performs the transformation from world
coordinates to object coordinates ($\mathbf{v} = \mathbf{M}_{ow}^{-1}\mathbf{v}' = \mathbf{M}_{wo}\mathbf{v}'$), that is, $\mathbf{M}_{ow}^{-1} = \mathbf{M}_{wo}$.
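To make the round trip between object and world coordinates concrete, the sketch below uses a hypothetical object that sits at (5, 0, 2) in the world without any rotation. The specific matrices are assumptions chosen for this example: the fourth column of `M_ow` holds the object's position, and `M_wo` simply moves back by the same amount.

```python
def mat_vec(M, v):
    """Matrix times column vector: one dot product per row of M."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Assumed object-to-world matrix: shifts object coordinates by (5, 0, 2).
M_ow = [[1.0, 0.0, 0.0, 5.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 2.0],
        [0.0, 0.0, 0.0, 1.0]]

# Its inverse, the world-to-object matrix: shifts by (-5, 0, -2).
M_wo = [[1.0, 0.0, 0.0, -5.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, -2.0],
        [0.0, 0.0, 0.0, 1.0]]

v_object = [1.0, 1.0, 1.0, 1.0]     # a point (w = 1) in object coordinates
v_world = mat_vec(M_ow, v_object)   # v' = M_ow v
v_back = mat_vec(M_wo, v_world)     # v = M_wo v'

up_object = [0.0, 0.0, 1.0, 0.0]    # a direction (w = 0)
up_world = mat_vec(M_ow, up_object)
```

The point ends up at (6, 1, 3) in world coordinates and returns to (1, 1, 1) after applying `M_wo`. The direction is unchanged by `M_ow`, because its w component of zero multiplies the fourth column away.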
2.2.3 COMPOUND TRANSFORMATIONS
Transformation matrices can be compounded. If $\mathbf{M}_{ow}$ transforms a vertex from object
coordinates to world coordinates, and $\mathbf{M}_{we}$ transforms from world coordinates to eye