Feig, E. “Complexity Theory of Transforms in Signal Processing”
Digital Signal Processing Handbook
Ed. Vijay K. Madisetti and Douglas B. Williams
Boca Raton: CRC Press LLC, 1999
© 1999 by CRC Press LLC
9 Complexity Theory of Transforms in Signal Processing
Ephraim Feig
IBM Corporation
T.J. Watson Research Center
9.1 Introduction
9.2 One-Dimensional DFTs
9.3 Multidimensional DFTs
9.4 One-Dimensional DCTs
9.5 Multidimensional DCTs
9.6 Nonstandard Models and Problems
References
9.1 Introduction
Complexity theory of computation attempts to determine how “inherently” difficult certain
tasks are. For example, how inherently complex is the task of computing an inner product of two
vectors of length N? Certainly one can compute the inner product $\sum_{j=1}^{N} x_j y_j$ by computing the
N products $x_j y_j$ and then summing them. But can one compute this inner product with fewer
than N multiplications? The answer is no, but the proof of this assertion is no trivial matter. One
first abstracts and defines the notions of the algorithm and its components (such as addition and
multiplication); then a theorem is proven that any algorithm for computing a bilinear form which
uses K multiplications can be transformed to a quadratic algorithm (some algorithm of a very special
form, which uses no divisions, and whose multiplications only compute quadratic forms) which uses
at most K multiplications [20]; and finally a proof by induction on the length N of the summands in
the inner product is made to obtain the lower bound result [6, 13, 22, 25]. We will not present the
details here; we just want to let the reader know that the process for even proving what seems to be
an intuitive result is quite complex.
Consider next the more complex task of computing the product of an N point vector by an M × N
matrix. This corresponds to the task of computing M separate inner products of N-point vectors. It
is tempting to conclude that this task requires MN multiplications. But we should not
jump to hasty conclusions. First, the M inner products are separate, but not independent (the term is
used loosely, and not in any linear algebra sense). After all, the second factor in the M inner products
is always the same. It turns out [6, 22, 25] that our intuition is indeed correct this time as well. And
the proof is really not much more difficult than the proof of the complexity result for inner products.
In fact, once the general machinery is built, the proof is a slight extension of the previous case. So
far intuition has proved accurate.
In complexity theory one learns early on to be skeptical of intuitions. An early surprising result in
complexity theory (and to date still one of its most remarkable) contradicts the intuitive guess that
computing the product of two 2 × 2 matrices requires 8 multiplications. Remarkably, Strassen [21]
has shown that it can be done with 7 multiplications. His algorithm is very nonintuitive; I am not
aware of any good algebraic explanation for it except for the assertion that the mathematical identities
which define the algorithm indeed are valid. It can also be shown [15] that 7 is the minimum number
of multiplications required for the task.
The consequences of Strassen’s algorithm for general matrix multiplication tasks are profound.
The task of computing the product of two 4 × 4 matrices with real entries can be viewed as the task of
computing the product of two 2 × 2 matrices whose entries are themselves 2 × 2 matrices. Each of the 7
multiplications in Strassen’s algorithm now becomes a multiplication of 2 × 2 matrices requiring 7 real multiplications
plus a number of additions; and each addition in Strassen’s algorithm becomes an addition of 2 × 2
matrices, which can be done with 4 real additions. This process of obtaining algorithms for large
problems, which are built up from smaller ones in a structured manner, is called the “nesting” procedure [25].
It is a very powerful tool in both complexity theory and algorithm design. It is a special
form of recursion.
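The nesting idea can be illustrated with a minimal Python sketch. It uses the commonly published form of Strassen's seven products, applied recursively to matrices of size 2^k × 2^k. The helper names (`add`, `sub`, `quadrants`, `strassen`, `naive`) and the `counter` list used to tally scalar multiplications are assumptions of this sketch, not notation from the text.

```python
def add(X, Y):
    """Entrywise sum of two equally sized square matrices (lists of rows)."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def sub(X, Y):
    """Entrywise difference of two equally sized square matrices."""
    return [[a - b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def quadrants(X):
    """Split a 2h x 2h matrix into its four h x h blocks."""
    h = len(X) // 2
    return ([row[:h] for row in X[:h]], [row[h:] for row in X[:h]],
            [row[:h] for row in X[h:]], [row[h:] for row in X[h:]])

def strassen(X, Y, counter):
    """Multiply two 2^k x 2^k matrices; counter[0] tallies scalar multiplications."""
    if len(X) == 1:
        counter[0] += 1
        return [[X[0][0] * Y[0][0]]]
    a11, a12, a21, a22 = quadrants(X)
    b11, b12, b21, b22 = quadrants(Y)
    # The seven block products; each one is itself computed by nesting.
    m1 = strassen(add(a11, a22), add(b11, b22), counter)
    m2 = strassen(add(a21, a22), b11, counter)
    m3 = strassen(a11, sub(b12, b22), counter)
    m4 = strassen(a22, sub(b21, b11), counter)
    m5 = strassen(add(a11, a12), b22, counter)
    m6 = strassen(sub(a21, a11), add(b11, b12), counter)
    m7 = strassen(sub(a12, a22), add(b21, b22), counter)
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    # Reassemble the four blocks into one matrix.
    top = [r + s for r, s in zip(c11, c12)]
    bottom = [r + s for r, s in zip(c21, c22)]
    return top + bottom

def naive(X, Y):
    """Reference product straight from the definition, used only for checking."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

Calling `strassen` on two 4 × 4 matrices with `counter = [0]` leaves `counter[0]` at 49 = 7 · 7, against 64 scalar multiplications for the definition-based product.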
The set of N × N matrices forms a noncommutative algebra. A branch of complexity theory
called “multiplicative complexity theory” is quite well established for certain relatively few algebras,
and wide open for the rest. In this theory complexity is measured by the number of “essential
multiplications.” Given an algebra over a field F, an algorithm is a sequence of arithmetic operations
in the algebra. A multiplication is called essential if neither factor is an element in F . If one of the
factors in a multiplication is an element in F, the operation is called a scaling.
Consider an algebra of dimension N over a field F, with basis $b_1, \ldots, b_N$. An algorithm for
computing the product of two elements $\sum_{j=1}^{N} f_j b_j$ and $\sum_{j=1}^{N} g_j b_j$ with $f_j, g_j \in F$ is called
bilinear if every multiplication in the algorithm is of the form $L_1(f_1, \ldots, f_N) * L_2(g_1, \ldots, g_N)$,
where $L_1$ and $L_2$ are linear forms and $*$ is the product in the algebra, and it uses no divisions.
Because none of the arithmetic operations in bilinear algorithms relies on the commutative nature
of the underlying field, these algorithms can be used, via the nesting process, to build recursively
algorithms for noncommutative algebras of increasingly large dimensions, which are built from the
smaller algebras via the tensor product. For example, the algebra of 4 × 4 matrices (over some field F;
I will stop adding this necessary assumption, as it will be obvious from context) is isomorphic to the
tensor product of the algebra of 2 × 2 matrices with itself. Likewise, the algebra of 16 × 16 matrices
is isomorphic to the tensor product of the algebra of 4 × 4 matrices with itself. And this proceeds to
higher and higher dimensions.
Suppose we have a bilinear algorithm for computing the product in an algebra $T_1$ of dimension D,
which uses M multiplications and A additions (including subtractions) and S scalings. The algebra
$T_2 = T_1 \otimes T_1$ has dimension $D^2$. By the nesting procedure we can obtain an algorithm for computing
the product in $T_2$ which uses M multiplications of elements in $T_1$, A additions of elements in $T_1$,
and S scalings of elements in $T_1$. Each multiplication in $T_1$ requires M multiplications, A additions,
and S scalings; each addition in $T_1$ requires D additions; and each scaling in $T_1$ requires D scalings.
Hence, the total computational requirements for this new algorithm are $M^2$ multiplications, $A(M + D)$
additions, and $S(M + D)$ scalings. If the nesting procedure is continued to yield an algorithm for
the product in the $D^4$-dimensional algebra $T_4 = T_2 \otimes T_2$, then its computational requirements
would be $M^4$ multiplications, $A(M + D)(M^2 + D^2)$ additions, and $S(M + D)(M^2 + D^2)$ scalings.
One more iteration would yield an algorithm for the $D^8$-dimensional algebra $T_8 = T_4 \otimes T_4$, which
uses $M^8$ multiplications, $A(M + D)(M^2 + D^2)(M^4 + D^4)$ additions, and
$S(M + D)(M^2 + D^2)(M^4 + D^4)$ scalings. The general pattern should be apparent by now. We see
that the growth of the number of operations (the high order term, that is) is governed by M and not
by A or S. A major goal of complexity theory is the understanding of computational requirements
as problem sizes increase, and nesting is the natural way of building algorithms for larger and larger
problems. We see one reason why counting multiplications (as opposed to all arithmetic operations)
became so important in complexity theory. (Historically, in the early days multiplications were
indeed much more expensive than additions.)
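The operation counts under repeated nesting can be tabulated with a short Python sketch. The function name `nest_counts` and its argument order are assumptions made for illustration; the recurrence itself is the one stated in the paragraph above.

```python
def nest_counts(M, A, S, D, levels):
    """Operation counts after `levels` rounds of nesting an algorithm with itself.

    One round takes an algorithm with M multiplications, A additions and
    S scalings for an algebra of dimension D to an algorithm for the tensor
    square, which has dimension D * D.
    """
    for _ in range(levels):
        # The right-hand side is evaluated with the old values of M, A, S, D.
        M, A, S, D = M * M, A * (M + D), S * (M + D), D * D
    return M, A, S, D

# Assuming the commonly quoted 18 additions (and no scalings) for Strassen's
# 2 x 2 algorithm, with D = 4 the dimension of the algebra of 2 x 2 matrices:
mults, adds, scalings, dim = nest_counts(7, 18, 0, 4, 1)
print(mults, adds, scalings, dim)  # counts for 4 x 4 matrices
```

After three rounds the addition count agrees with the closed form A(M + D)(M^2 + D^2)(M^4 + D^4), and the multiplication count M^8 dominates whenever M exceeds D.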
Algebras of polynomials are important in signal processing; filtering can be viewed as polynomial
multiplications. The product of two polynomials of degrees $d_1$ and $d_2$ can be computed with $d_1 + d_2 - 1$
multiplications. Furthermore, it is rather easy to prove (a straightforward dimension argument)
that this is the minimal number of multiplications necessary for this computation. Algorithms
which compute these products with these numbers of multiplications (so-called optimal algorithms)
are obtained using Lagrange interpolation techniques. For even moderate values of $d_j$, they use
inordinately many additions and scalings. Indeed, they use $(d_1 + d_2 - 3)(d_1 + d_2 - 2)$ additions,
and half as many scalings. So these algorithms are not very practical, but they are of theoretical
interest. Also of interest is the asymptotic complexity of polynomial products. They can be computed
by embedding them in cyclic convolutions of sizes at most twice as long. Using FFT techniques,
these can be achieved with order D log D arithmetic operations, where D is the maximum of the
degrees. With optimal algorithms, while the number of (essential) multiplications is linear, the total
number of operations is quadratic. If nesting is used, then the asymptotic behavior of the number
of multiplications is also quadratic.
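The interpolation approach can be sketched in a few lines of Python. The sketch makes two assumptions: $d_1$ and $d_2$ are read as the numbers of coefficients of the two factors (so the product has $d_1 + d_2 - 1$ coefficients, consistent with the 2D − 1 count used for convolution algebras), and the evaluation points 0, 1, 2, ... are an arbitrary choice. The function names are hypothetical. Only the pointwise products involve both inputs; every other multiplication is by a fixed rational constant, that is, a scaling.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Horner evaluation; multiplying by the fixed point x is a scaling."""
    acc = Fraction(0)
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc

def times_linear(poly, root):
    """Multiply poly(u) by (u - root), where root is a fixed constant."""
    out = [Fraction(0)] * (len(poly) + 1)
    for i, c in enumerate(poly):
        out[i + 1] += c
        out[i] -= root * c
    return out

def interpolation_product(f, g):
    """Return (coefficients of f*g, number of essential multiplications).

    f and g are coefficient lists, lowest degree first.
    """
    n = len(f) + len(g) - 1
    points = [Fraction(k) for k in range(n)]
    # The only multiplications in which both factors depend on the inputs.
    values = [evaluate(f, x) * evaluate(g, x) for x in points]
    result = [Fraction(0)] * n
    for k, xk in enumerate(points):
        # Lagrange basis polynomial for the point xk (depends only on the points).
        basis, denom = [Fraction(1)], Fraction(1)
        for m, xm in enumerate(points):
            if m != k:
                basis = times_linear(basis, xm)
                denom *= xk - xm
        for i, c in enumerate(basis):
            result[i] += values[k] * (c / denom)
    return result, n

coeffs, essential = interpolation_product([1, 2, 3], [4, 5])
print([int(c) for c in coeffs], essential)
```

The Lagrange basis polynomials depend only on the chosen points, so in practice they would be precomputed. The sketch recomputes them for brevity.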
Convolution algebras are derived from algebras of polynomials. Given a polynomial P (u) of
degree D, one can define an algebra of dimension D whose elements are all polynomials of degree
less than D, with addition defined in the standard way and multiplication taken modulo P(u). Such
algebras are called convolution algebras. For polynomials $P(u) = u^D - 1$, the algebras are cyclic
convolutions of dimension D. For polynomials $P(u) = u^D + 1$, these algebras are called signed-cyclic
convolutions. The product of two polynomials modulo P (u) can be obtained from the product of
the two polynomials without any extra essential multiplications. Hence, if the degree of P (u) is D,
then the product modulo P (u) can be done with 2D − 1 multiplications. But can it be done with
fewer multiplications?
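A small Python sketch shows why the reduction modulo P(u) costs no essential multiplications for these two families: it only folds the high coefficients back with a sign. The function names and the `sign` parameter are assumptions of this sketch.

```python
def poly_mul(f, g):
    """Ordinary polynomial product; coefficient lists, lowest degree first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def reduce_mod(h, D, sign):
    """Reduce h(u) modulo u^D - 1 (sign=+1) or u^D + 1 (sign=-1).

    Since u^D is congruent to +1 or -1, each coefficient of index i >= D is
    added to or subtracted from the coefficient of index i - D.
    Assumes len(h) <= 2 * D, as is the case for a product of two
    polynomials of degree less than D.
    """
    out = list(h[:D]) + [0] * max(0, D - len(h))
    for i in range(D, len(h)):
        out[i - D] += sign * h[i]
    return out

def cyclic(f, g):
    """Product in the convolution algebra modulo u^D - 1, with D = len(f)."""
    return reduce_mod(poly_mul(f, g), len(f), +1)

def signed_cyclic(f, g):
    """Product in the convolution algebra modulo u^D + 1, with D = len(f)."""
    return reduce_mod(poly_mul(f, g), len(f), -1)

print(cyclic([1, 2, 3], [4, 5, 6]))
print(signed_cyclic([1, 2, 3], [4, 5, 6]))
```

Here `poly_mul` is the schoolbook product and is used only for clarity; any algorithm for the polynomial product, including one with 2D − 1 multiplications, could be substituted without changing the reduction step.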
Whereas complexity theory has huge gaps in almost all areas, it has triumphed in convolution
algebras. The minimum number of multiplications required to compute a product in an algebra
is called the multiplicative complexity of the algebra. The multiplicative complexity of convolution
algebras (over infinite fields) is completely determined [22]. If P(u) factors (over the base field; the
role of the field will be discussed in greater detail soon) to a product of k irreducible polynomials,
then the multiplicative complexity of the algebra is 2D − k. So if P(u) is irreducible, then the answer
to the question in the previous paragraph is no. Otherwise, it is yes.
The above complexity result for convolution algebras is a sharp bound. It is a lower bound in that
every algorithm for computing the product in the algebra requires at least 2D − k multiplications,
where k is the number of factors of the defining polynomial P (u). It is also an upper bound, in
that there are algorithms which actually achieve it. Let us factor $P(u) = \prod_j P_j(u)$ into a product of
irreducible polynomials (here we see the role of the field; more about this soon). Then the convolution
algebra modulo P(u) is isomorphic to a direct sum of algebras modulo $P_j(u)$; the isomorphism is via
the Chinese remainder theorem. The multiplicative complexities of the direct summands are $2d_j - 1$,
where $d_j$ are the degrees of $P_j(u)$; these are sharp bounds. The algorithm for the algebra modulo
P (u) is derived from these smaller algorithms; because of the isomorphism, putting them all together
requires no extra multiplications. The proof that this is a lower bound, first given by Winograd [23],
is quite complicated.
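The smallest nontrivial case makes the construction concrete. Over Q, $u^2 - 1 = (u - 1)(u + 1)$, so D = 2 and k = 2, and the bound 2D − k gives 2 multiplications for the cyclic convolution of length 2. The Python sketch below is an illustration under these assumptions; the function names are hypothetical. The second function relies on the standard fact that $u^D - 1$ factors over Q into one irreducible (cyclotomic) polynomial per divisor of D.

```python
from fractions import Fraction

def cyclic2(f, g):
    """Product modulo u^2 - 1 using 2 essential multiplications.

    Reducing modulo u - 1 and u + 1 amounts to substituting u = 1 and
    u = -1; each direct summand has dimension 1 and needs one multiplication.
    """
    f0, f1 = f
    g0, g1 = g
    m1 = (f0 + f1) * (g0 + g1)  # product in the summand modulo u - 1
    m2 = (f0 - f1) * (g0 - g1)  # product in the summand modulo u + 1
    half = Fraction(1, 2)       # recombination uses only scalings by 1/2
    return [half * (m1 + m2), half * (m1 - m2)]

def cyclic_complexity_over_Q(D):
    """2D - k for P(u) = u^D - 1 over Q, where k is the number of divisors of D."""
    k = sum(1 for d in range(1, D + 1) if D % d == 0)
    return 2 * D - k

print(cyclic2([3, 5], [7, 2]))
print([cyclic_complexity_over_Q(D) for D in (2, 3, 4, 6)])
```

The two products `m1` and `m2` are the complete algorithms for the two direct summands, and recombining them requires no further essential multiplications, as the text asserts.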
The above result is an example of a “direct sum theorem.” If an algebra is decomposable to a direct
sum of subalgebras, then clearly the multiplicative complexity of the algebra is less than or equal to
the sum of the multiplicative complexities of the summands. In some (relatively rare) circumstances
equality can be shown. The example of convolution algebras is such a case. The results for convolution
algebras are very strong. Winograd has shown that every minimal algorithm for computing products
in a convolution algebra is bilinear and is a direct sum algorithm. The latter means that the algorithm
actually computes a minimal algorithm for each direct summand and then combines these results
without any extra essential multiplications to yield the product in the algebra itself.
Things get interesting when we start considering algebras which are tensor products of convolution
algebras (these are called multi-dimensional convolution algebras). A simple example already is
enlightening. Consider the algebra C of polynomial multiplications modulo $u^2 + 1$ over the rationals
Q; this algebra is called the Gaussian rationals. The polynomial $u^2 + 1$ is irreducible over Q (the
algebra is a field), so by the previous result, its multiplicative complexity is 3. The nesting procedure
would yield an algorithm for the product in C ⊗ C which uses 9 multiplications. But it can in fact be
computed with 6 multiplications. The reason is an old theorem, probably due to Kronecker
(though I cannot find the original proof); the reference I like best is Adrian Albert’s book [1]. The
theorem asserts that the tensor product of fields is isomorphic to a direct sum of fields, and the
proof of the theorem is actually a construction of this isomorphism. For our example, the theorem
yields that the tensor product C ⊗ C is isomorphic to a direct sum of two copies of C. The product
in C ⊗ C can, therefore, be computed by computing separately the product in each of the two
direct summands, each with 3 multiplications, and the final result can be obtained without any
more essential multiplications. The explicit isomorphism was presented to the complexity theory
community by Winograd [22]. Since the example is sufficiently simple to work out, and the results
so fundamental to much of our later discussions, we will present it here explicitly.
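The six-multiplication count can be checked with a minimal Python sketch. It assumes that an element of C ⊗ C is stored as coordinates (x0, x1, x2, x3) standing for x0 + x1 u + x2 v + x3 uv with u^2 = v^2 = −1, and that the two direct summands are reached by the substitutions v → u and v → −u. The three-multiplication product in C is the classical one usually attributed to Gauss. All function names are hypothetical.

```python
from fractions import Fraction

def gauss3(p, q, counter):
    """(a + b u)(c + d u) modulo u^2 + 1 with 3 essential multiplications."""
    (a, b), (c, d) = p, q
    k1 = c * (a + b)
    k2 = a * (d - c)
    k3 = b * (c + d)
    counter[0] += 3
    return (k1 - k3, k1 + k2)  # (ac - bd, ad + bc)

def tensor_product_6(x, y, counter):
    """Product in C (x) C via two independent products in C."""
    x0, x1, x2, x3 = x
    y0, y1, y2, y3 = y
    # Images of x and y in the two direct summands.
    p = gauss3((x0 - x3, x1 + x2), (y0 - y3, y1 + y2), counter)  # v -> u
    q = gauss3((x0 + x3, x1 - x2), (y0 + y3, y1 - y2), counter)  # v -> -u
    # Invert the isomorphism; only additions and scalings by 1/2 are needed.
    half = Fraction(1, 2)
    a0, a1 = half * (p[0] + q[0]), half * (p[1] + q[1])
    c0, c1 = half * (p[0] - q[0]), half * (p[1] - q[1])
    return (a0, a1, c1, -c0)

def tensor_product_direct(x, y):
    """Reference product from the definition (16 multiplications)."""
    out = [0, 0, 0, 0]
    for s in range(4):      # bit 0 of s: exponent of u; bit 1: exponent of v
        for t in range(4):
            ju, jv = (s & 1) + (t & 1), (s >> 1) + (t >> 1)
            sign = (-1) ** (ju // 2 + jv // 2)  # u^2 = v^2 = -1
            out[(ju % 2) + 2 * (jv % 2)] += sign * x[s] * y[t]
    return tuple(out)

counter = [0]
print(tensor_product_6((1, 2, 3, 4), (5, 6, 7, 8), counter), counter[0])
```

The two calls to `gauss3` are independent of each other, which is the direct sum structure in action: each summand gets its own minimal algorithm and the recombination is free of essential multiplications.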
Consider A, the polynomial ring modulo $u^2 + 1$ over Q. This is a field of dimension 2 over
Q, and it has the matrix representation (called its regular representation) given by
$$\rho(a + bu) = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}. \tag{9.1}$$
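While for all $b \neq 0$ the matrix above is not diagonalizable over Q, the field (algebra) is diagonalizable
over the complexes. Namely,
$$\begin{pmatrix} 1 & i \\ 1 & -i \end{pmatrix} \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \begin{pmatrix} 1 & i \\ 1 & -i \end{pmatrix}^{-1} = \begin{pmatrix} a + ib & 0 \\ 0 & a - ib \end{pmatrix}. \tag{9.2}$$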
The elements 1 and i of A correspond (in the regular representation) in the tensor algebra A ⊗ A to
the matrices
$$\rho(1) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \tag{9.3}$$
and
$$\rho(i) = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \tag{9.4}$$
respectively. Hence, the 4 × 4 matrix
$$R = \begin{pmatrix} \rho(1) & \rho(i) \\ \rho(1) & \rho(-i) \end{pmatrix} \tag{9.5}$$
diagonalizes the algebra A ⊗ A. Explicitly, we can compute
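$$\begin{pmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & -1 & 0 \end{pmatrix} \begin{pmatrix} x_0 & -x_1 & -x_2 & x_3 \\ x_1 & x_0 & -x_3 & -x_2 \\ x_2 & -x_3 & x_0 & -x_1 \\ x_3 & x_2 & x_1 & x_0 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & -1 & 0 \end{pmatrix}^{-1} = \begin{pmatrix} y_0 & -y_1 & 0 & 0 \\ y_1 & y_0 & 0 & 0 \\ 0 & 0 & y_2 & -y_3 \\ 0 & 0 & y_3 & y_2 \end{pmatrix}, \tag{9.6}$$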