CSE 592
Data Compression
Fall 2006
Lecture 14:
Transform Coding
JPEG

Idea of Transform Coding
1) Transform the input pixels x0,x1,...,xN-1 into coefficients c0,c1,...,cN-1 (real values):
– The coefficients should have the property that most of them are near zero.
– Therefore most of the "energy" is compacted into a few coefficients.
2) Scalar quantize the coefficients:
– This is bit allocation!
– Important coefficients should have more quantization levels (that is, they are represented with more accuracy).
3) Entropy encode the quantized values.
Decoding
1) Entropy decode the quantized values.
2) Compute approximate coefficients c'0,c'1,...,c'N-1 from the quantized values.
3) Inverse transform c'0,c'1,...,c'N-1 to x'0,x'1,...,x'N-1, which is (we hope) a good approximation of the original x0,x1,...,xN-1.

Block Diagram of Transform Coding
Encoder: input x → transform → coefficients c → quantization (bit allocation) → symbols s → entropy coding → bit stream b
Decoder: bit stream b → entropy decoding → symbols s → decode symbols → coefficients c' → inverse transform → output x'
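The encoder and decoder stages can be sketched end to end on a two-pixel input. This is a minimal illustration, not the lecture's code: the 2-point sum/difference matrix, the single step size of 8, and all function names are assumptions chosen for the example, and the entropy coding stage is left out.

```python
import math

# Assumed 2-point orthonormal transform (sum/difference), for illustration only.
S = 1 / math.sqrt(2)
A = [[S, S], [S, -S]]

def transform(x):
    """c = A x"""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def inverse_transform(c):
    """x = A^T c (valid because A is orthonormal)"""
    n = len(c)
    return [sum(A[i][j] * c[i] for i in range(n)) for j in range(n)]

def quantize(c, step):
    """Map each coefficient to the label of the nearest multiple of step."""
    return [math.floor(v / step + 0.5) for v in c]

def dequantize(s, step):
    """Approximate coefficients c' from the labels."""
    return [v * step for v in s]

def encode(x, step):
    return quantize(transform(x), step)        # entropy coding omitted

def decode(s, step):
    return inverse_transform(dequantize(s, step))

x = [100, 104]
s = encode(x, step=8)           # the second label is 0: the energy sits in one coefficient
x_approx = decode(s, step=8)    # close to x, but not equal (quantization is lossy)
```

Only the quantization step loses information; the transform and its inverse are exact up to floating-point rounding.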
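Mathematical Properties of Transforms
• Linear Transformation:
– Defined by a real N x N matrix A = (aij)
$$
\begin{bmatrix} a_{00} & \cdots & a_{0,N-1} \\ \vdots & & \vdots \\ a_{N-1,0} & \cdots & a_{N-1,N-1} \end{bmatrix}
\begin{bmatrix} x_0 \\ \vdots \\ x_{N-1} \end{bmatrix}
=
\begin{bmatrix} c_0 \\ \vdots \\ c_{N-1} \end{bmatrix}
$$
• Orthonormality: $A^{-1} = A^T$ (transpose)

Why Coefficients?
$$A^T c = x$$
$$
\begin{bmatrix} a_{00} & \cdots & a_{N-1,0} \\ \vdots & & \vdots \\ a_{0,N-1} & \cdots & a_{N-1,N-1} \end{bmatrix}
\begin{bmatrix} c_0 \\ \vdots \\ c_{N-1} \end{bmatrix}
=
\begin{bmatrix} x_0 \\ \vdots \\ x_{N-1} \end{bmatrix}
$$
$$
c_0 \begin{bmatrix} a_{00} \\ \vdots \\ a_{0,N-1} \end{bmatrix}
+ \cdots +
c_{N-1} \begin{bmatrix} a_{N-1,0} \\ \vdots \\ a_{N-1,N-1} \end{bmatrix}
=
\begin{bmatrix} x_0 \\ \vdots \\ x_{N-1} \end{bmatrix}
$$
The column vectors in the sum are the basis vectors (the rows of A); c0,...,cN-1 are the coefficients.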
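Why Orthonormality?
• The energy of the data equals the energy of the coefficients:
$$
\sum_{i=0}^{N-1} c_i^2 = c^T c = (Ax)^T (Ax) = (x^T A^T)(Ax) = x^T (A^T A) x = x^T x = \sum_{i=0}^{N-1} x_i^2
$$

Example (with Compaction)
$$
A = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
$$
• Orthonormal: $A^T = A = A^{-1}$, since $A^2 = I \Rightarrow A^{-1} = A$
• Compaction:
$$
\frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} b \\ b \end{bmatrix} = \begin{bmatrix} \sqrt{2}\, b \\ 0 \end{bmatrix}
$$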
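The energy identity and the compaction example can be checked numerically. The sketch below uses the 2x2 matrix from the example; the value b = 3 and the helper names are arbitrary choices for illustration.

```python
import math

S = 1 / math.sqrt(2)
A = [[S, S], [S, -S]]          # the 2x2 orthonormal matrix from the example

def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(m * t for m, t in zip(row, v)) for row in M]

def energy(v):
    """Sum of squares of the entries."""
    return sum(t * t for t in v)

b = 3.0                        # arbitrary test value
x = [b, b]
c = matvec(A, x)               # [sqrt(2) * b, 0]: all energy in the first coefficient
x_back = matvec(A, c)          # A is its own inverse, so this recovers x
```

Both `energy(x)` and `energy(c)` equal 2b², which is the identity derived above applied to this input.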
Discrete Cosine Transform (1D)
$$
a_{ij} =
\begin{cases}
\sqrt{\dfrac{1}{N}} & i = 0 \\[2ex]
\sqrt{\dfrac{2}{N}} \cos \dfrac{(2j+1)\, i \pi}{2N} & i > 0
\end{cases}
$$
N = 4:
$$
A = \begin{bmatrix}
.5 & .5 & .5 & .5 \\
.65328 & .270598 & -.270598 & -.65328 \\
.5 & -.5 & -.5 & .5 \\
.270598 & -.65328 & .65328 & -.270598
\end{bmatrix}
$$

Basis Vectors (1D)
[Figure: plots of the four basis vectors for N = 4, labeled row 0 through row 3.]
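The N = 4 matrix can be reproduced directly from the formula for aij. The following is a small sketch; the function names `dct_matrix` and `is_orthonormal` are made up for this example.

```python
import math

def dct_matrix(n):
    """Build the n x n DCT matrix; row i is the i-th basis vector."""
    rows = []
    for i in range(n):
        scale = math.sqrt(1 / n) if i == 0 else math.sqrt(2 / n)
        # For i = 0 the cosine is 1, so the row is constant sqrt(1/n).
        rows.append([scale * math.cos((2 * j + 1) * i * math.pi / (2 * n))
                     for j in range(n)])
    return rows

def is_orthonormal(A, tol=1e-9):
    """Check that every pair of rows has dot product 1 (same row) or 0."""
    n = len(A)
    for i in range(n):
        for k in range(n):
            dot = sum(A[i][j] * A[k][j] for j in range(n))
            if abs(dot - (1.0 if i == k else 0.0)) > tol:
                return False
    return True

A4 = dct_matrix(4)   # row 1 is approximately [.65328, .270598, -.270598, -.65328]
```

`is_orthonormal(dct_matrix(8))` also holds, which is the case JPEG relies on for its 8x8 blocks.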
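Decomposition in Terms of Basis Vectors
$$
c_0 \begin{bmatrix} .5 \\ .5 \\ .5 \\ .5 \end{bmatrix}
+ c_1 \begin{bmatrix} .653281 \\ .270598 \\ -.270598 \\ -.653281 \end{bmatrix}
+ c_2 \begin{bmatrix} .5 \\ -.5 \\ -.5 \\ .5 \end{bmatrix}
+ c_3 \begin{bmatrix} .270598 \\ -.653281 \\ .653281 \\ -.270598 \end{bmatrix}
= \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix}
$$
• DC coefficient: c0
• AC coefficients: c1, c2, c3

Block Transform
[Figure: an image divided into 8x8 blocks.]
Each 8x8 block is individually coded.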
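2-Dimensional Block Transform
Block of pixels X and transform A:
$$
X = \begin{bmatrix}
x_{00} & x_{01} & x_{02} & x_{03} \\
x_{10} & x_{11} & x_{12} & x_{13} \\
x_{20} & x_{21} & x_{22} & x_{23} \\
x_{30} & x_{31} & x_{32} & x_{33}
\end{bmatrix}
\qquad
A = \begin{bmatrix}
a_{00} & a_{01} & a_{02} & a_{03} \\
a_{10} & a_{11} & a_{12} & a_{13} \\
a_{20} & a_{21} & a_{22} & a_{23} \\
a_{30} & a_{31} & a_{32} & a_{33}
\end{bmatrix}
$$
Transform rows:
$$r_{ij} = \sum_{k=0}^{N-1} a_{jk}\, x_{ik}$$
Transform columns:
$$
c_{ij} = \sum_{m=0}^{N-1} a_{im}\, r_{mj}
= \sum_{m=0}^{N-1} a_{im} \sum_{k=0}^{N-1} a_{jk}\, x_{mk}
= \sum_{m=0}^{N-1} \sum_{k=0}^{N-1} a_{im}\, a_{jk}\, x_{mk}
$$
Summary: $C = A X A^T$

8x8 DCT Basis Vectors
[Figure: the 8x8 DCT basis vectors.]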
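The summary C = AXA^T, together with its inverse X = A^T C A (which follows from orthonormality), can be sketched as below. The flat 4x4 test block and the helper names are assumptions for illustration; a flat block is used because its only nonzero coefficient should be the DC term.

```python
import math

def dct_matrix(n):
    """n x n DCT matrix, as defined for the 1D transform."""
    return [[(math.sqrt(1 / n) if i == 0 else math.sqrt(2 / n))
             * math.cos((2 * j + 1) * i * math.pi / (2 * n))
             for j in range(n)] for i in range(n)]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def forward_2d(X, A):
    """C = A X A^T: transform the rows, then the columns."""
    return matmul(matmul(A, X), transpose(A))

def inverse_2d(C, A):
    """X = A^T C A, using A^-1 = A^T."""
    return matmul(matmul(transpose(A), C), A)

A = dct_matrix(4)
X = [[10.0] * 4 for _ in range(4)]   # flat block, chosen for illustration
C = forward_2d(X, A)                 # only C[0][0] (the DC coefficient) is nonzero
X_back = inverse_2d(C, A)
```

For this block C[0][0] is 40 (N times the pixel value) and every AC coefficient is zero up to rounding.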
Importance of Coefficients
• The DC coefficient is (in general) the most important.
• The AC coefficients (in general) become less important as they are farther from the DC coefficient.
• One example bit allocation (bits per coefficient):
8 7 5 3 2 1 0 0
7 5 3 2 1 0 0 0
5 3 2 1 0 0 0 0
3 2 1 0 0 0 0 0
2 1 0 0 0 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
– Compression: 65 bits for 64 pixels ≈ 1.02 bpp

Quantization Tables
• For an nxn block we construct an nxn matrix Q:
– Qij indicates how large the quantization interval is for coefficient cij.
• Encode cij with the label:
sij = ⌊cij/Qij + 0.5⌋
• Decode sij to:
c'ij = sijQij.

Example Quantization
• c = 54.2, Q = 24: s = ⌊54.2/24 + 0.5⌋ = 2, c' = 2 ⋅ 24 = 48
• c = 54.2, Q = 12: s = ⌊54.2/12 + 0.5⌋ = 5, c' = 5 ⋅ 12 = 60
• c = 54.2, Q = 6: s = ⌊54.2/6 + 0.5⌋ = 9, c' = 9 ⋅ 6 = 54

Example Quantization Table (JPEG)
16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99
• Increase the bit rate = halve the size of quantization intervals.
• Decrease the bit rate = double the size of quantization intervals.
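The label and reconstruction rules, s = ⌊c/Q + 0.5⌋ and c' = sQ, can be sketched as follows, reusing the value c = 54.2 from the quantization example. The function names are hypothetical.

```python
import math

def quantize(c, q):
    """Label of coefficient c for quantization interval q (round to nearest)."""
    return math.floor(c / q + 0.5)

def dequantize(s, q):
    """Reconstructed coefficient c' for label s."""
    return s * q

def quantize_block(C, Q):
    """Apply the rule entry by entry to a coefficient block C with table Q."""
    return [[quantize(c, q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(C, Q)]

# (Q, label, reconstruction) for c = 54.2
examples = [(q, quantize(54.2, q), dequantize(quantize(54.2, q), q))
            for q in (24, 12, 6)]
# [(24, 2, 48), (12, 5, 60), (6, 9, 54)]
```

The reconstruction error is at most Q/2, so halving the interval halves the worst-case error at the cost of roughly one more bit per nonzero label.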
JPEG (1987)
• Let P = [pij], 0 ≤ i,j < N be an image with 0 ≤ pij < 256.
• Center the pixels around zero:
– xij = pij - 128
• Code 8x8 blocks of X using the DCT.
• Choose a quantization table:
– The table depends on the desired bit rate and is built into JPEG.
• Quantize the coefficients according to the quantization table:
– The quantization symbols can be positive or negative.
• Transmit the labels (in a coded way) for each block.

Block Transmission
• DC coefficient:
– DC coefficients don't change much from block to neighboring block. Hence, their labels change even less.
– Predictive coding with differences is used to code the DC label.
• AC coefficients:
– Do a zig-zag coding.
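The pixel centering step and the difference coding of DC labels can be sketched as below. The DC label values are invented for illustration, and starting the prediction from 0 for the first block is an assumption of this sketch.

```python
def center(block):
    """x_ij = p_ij - 128 for a block of 8-bit pixels."""
    return [[p - 128 for p in row] for row in block]

def dc_differences(dc_labels, start=0):
    """Replace each DC label by its difference from the previous block's label."""
    diffs, prev = [], start        # start=0 is an assumed initial prediction
    for label in dc_labels:
        diffs.append(label - prev)
        prev = label
    return diffs

def dc_from_differences(diffs, start=0):
    """Decoder side: rebuild the DC labels by accumulating the differences."""
    labels, prev = [], start
    for d in diffs:
        prev += d
        labels.append(prev)
    return labels

dc_labels = [52, 53, 53, 51, 50]       # hypothetical labels of neighboring blocks
diffs = dc_differences(dc_labels)      # [52, 1, 0, -2, -1]
```

After the first block the differences are small numbers clustered around zero, which an entropy coder can represent with few bits.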
Zig-Zag Coding
• AC labels are coded in zig-zag order:
– Usually use a special entropy coding to take advantage of the ordering of the bit allocation (quantization).

Example Block of Labels
DC   2   0   0   0   0   0   0
-8   0   0   0   0   0   0   0
 3   1   1   0   0   0   0   0
 1   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
 0   0   0   0   0   0   0   0
Coding order of AC labels:
2 –8 3 0 0 0 0 1 1 0 0 1 0 0 .....
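The zig-zag scan can be generated by walking the anti-diagonals of the block in alternating directions. The sketch below applies it to the example block of labels; the DC entry is set to 0 as a placeholder because the example does not give it, and the function name is hypothetical.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zig-zag order, DC first."""
    order = []
    for d in range(2 * n - 1):                 # d = row + col picks one anti-diagonal
        cells = [(r, d - r) for r in range(n) if 0 <= d - r < n]
        # Odd diagonals are walked from top-right to bottom-left, even ones reversed.
        order.extend(cells if d % 2 else reversed(cells))
    return order

block = [[0] * 8 for _ in range(8)]            # block[0][0] is a placeholder DC label
block[0][1] = 2
block[1][0] = -8
block[2][0], block[2][1], block[2][2] = 3, 1, 1
block[3][0] = 1

# Skip position (0, 0): the DC label is coded separately.
ac_labels = [block[r][c] for r, c in zigzag_order()[1:]]
```

Here `ac_labels` starts with 2, -8, 3, 0, 0, 0, 0, 1, 1, 0, 0, 1 and is zero from there on, matching the coding order given in the example.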
Coding Labels
• Categories of labels
– 1: {0}
– 2: {-1, 1}
– 3: {-3,-2,2,3}
– 4: {-7,-6,-5,-4,4,5,6,7}
• Label is indicated by two numbers C,B
• Examples:
label   C,B
0       1
2       3, 2
-4      4, 3

Coding AC Label Sequence
• A symbol has three parts (Z,C,B)
– Z for number of zeros preceding a label, 0 ≤ Z ≤ 15
– C for the category of the label
– B for a C-1 bit number for the actual label
• End of Block symbol (EOB) means the rest of the block is zeros. EOB = (0,0,-)
• Example: 2 –8 3 0 0 0 0 1 1 0 0 1 0 0 .....
(0,3,2)(0,5,7)(0,3,3)(4,2,1)(0,2,1)(2,2,1)(0,0,-)
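The mapping from an AC label sequence to (Z,C,B) symbols can be sketched as follows. It assumes that each category is listed in increasing order and that B is the position of the label in that list, which agrees with the examples above (2 → 3,2 and -4 → 4,3). The function names are hypothetical, runs of more than 15 zeros are rejected rather than handled, and EOB is always appended, as in the example.

```python
def category(label):
    """C = 1 for label 0, otherwise 1 + number of bits in |label|."""
    return 1 if label == 0 else abs(label).bit_length() + 1

def index_in_category(label):
    """B: position of a nonzero label within its category, in increasing order."""
    c = category(label)
    if label > 0:
        return label                       # positives fill the upper half of the list
    return label + (1 << (c - 1)) - 1      # e.g. -8 in category 5 -> 7

def ac_symbols(ac_labels):
    """Turn zig-zag ordered AC labels into (Z, C, B) symbols ending with EOB."""
    last = max((i for i, v in enumerate(ac_labels) if v != 0), default=-1)
    symbols, zeros = [], 0
    for v in ac_labels[:last + 1]:
        if v == 0:
            zeros += 1
            if zeros > 15:
                raise ValueError("runs longer than 15 zeros are not handled here")
        else:
            symbols.append((zeros, category(v), index_in_category(v)))
            zeros = 0
    symbols.append((0, 0, None))           # EOB: the rest of the block is zeros
    return symbols

ac_labels = [2, -8, 3, 0, 0, 0, 0, 1, 1, 0, 0, 1] + [0] * 51
symbols = ac_symbols(ac_labels)
```

For the example sequence this yields (0,3,2), (0,5,7), (0,3,3), (4,2,1), (0,2,1), (2,2,1) and then EOB, with `None` standing in for the "-" of the EOB symbol.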
Coding AC Label Sequence
• Z,C have a prefix code
• B is a C-1 bit number

Partial prefix code table (rows Z, columns C):
Z \ C   0      1        2          3
0       1010   00       01         100
1              1100     11011      1111001
2              11100    11111001   1111110111
3              111010   111110111  111111110101

(0,3,2) (0,5,7) (0,3,3) (4,2,1) (0,2,1) (2,2,1) (0,0,-)
100 10 11010 0111 100 11 1111111000 1 01 1 11111001 1 1010
46 bits representing 64 pixels = .72 bpp

Notes on Transform Coding
• Video Coding:
– MPEG uses DCT.
– H.263 and H.261 use DCT.
• Audio Coding:
– MP3 = MPEG-1 Layer 3 uses a modified DCT (MDCT).
• Alternative Transforms:
– Lapped transforms remove some of the blocking artifacts.
– Wavelet transforms do not need to use blocks at all.
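As a check on the 46-bit count in the Coding AC Label Sequence example, the symbol sequence can be turned into bits as sketched below. The dictionary holds only the (Z,C) codes that the example needs, read off the example's own bit string; it is not a complete code table, and the names are hypothetical.

```python
# (Z, C) prefix codes used by the example sequence (partial, not the full table).
ZC_CODE = {
    (0, 0): "1010",          # EOB
    (0, 2): "01",
    (0, 3): "100",
    (0, 5): "11010",
    (2, 2): "11111001",
    (4, 2): "1111111000",
}

def encode_symbol(z, c, b):
    """Prefix code for (Z, C), followed by B written as a (C-1)-bit number."""
    bits = ZC_CODE[(z, c)]
    if c > 1:                                  # EOB and category 1 carry no B bits
        bits += format(b, "0{}b".format(c - 1))
    return bits

symbols = [(0, 3, 2), (0, 5, 7), (0, 3, 3), (4, 2, 1),
           (0, 2, 1), (2, 2, 1), (0, 0, None)]
bitstream = "".join(encode_symbol(z, c, b) for z, c, b in symbols)
bits_per_pixel = len(bitstream) / 64           # 46 / 64, about .72 bpp
```

The count covers the AC labels of one block only; the DC label is coded separately by difference coding.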