
Image Databases: Search and Retrieval of Digital Imagery
Edited by Vittorio Castelli, Lawrence D. Bergman
Copyright © 2002 John Wiley & Sons, Inc.
ISBNs: 0-471-32116-8 (Hardback); 0-471-22463-4 (Electronic)
9 Transmission of Digital Imagery
JEFFREY W. PERCIVAL
University of Wisconsin, Madison, Wisconsin
VITTORIO CASTELLI
IBM T.J. Watson Research Center, Yorktown Heights, New York
9.1 INTRODUCTION
The transmission of digital imagery can impose significant demands on the
resources of clients, servers, and the networks that connect them. With the explo-
sive growth of the Internet, the need for mechanisms that support rapid browsing
of on-line imagery has become critical.
This is particularly evident in scientific imagery. The increase in availability of
publicly accessible scientific data archives on the Internet has changed the typical
scientist’s expectations about access to data. In the past, scientific procedures
produced proprietary (PI-owned) data sets that, if exchanged at all, were usually
exchanged through magnetic tape. Now, the approach is to generate large on-
line data archives using standardized data formats and allow direct access by
researchers. For example, NASA’s Planetary Data System is a network of archive
sites serving data from a number of solar system missions. The Earth-Observing
System will use eight Distributed Active Archive Centers to provide access to
the very large volume of data to be acquired.
Although transmission bandwidth has been increasing with faster modems
for personal networking, upgrades to the Internet, and the introduction of new
networks such as Internet 2 and the Next Generation Internet, there are still
several reasons why bandwidth for image transmission will continue to be a
limited resource for the foreseeable future.
Perhaps the most striking of these factors is the rapid increase in the resolution of


digital imagery and hence the size of images to be transmitted. A case in point is the
rapid increase in both pixel count and intensity resolution provided by the solid-
state detectors in modern astronomical systems. For example, upgrading from a
640 × 480 8-bit detector to a 2,048 × 2,048 16-bit detector represents a data growth
of about a factor of 30. Now, even 2,048 × 2,048 detectors seem small; new astronomical
instruments use 2,048 × 4,096 detectors in mosaics to build units as large as
8,192 × 8,192 pixels. This is a factor of 400 larger than the previously mentioned
8-bit image. The Sloan Digital Sky Survey will use thirty 2,048 × 2,048 detectors
simultaneously and will produce a 40-terabyte data set during its lifetime.
This phenomenon is not limited to astronomy. For example, medical imaging
detectors are reaching the spatial and intensity resolution of photographic film
and are replacing it. Similarly, Earth-observing satellites with high-resolution
sensors managed by private companies produce images having 12,000 × 12,000 pixels,
which are commercialized for civilian use.
In addition to image volume (both image size and number of images), other
factors are competing for transmission bandwidth, including the continued growth
in demand for access and the competition for bandwidth from other applications
such as telephony and videoconferencing. Therefore, it does not seem unrea-
sonable to expect that the transmission of digital imagery will continue to be a
challenge requiring careful thought and a deft allocation of resources.

The transmission of digital imagery is usually handled through the exchange of
raw, uncompressed files, losslessly compressed files, or files compressed with some
degree of lossiness chosen in advance at the server. The files are first transmitted and
then some visualization program is invoked on the received file. Another type of
transmission, growing in popularity with archives of large digital images, is called
progressive transmission. When an image is progressively transmitted from server
to client, it can be displayed by the client as the data arrive, instead of having to wait
until the transmission is complete. This allows browsing in an archive even over
connections for which the transmission time of a single image may be prohibitive.
This chapter discusses each of these transmission schemes and its effects on the
allocation of resources among server, network, and client.
9.2 BULK TRANSMISSION OF RAW DATA
This is the simplest case to consider. Error-free transmission of raw digital images
is easily done using the file transfer protocol (FTP) on any Internet-style (TCP/IP)
connection. Image compression can be used to reduce the transmission time by
decreasing the total number of bytes to be transmitted.
When used to decrease the volume of transmitted data, the compression usually
needs to be lossless, as further data analysis is often performed on the received
data sets. A lossless compression is exactly reversible, in that the exact value
of each pixel can be recovered by reversing the compression. Many compres-
sion algorithms are lossy, that is, the original pixel values cannot be exactly
recovered from the compressed data. The Joint Photographic Experts Group
(JPEG) [1] and Graphics Interchange Format (GIF) (see Chapter 8) formats are
examples of such algorithms. Lossy compression is universally used for photographic images trans-
mitted across the World Wide Web. Interestingly, it is becoming increasingly
important in transmitting scientific data, especially in applications in which the
images are manually interpreted, rather than processed, and where the bandwidth
between server and client is limited.
Compression exploits redundancy in an image, which can be large for certain

kinds of graphics such as line drawings, vector graphics, and computer-generated
images. Lossless compression of raw digital imagery is far less efficient because
digital images from solid-state detectors contain electronic noise, temperature-
dependent dark counts, fixed-pattern noise, and other artifacts. These effects
reduce the redundancy, for example, by disrupting long runs of pixels that would
otherwise have the same value in the absence of noise. A rule of thumb is that
lossless compression can reduce the size of a digital image by a factor of 2 or 3.
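The effect of noise on compressibility is easy to demonstrate. The following sketch (our illustration, not from the text; it uses a synthetic ramp image, so the clean ratio is far better than the factor of 2 or 3 typical of real imagery) compares lossless compression of the same image with and without simulated sensor noise:

import zlib
import numpy as np

rng = np.random.default_rng(0)

# A smooth synthetic 16-bit "image": a 256 x 256 intensity ramp.
x = np.linspace(0, 1000, 256)
clean = np.add.outer(x, x).astype(np.uint16)

# The same image with simulated electronic noise added.
noisy = np.clip(clean + rng.normal(0, 5, clean.shape), 0, 65535).astype(np.uint16)

for name, img in (("clean", clean), ("noisy", noisy)):
    raw = img.tobytes()
    ratio = len(raw) / len(zlib.compress(raw, 9))
    print(f"{name}: lossless compression ratio {ratio:.1f}")

The noisy version compresses far worse, because the noise disrupts the long runs and smooth gradients that the entropy coder exploits.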
The cost of transmitting compressed images has three components: the cost of
compression, the cost of decompression, and the cost of transmission. The latter
decreases with the effectiveness of the compressed algorithm: the bigger the
achieved compression ratio, the smaller the transmission cost. However, the first
two costs increase with the achieved compression ratio: when comparing two
image-compression algorithms, one usually finds that the one that compresses
better complex is more
1
. Additionally, the computational costs of compressing
and decompressing are quite often similar (although asymmetric schemes exist
where compression is much more expensive than decompression). In an image
database, compression is usually performed once (when the image is ingested into
the database) and therefore its cost is divided over all the transmissions of the
image. Hence, the actual trade-off is between bandwidth and decompression cost,
which depends on the client characteristics. Therefore, the compression algorithm
should be selected with client capabilities in mind.
A number of high-performance, lossless image-compression algorithms exist.
Most use wavelet transforms of one kind or another, although simply using a
wavelet basis is no guarantee that an algorithm is exactly reversible. A well-tested,
fast, exactly reversible, wavelet-based compression program is HCOMPRESS[2].
Source code is available at www.stsci.edu.
Finally, it is easily forgotten in this highly networked world that sometimes
more primitive methods of bulk transmission of large data sets still hold some

sway. For example, the effective bandwidth of a shoebox full of Exabyte tapes
sent by overnight express easily exceeds 100 megabits per second for 24 hours.
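The arithmetic behind this claim is worth a moment. Assuming, for illustration, a box of twenty 60-GB tapes in transit for one day (capacities vary by tape generation, so these numbers are only indicative):

# Effective bandwidth of a shoebox of tapes shipped overnight.
tapes, gb_per_tape = 20, 60          # assumed contents of the shoebox
seconds = 24 * 3600                  # one day in transit
bits = tapes * gb_per_tape * 8e9     # total payload in bits
print(f"{bits / seconds / 1e6:.0f} Mbit/s")  # about 111 Mbit/s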
9.3 PROGRESSIVE TRANSMISSION
Progressive transmission is a scheme in which the image is transmitted from
server to client in such a way that the client can display the image as it arrives,
instead of waiting until all the data have been received.
Progressive transmission fills an important niche between the extremes of
transmitting either raw images in their entirety or irreversibly reduced graphic
products.
¹ This assertion needs to be carefully qualified: to compress well in practice, an algorithm must be
tailored to the characteristics of actual images. For example, a simple algorithm that treats each pixel
as an independent random variable does not perform as well on actual images as a complex method
that accounts for dependencies between neighboring pixels.
Progressive transmission allows users to browse an archive of large digital
images, perhaps searching for some particular features. It also has numerous
applications in scientific fields. Meteorologists often scan large image archives
looking for certain types or percentages of cloud cover. The ability to receive,
examine, and reject an image while receiving only the first 1 percent of its data
is very attractive. Astronomers are experimenting with progressive techniques
for remote observing. They want to check an image for target position, filter
selection, and telescope focus before beginning a long exposure, but the end-to-
end network bandwidth between remote mountain tops and an urban department
of astronomy can be very low. Doing a quality check on a 2,048 × 2,048 pixel, 16-
bit image over a dial-up transmission control protocol (TCP) connection seems
daunting, but it is easily done with progressive image-transmission techniques.
Some progressive transmission situations are forced on a system. The Galileo

probe to Jupiter was originally equipped with a 134,400 bit-per-second transmission
system, which would allow an image to be transmitted in about a minute. The high-
gain antenna failed to deploy, however, resulting in a Jupiter-to-Earth bandwidth
of only about 10 bits per second. Ten days per image was too much! Ground-system
engineers devised a makeshift image browsing system using spatially subsampled
images (called jail bars, typically one or two image rows spaced every 20 rows) to
select images for future transmission. Sending only every twentieth row improves
the transmission time, but the obvious risk of missing smaller-scale structure in
the images is severe. Figure 9.1 shows the discovery image of Dactyl, the small
moon orbiting the asteroid Ida. Had Dactyl been a little smaller, this form of image
browsing might have prevented its discovery.
Ideally, progressive transmission should have the following properties.
• It should present a rough approximation of the original image very quickly. It
should improve the approximation rapidly at first and eventually reconstruct
the original image.
Figure 9.1. Discovery image of Dactyl, a small moon orbiting the asteroid Ida. Had the
moon been a little smaller, it could have been missed in the transmitted data.
• It should capture features at all spatial and intensity scales early in the
transmission, that is, broad, faint features should be captured in the early
stages of transmission as easily as bright, localized features.
• It should support interactive transmission, in which the client can use the
first approximations to select “regions of interest,” which are then scheduled
for transmission at a priority higher than that of the original image. By
“bootstrapping” into a particular region of an image based on an early view,
the client is effectively boosting bandwidth by discarding unneeded bits.
• No bits should be sent twice. As resolution improves from coarse to fine,
even with multiple overlapping regions of interest having been requested,
the server must not squander bandwidth by sending the client information
that it already has.

• It should allow interruption or cancellation by the client, both of which are
likely to occur while browsing images.
• It should be well behaved numerically, approximately preserving the image
statistics (e.g., the pixel-intensity histogram) throughout the transmission.
This allows numerical analysis of a partially transmitted image.
Progressive image transmission is not really about compression. Rather, it is
better viewed as a scheduling problem, in which one wants to know which bits
to send first and which bits can wait until later. Progressive transmission uses the
same algorithms as compression, simply because an algorithm that can tell a
compressor which bits to throw away can also be used to decide which bits
are most important.
9.3.1 Theoretical Considerations
In this section, we show that progressive transmission need not require much
more bandwidth than nonprogressive schemes.
To formulate the problem precisely, we compare a simple nonprogressive
scheme to a simple progressive scheme using Figure 9.2. Let the nonprogressive
scheme be ideal, in the sense that, in order to send an image with distortion (e.g.,
mean-square error, MSE, or Hamming distance) no larger than D_N, it needs to
send R_N bits per pixel, and that the point (R_N, D_N) lies on the rate-distortion
curve [3] defined in Chapter 8, Section 8.3. No other scheme can send fewer bits
and produce an image of the same quality.
The progressive scheme has two stages. In the first, it produces an image
having distortion no larger than D_1 by sending R_1 bits per pixel, and in the
second it improves the quality of the image to D_2 = D_N by sending a further
R_2 bits per pixel. Therefore, both schemes produce an image having the same
quality.
Our wish list for the progressive scheme contains two items: first, we would
like to produce the best possible image during the initial transmission, that is, we
wish the point (R_1, D_1) to lie on the rate-distortion curve, so that R_1 = R(D_1)
(constraint 1); second, we wish to transmit overall the same number of bits R_N
as the nonprogressive scheme, that is, R_1 + R_2 = R_N = R(D_N) = R(D_2)
(constraint 2).
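As a concrete illustration (our example): a memoryless Gaussian source with variance σ² has R(D) = (1/2) log₂(σ²/D) under MSE and is known to be successively refinable, so both constraints can be met at once:

from math import log2

def rate(var, dist):
    # Rate-distortion function of a memoryless Gaussian source under MSE.
    return 0.5 * log2(var / dist)

var = 1.0                  # source variance
d1, d2 = 0.25, 0.0625      # first-stage and final distortions
r1 = rate(var, d1)         # 1.0 bit/pixel buys the coarse image
r_total = rate(var, d2)    # 2.0 bits/pixel buys the final image
r2 = r_total - r1          # the refinement costs exactly the difference
print(r1, r2, r_total)     # 1.0 1.0 2.0 -- no rate penalty for staging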
Figure 9.2. Ideal behavior of a successive refinement system: the points describing the
first and second stages lie on the rate-distortion curve R(D).
In this section, it is shown that it is not always possible to satisfy both
constraints, and that recent results show that constraints 1 and 2 can be relaxed
to R_1 < R(D_1) + 1/2 and R_1 + R_2 ≤ R(D_2) + 1/2, respectively.
The first results are due to Equitz and Cover [4], who showed that the answer
depends on the source (namely, on the statistical characteristics of the image)
and that, although in general the two-stage scheme requires a higher rate than the
one-stage approach, there exist necessary and sufficient conditions under which
R_1 = R(D_1) and R_1 + R_2 = R(D_2) (the sources for which the two equalities
hold for every D_1 and D_2 are called successively refinable²). An interesting
question is whether there are indeed sources that do not satisfy Equitz and
Cover's conditions. Unfortunately, sources that are not successively refinable
do exist: an example of such a source over a discrete alphabet is described in
Ref. [3], whereas Ref. [5] contains an example of a continuous source that is
not successively refinable. This result is somewhat problematic: it seems to
indicate that the rate-distortion curve can be used to measure the performance
of a progressive transmission scheme only for certain types of sources.
² More specifically, a sufficient condition for a source to be successively refinable is that the
original data I, the better approximation I_2, and the coarse approximation I_1 form a Markov
chain in this order, that is, that I_1 is conditionally independent of I given I_2. In simpler terms,
this conditional independence means that, given the finer approximation of the original image,
our uncertainty about the coarser approximation is the same whether or not we are also given
the original image.
The question of which rates are achievable was addressed by Rimoldi [6], who
refined the result of Equitz and Cover: by relaxing the conditions R_1 = R(D_1)
and R_1 + R_2 = R(D_2), he provided conditions under which pairs of rates
R_1 and R_2 can be achieved, given fixed distortions D_1 and D_2. Interestingly,
in Ref. [6], D_1 and D_2 need not be obtained with the same distortion measure.
This is practically relevant: progressive transmission methods in which the early
stages produce high-quality approximations of small portions of the image and
poor-quality renditions of the rest are discussed later. (For example, in
telemedicine, a radiographic image could be transmitted with a method that
quickly provides a high-quality image of the area of interest and a blurry version
of the rest of the image [7], to be improved in later stages. Here, the first
distortion measure could concentrate on the region of interest, whereas
subsequent measures could take into account the image as a whole.)
Although Rimoldi's regions can be used to evaluate the performance of a
progressive transmission scheme, a more recent result [8] provides a simpler
answer. Lastras and Berger showed that, for any source producing independent
and identically distributed samples³, under squared-error distortion, and for any
fixed set of m distortion values D_1 > D_2 > ... > D_m of an m-step progressive
transmission scheme, there exists an m-step code that operates within 1/2 bit
of the rate-distortion curve at all of its steps (Fig. 9.3). This is a powerful result,
which essentially states that the rate-distortion curve can indeed be used to
evaluate a progressive transmission scheme: an algorithm that at some step
achieves a rate that is not within 1/2 bit of the rate-distortion curve is by no
means optimal and can be improved upon.
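Stated compactly (our paraphrase; see Ref. [8] for the precise statement), the guarantee is that the cumulative rate after each step stays within half a bit of the rate-distortion function:

\sum_{j=1}^{k} R_j \;\le\; R(D_k) + \frac{1}{2}, \qquad k = 1, \ldots, m.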
The theory of successive refinements plays the same role for progressive trans-
mission as rate-distortion theory plays for lossy compression.
In particular, it confirms that it is possible, in principle, to construct progressive
transmission schemes that achieve transmission rates comparable to those of
nonprogressive methods.
This theory also provides fundamental limits to what is achievable and guide-
lines to evaluate how well a specific algorithm performs under given assumptions.

Such guidelines are sometimes very general and hence difficult to specialize to
actual algorithms. However, the field is relatively young and very active; some of
the more recent results can be applied to specific categories of progressive trans-
mission schemes, such as the bounds provided in Ref. [9] for multiresolution
coding.
³ Lastras's thesis also provides directions on how to extend the result to stationary ergodic sources.
Figure 9.3. Attainable behavior of a progressive transmission scheme: each stage is
described by a point that lies within 1/2 bit of the rate-distortion curve.
Note, finally, that there is a limitation to the applicability of the theory of
successive refinements to image transmission: a good distortion measure that is
well matched to the human visual system is not known. The implications are
discussed in Section 9.3.3.
9.3.2 Taxonomies of Progressive Transmission Algorithms
Over the years, a large number of progressive transmission schemes have
appeared in the literature. Numerous progressive transmission methods have been
developed starting from a compression scheme, selecting parts of the compressed
data for transmission at each stage, and devising algorithms to reconstruct images
from the information available at the receiver after each transmission stage. The
following taxonomy, proposed by Tsou [10] and widely used in the field, is
well suited to characterizing this class of algorithms because it focuses on the
characteristics of the compression scheme. Tsou's taxonomy divides progressive
transmission approaches into three classes:
Spatial-Domain Techniques. Algorithms belonging to this class compress
images without transforming them. A simple example consists of dividing the
image into bit planes, compressing the bit planes separately, and scheduling the
transmission to send the compressed bit planes in order of significance. Dividing
an 8-bit image I into bit planes consists of creating eight images with pixel
values equal to 0 or 1 — the first image contains the most significant bit of the
pixels of I , the last image contains the least significant bit, and the intermediate
images contain the intermediate bits.
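A minimal sketch of bit-plane decomposition in NumPy (our illustration; a real coder would compress each plane before sending it):

import numpy as np

def bit_planes(img):
    # Split an 8-bit image into binary planes, most significant bit first.
    return [(img >> b) & 1 for b in range(7, -1, -1)]

def reassemble(planes):
    # Reconstruct from however many planes have arrived so far (MSB first).
    img = np.zeros(planes[0].shape, dtype=np.int32)
    for b, plane in zip(range(7, 7 - len(planes), -1), planes):
        img += plane.astype(np.int32) << b
    return img.astype(np.uint8)

Receiving only the first three planes already fixes each pixel to within 32 gray levels of its true value.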
A more interesting example is progressive vector quantization (VQ): the image
I is first vector-quantized (Chapter 8, Section 8.6.2) at a low rate, namely, with
a small codebook, which produces an approximation I_1. The difference I − I_1
between the original image and the first coarse approximation is further quantized,
possibly with a different codebook, to produce I_2, and the process is repeated.
The encoded images I_1, I_2, ... are transmitted in order and progressively
reconstructed at the receiver.
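The following sketch shows the two-stage residual idea, with coarse uniform quantizers standing in for the VQ codebooks to keep it short (our simplification, not the method of Ref. [12]):

import numpy as np

def quantize(x, step):
    # A crude stand-in for one VQ stage: uniform quantization with this step.
    return np.round(x / step) * step

img = np.random.default_rng(1).uniform(0, 255, (64, 64))

i1 = quantize(img, 64)              # stage 1: coarse approximation, sent first
i2 = i1 + quantize(img - i1, 8)     # stage 2: quantized residual refines it

print(np.abs(img - i1).mean())      # error after stage 1
print(np.abs(img - i2).mean())      # much smaller error after stage 2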
Transform-Domain Techniques. Algorithms belonging to this category trans-
form the image to a specific space (such as the frequency domain), and compress
the transform. Examples of this category include progressive JPEG; they are
discussed in Section 9.4.1.
Pyramid-Structured Techniques. This category contains methods that rely on
a multiresolution pyramid, which is a sequence of approximations of the orig-
inal image I at progressively coarser resolution and larger scale (i.e., having
smaller size). The coarsest approximation is losslessly compressed and trans-
mitted; subsequent steps consist of transmitting only the information necessary
to reconstruct the next finer approximation from the received data. Schemes
derived from compression algorithms that rely on subband coding or on the
wavelet transform (Chapter 8, Sections 8.5.3 and 8.8.2) belong to this category,
and in this sense, the current category overlaps the transform-domain techniques.
Numerous progressive transmission algorithms developed in recent years are
not well categorized by Tsou’s taxonomy.
Chee [11] recently proposed a different classification system, which
uses Tsou’s taxonomy as a secondary categorization. Chee’s taxonomy
specifically addresses the transmission mechanism, rather than the compression
characteristics, and contains four classes:
Multistage Residual Methods. This class contains algorithms that progressively
reduce the distortion of the reconstructed image. Chee assigns to this class only
methods that operate on the full-resolution image at each stage: multiresolution-
based algorithms are assigned to the next category. This category includes multi-
stage VQ [12] and the transform-coding method proposed in Ref. [13].

Our discussion of successive refinements in Section 9.3.1 is directly relevant
to this class of methods.
Hierarchical Methods. These algorithms analyze the images at different scales
to process them in a hierarchical fashion. Chee divides this class into nonresidual
coders, residual multiscale coders, and filter-bank coders.
• Nonresidual coders perform a multiscale decomposition of the image and
include quadtree-coders [14,15], binary-tree coders [16], spatial pyramid
coders [17,18], and subsampling pyramids [19].
• Residual coders differ from nonresidual coders in that they compute and
encode the residual image at each level of the decomposition (the difference
between the original image and what is received at that stage). The well-
known Laplacian pyramid can be used to construct a hierarchical residual
coder [20]. A theoretical analysis of this category of coders can be found
in Ref. [21].
• Filter-bank coders include wavelet-based coders and subband coders.
Wavelet-based coders send the lowest resolution version of the image
first and successively transmit the subbands required to produce the
approximation at the immediately higher resolution. A similar approach is
used in subband coders [22]. The theoretical analysis of Ref. [9] is directly
relevant to this group of methods.
Successive Approximation Methods. This class contains methods that progres-
sively refine the precision (e.g., the number of bits) of the reconstructed approx-
imations. Methods that transmit bit planes belong to this category: at each stage
the precision of the reconstructed image increases by 1 bit. Chee assigns to this
category bit-plane methods, tree-structured vector quantizers, full-search quan-
tizers with intermediate codebooks [23], the embedded zerotree wavelet coder
(Chapter 8, Section 8.8.2), and the successive approximation mode of the JPEG
standard (Section 9.4.1.1).
Note that these methods, and in particular, transform-domain methods, are not

guaranteed to monotonically improve the fidelity of the reconstructed image at
each stage, if the fidelity is measured with a single-letter distortion measure (i.e.,
a measure that averages the distortions of individual pixels) [1].
Methods Based on Transmission Sequences. In this category, Chee groups
methods that use a classifier to divide the data into portions, prioritize the order
in which different portions are transmitted, and include a protocol for specifying
transmission order to the receiver. The prioritization process can aim at different
goals, such as reducing the MSE or improving the visual appearance of the
reconstructed image at each step.
In Ref. [11] the author assigns to this class the spectral selection method of
the JPEG standard (Section 9.4.1.1), Efstratiadis's filter-bank coder [24], and
several block-based spatial domain coders [25,26].
9.3.3 Comparing Progressive Transmission Schemes
Although it is possible to compare different progressive transmission
schemes [11], no general guidelines exist. However, the following broad
statements can be made:
• The application field and the characteristics of the data might directly
determine the most appropriate transmission approach. In some cases,
starting from a thumbnail (which progressively increases in size) is
preferable to receiving, for example, the low-contrast full-size image
produced by a bit plane successive approximation algorithm (which becomes
progressively sharper) or the blocky full-size image produced by a multistage
vector quantizer. In other cases, the thumbnail is the least useful of the three
representations. There is no mathematical way of characterizing which of
the three images is more suitable for a specific application domain.
• The theory of successive refinements provides a reference point to assess
and compare bandwidth requirements. It can be very useful for comparing
similar approaches but might not be effective for comparing heterogeneous
methods, such as a multiresolution scheme and a successive approximation

algorithm. Additionally, different compression methods produce different
types of artifacts in intermediate images; the lack of a distortion measure
matched to the human visual system means that an intermediate image with
certain artifacts can be substantially preferable to another with different
artifacts, even though their measured distortion is the same.
• The specific application often imposes constraints on the characteristics of
the associated compression algorithm and consequently reduces the set of
possible candidate transmission schemes. For instance, some applications
might require lossless compression, whereas others can tolerate loss; more-
over, different lossy compression algorithms introduce different artifacts,
whose severity depends on the intended use of the data.
9.4 SOME EXAMPLES
9.4.1 Transform Coding
Because raw digital images are typically difficult to compress, one often resorts
to representing the image in another mathematical space (see Chapter 8). The
Fourier transform (FT, Chapter 16), the discrete cosine transform (DCT, Chap-
ters 8 and 16 ), and the wavelet transform (WT, Chapters 8 and 16) are examples
of transformations that map images into mathematical spaces having properties
that facilitate compression and transmission. They represent images as linear
combinations of appropriately selected functions called basis functions.
The DCT and WT in particular are at the heart of numerous progressive
transmission schemes. The next two sections analyze them in detail.
9.4.1.1 Discrete Cosine Transform Coding. The DCT uses cosines of varying
spatial frequencies as basis functions. Lossy JPEG uses a particular type of DCT
called block-DCT: the image is first divided into nonoverlapping square blocks of
size 8 × 8 pixels, and each tile is independently transformed with DCT. In JPEG,
the reversible DCT step is followed by irreversible quantization of the DCT coef-
ficients. The table used for quantization is not specified by the standard and, in
principle, could be selected to match the requirements of specific applications.

In practice, there is no algorithm to accomplish this task and even the quantiza-
tion tables recommended in the standard for use with photographic images were
constructed by hand, using essentially a trial-and-error approach.
DCT coding supports progressive transmission easily, if not optimally.
Progressive JPEG is now supported in many Web browsers. There are two main
progressive JPEG modes.
The first progressive transmission mode uses a successive approximation
scheme. During the first pass a low-precision approximation of all the DCT
coefficients is transmitted to the client. Using this information the client
can reconstruct a full-sized initial image, but one that will tend to have
numerous artifacts. Subsequent transmissions successively reduce errors in the
approximation of the coefficients and the artifacts in the image. The rationale
for this approach is that the largest coefficients capture a great deal of the image
content even when transmitted at reduced precision.
The second method is a spectral selection approach, which belongs to the
transmission-sequence category. First the server sends the DC coefficient (corre-
sponding to spatial frequency equal to zero) of each image block to the client.
The client can then reconstruct and display a scaled-down version of the original
image, containing one pixel per 8 × 8 block; each pixel holds the average value
of its block. In each subsequent transmission stage the server sends the AC
coefficients required to double the size of the received image, until all the data
are received by the client. This scheme is an effective way of generating
thumbnail images without redundant transmission of bits. The thumbnail can be
used to select, say, the desired image, and at the beginning of the transmission
the client can use the thumbnail to populate the DC locations of an otherwise
empty DCT buffer. The server then begins the full transmission not by repeating
the DC values, which the client already has, but starts right in with the first scan
of low-frequency coefficients. The main problem with using DCTs in a tiled
image for progressive transmission is that the tile size imposes a minimum
resolution on the image as a whole. That is, every 8 × 8 tile contributes to the
first scan regardless of its similarity to its neighbors. Bandwidth is used to
express the DC level of each tile, even when that tile's DC level may be shared
by many of its neighbors: the basis functions of the DCT do not exploit coherence
on scales larger than 8 × 8 pixels.
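The DC-only first pass is easy to emulate, because the DC coefficient of each 8 × 8 block is just (a scaled version of) the block average. A sketch, assuming image dimensions that are multiples of 8:

import numpy as np

def dc_thumbnail(img):
    # One pixel per 8 x 8 block: the block average, i.e., the scaled DC term.
    h, w = img.shape
    blocks = img.reshape(h // 8, 8, w // 8, 8)
    return blocks.mean(axis=(1, 3))

For a 2,048 × 2,048 image this yields a 256 × 256 thumbnail from well under 2 percent of the coefficients.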
9.4.1.2 Integer Wavelet Transform Coding. Using wavelet transforms for
image processing and compression is a subject that is both vast and deep. We will
not delve deeply into the mathematics or algorithms of wavelet transformations
but refer the reader to Chapter 8 for an introduction and for references.
In this section, we restrict ourselves to fast, exactly reversible, integer-based
transforms. It turns out that a number of different wavelet bases meet these
criteria. Therefore, we will discuss in generic terms what such transforms look
like, what properties they lend to progressive transmission, and how they optimize
the use of transmission bandwidth.
In a generic integer wavelet transform, the first pass scans through the image
and forms nearest-neighbor sums and differences of pixels. On a second pass, it
scans through what is essentially a 2 × 2 block average of the original image, one
whose overall dimensions have been reduced by a factor of 2. After forming the
nearest-neighbor sums and differences, the third pass then operates on a 4 × 4
block averaged version of the original. This hierarchical decomposition of the
image lends the wavelet transform its special powers. As the data are transmitted
by bit plane, all wavelet coefficients that capture details too small or faint to be
of interest at a certain stage are equal to zero.
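One level of a simple member of this family, the S transform (an integer Haar variant), illustrates the sums-and-differences structure; this is a minimal sketch for even-length rows, not the exact transform of any particular coder:

import numpy as np

def s_transform_1d(x):
    # Pairwise integer differences and truncated averages; exactly reversible.
    a, b = x[0::2].astype(np.int32), x[1::2].astype(np.int32)
    diff = a - b               # detail: nearest-neighbor difference
    avg = b + (diff >> 1)      # approximation: floor of the pair average
    return avg, diff

def s_inverse_1d(avg, diff):
    # Undo the transform exactly, recovering the original integers.
    b = avg - (diff >> 1)
    a = diff + b
    out = np.empty(avg.size * 2, dtype=np.int32)
    out[0::2], out[1::2] = a, b
    return out

Applying s_transform_1d to rows and then columns, and recursing on the average band, produces the 2 × 2 and 4 × 4 block-averaged passes described above.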

A characteristic appearance of an image that has been wavelet-inverted using
only partially received coefficients is the variation in pixel size: pixels are big
where the image is not changing very much and they are small near edges and
small-scale features where the image is changing rapidly. This is truly dynamic
bandwidth management. Bandwidth is used just where it is needed (expressing
small pixels) and is conserved where not needed (the large, fixed areas in an
image). It is also automatic, in that it falls naturally out of the transform. No
special image processing, such as edge detection or content determination, is
required.
There is an endless variety of integer-based wavelet transforms. They vary
according to small differences in how they calculate their nearest-neighbor sums
and differences. It turns out that there is not a large difference in performance
between them. Figure 9.4 shows the performance of different transform coders,
measured as mean-square error as a function of transmitted data volume. For our
purposes, the horizontal axis is proportional to elapsed transmission time for a
constant bandwidth.
We notice several facts. First, the DCT coder is obviously worse during the
early stages of the transmission. Second, the integer-based wavelet coders
perform pretty much alike. The differences between them are always
much smaller than the difference between their envelope and the performance
of the DCT coder. Note that the DCT coder used for Figure 9.4 is a bit-plane
coder, not the pixel-ordering coders that are officially sanctioned by the JPEG
standard.

Figure 9.4. Mean-squared error as a function of the fraction of the image received, for
a DCT coder, several integer wavelet coders (Haar, S + P, TS, W22S, and W3M), and
no transform.
9.4.2 Color Images
Color images provide an extra point of attack on bandwidth usage: we can exploit
the correlation in content between the separate red, green, and blue color compo-
nents. Any good file compressor, like JPEG, exploits this correlation between
colors as well, but it has particular relevance in progressive transmission.
Color images are typically expressed using three component images, one each
in the colors of red, green, and blue (RGB). Often, there is a strong correlation

in image content between each of the color bands. A classic method of dealing
with this correlation, used for decades in color TV broadcasting, is to convert
the image into a different color space using the transformation
Y = 0.299R + 0.587G + 0.114B
Cb = 128 − 0.168736R − 0.331264G + 0.5B
Cr = 128 + 0.5R − 0.418688G − 0.081312B
and reverse the transformation when displaying the image. (There are many
variants of this transformation; we use this one to illustrate the idea.) The Y
component is called the luminance and can be used as a gray-scale representation
of the original image. This signal is used by black and white TV sets when
receiving a color signal. The other two components represent color differences
and are much less correlated in image content. These “chroma” channels are
often very compressible.
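In code, the forward transform is one matrix product per pixel; a sketch using the coefficients quoted above:

import numpy as np

# Rows give the Y, Cb, and Cr weights for [R, G, B].
M = np.array([[ 0.299,     0.587,     0.114   ],
              [-0.168736, -0.331264,  0.5     ],
              [ 0.5,      -0.418688, -0.081312]])

def rgb_to_ycbcr(rgb):
    # rgb: float array of shape (..., 3); adds the 128 offset to the chroma.
    out = rgb @ M.T
    out[..., 1:] += 128.0
    return out

On typical photographs the Cb and Cr planes vary slowly and compress much better than any of the original R, G, or B planes.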
A good progressive image-transmission system should exploit such correla-
tions between the components of multiband imaging. Such correlation is not
restricted to RGB systems, for example, multiband spectral imaging detectors
in Earth-observing systems often produce highly correlated image components.
There may be multiple groups of correlated components, with the number of
correlated spectral bands in each group being much larger than three. A transfor-
mation similar to that mentioned earlier can be designed for multiband imagery
with any number of components.
9.4.3 Extraction, Coding, and Transmission of Data
We will now step through an actual progressive transmission and discuss the
details of the process. Our intent is to outline the steps and considerations any
good progressive image-transmission system should have, instead of exhaustively
documenting any specific process or step.
The server first ingests the image. The image can be transformed either on-the-
fly or in advance. The choice depends on whether the server has been supplied
with preprocessed, previously transformed images. Preprocessing is used to avoid
the CPU processing required to perform the transformation in real time.
Wavelet transforms produce negative numbers for some pixels, and even if
the absolute value of the coefficient is small, sign extension in 32-bit or 64-
bit integers can generate many unneeded 1-bits that can foil the entropy coders
(Chapter 8). Signs must be handled separately, usually by separating them from
their associated values and dealing with a completely nonnegative transform.
Signs must be carefully reunited with their associated coefficients at the client in
time for them to express the desired effect before the client inverts the transform.
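A minimal sketch of the sign-separation step (the variable names are ours):

import numpy as np

coeffs = np.array([5, -3, 0, -12, 7], dtype=np.int32)

signs = (coeffs < 0).astype(np.uint8)  # one bit each, coded separately
mags = np.abs(coeffs)                  # nonnegative values for the entropy coder

# Client side: reunite signs and magnitudes before inverting the transform.
restored = np.where(signs == 1, -mags, mags)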

The server then extracts bit planes, or blocks of pixels, and performs entropy
coding on them to reduce their size. Entropy coding may be done in advance as
well, although this complicates the serving of regions of interest. If the progres-
sive transmission is interactive, the server may need to reverse the entropy coding
and rebuild the image in order to extract the requested region of interest.
Bit clipping is an important part of the interactive mode of progressive trans-
mission. The client may request overlapping regions of interest and the degree of
overlap may be quite high. For example, the client may be cropping an image and
the crop may be slightly wrong; recropping with just a small change in size or
position can result in a great deal of overlap. Because the goal of serving regions
of interest is to represent the requested region exactly (losslessly), there can be a
large transmission penalty in mismanaging such regions. A well-behaved inter-
active server will keep track of the bit plane sections and pixel blocks already
sent and will clip (zero) out duplicate bits before sending.
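One way to implement this bookkeeping (a sketch under our own naming; a real server would track this per bit plane and per coefficient block, and planes are assumed to be sent from most to least significant):

import numpy as np

class BitClipper:
    # Remembers, per coefficient, the lowest bit plane already sent,
    # so overlapping region-of-interest requests never resend a bit.
    def __init__(self, shape, top_plane=15):
        self.lowest_sent = np.full(shape, top_plane + 1, dtype=np.int16)

    def take_plane(self, coeffs, plane, mask):
        # coeffs: nonnegative integer coefficients (signs handled separately).
        # mask: boolean array selecting the requested region of interest.
        fresh = mask & (plane < self.lowest_sent)         # not yet sent here
        bits = np.where(fresh, (coeffs >> plane) & 1, 0)  # clip duplicates to 0
        self.lowest_sent[fresh] = plane
        return bits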
In an interactive session, it is important to quantize the transmitted data
messages into reasonably timed units. That is, when sending a bit plane that
may take, for example, a minute to complete, the client may be blocked from
requesting and receiving regions of interest before that time. The transactions
must be quantized such that the server can respond to interactive requests and
interpolate the transmission of the newly requested areas before falling back to
the originally scheduled bits.
9.4.4 Examples
A progressive image-transmission system that meets each of the ideal capabilities
listed in Section 9.3 has been implemented at the University of Wisconsin. Here we
show some examples using this system.
Figure 9.5 shows a true-color (24-bit) image progressively transmitted to a
byte count that is only 0.1 percent of the original. This represents a compression
ratio of nearly 1,000. Note the effect of the dynamic spatial resolution: the pixels
are small and dense around the edge and face of the child but are large and

sparse in the background, where the file cabinets present a visually uninteresting
background. Although a detailed inspection indicates that there is much to be
improved, holding the picture at arm’s length nevertheless shows that this point
in the progression serves nicely as a thumbnail image.
Figure 9.6 shows the same image as in Figure 9.5, but captured at a factor of
300 in lossiness. Considerable improvement is seen in fingers, forehead, and hair.
In Figure 9.7 we jump all the way to a factor of 80 in lossiness. Remember
that this is still only about 1 percent of the total image volume. We use examples
at such aggressive levels of lossiness because of the high performance of the
algorithm; at the 10 percent point, the image is visually indistinguishable from
the original and no further improvement is apparent.
Finally, in Figure 9.8 we show the effects of selecting regions of interest.
In this case, the approximation of Figure 9.5 was used to outline a box using
the mouse in our graphical user interface. Upon selection, the region of interest
was immediately extracted from the wavelet transform, previously sent bits were
clipped (zeroed out) to avoid retransmission, and the region was progressively
sent to the client.
Figure 9.5. Result of sending 0.1 percent of the coefficients. A color version of this figure
can be downloaded from />tech med/image databases.
Figure 9.6. Result of sending 0.3 percent of the coefficients. A color version of this figure
can be downloaded from />tech med/image databases.
Figure 9.7. Result of sending 1 percent of the coefficients. A color version of this figure
can be downloaded from />tech med/image databases.
Figure 9.8. Selecting a region of interest in an image: the region of interest is extracted
from the wavelet transform, previously sent bits are clipped to avoid retransmission,
and the region is progressively sent to the client. A color version of this figure can be
downloaded from />tech med/image databases.
9.5 SUMMARY

Progressive transmission of large digital images offers the powerful new capa-
bility of interactively browsing large image archives over relatively slow network
connections. Browsers can enjoy an effective bandwidth that appears as much as
one hundred times greater than the actual bandwidth delivered by the network
infrastructure.
Interactivity allows users to examine an image very quickly and then either
reject it as a candidate for complete transmission or outline regions of interest
that quickly follow at a higher priority.
Progressive transmission is a form of image compression in which lossiness
is played out as a function of elapsed time rather than of file size, as in normal
compression. No degree of lossiness need be chosen in advance, and different
clients can customize the quality of the received image simply by letting more
time pass during its receipt.
Even when fully transmitting a lossless image, progressive transmission is still
more efficient than transmitting the raw image, because the transform and entropy
coding at its core are the same as the ones used in normal file compression.
REFERENCES
1. W. Pennebaker and J.L. Mitchell, JPEG Still Image Data Compression Standard, Van
Nostrand Reinhold, New York, 1993.
2. R. White, High performance compression of astronomical images, Technical Report,
Space Telescope Science Institute, 1992.
3. T. Berger, Rate Distortion Theory: A Mathematical Basis for Data Compression,
Prentice Hall, Englewood Cliffs, N.J., 1971.
4. W. Equitz and T. Cover, Successive refinements of information, IEEE Trans. Infor-
mation Theory 37, 269–275 (1991).
5. J. Chow and T. Berger, Failure of successive refinement for symmetric Gaussian
mixtures, IEEE Trans. Information Theory 43, 350–352 (1997).
6. B. Rimoldi, Successive refinement of information: characterization of the achievable
rates, IEEE Trans. Information Theory 40, 253–259 (1994).

7. J. Chow, Interactive selective decompression of medical images, Proc. IEEE Nucl.
Sci. Symp. 3, 1855–1858 (1996).
8. L. Lastras and T. Berger, All sources are nearly successively refinable, IEEE Trans.
Inf. Theory (2001), in press.
9. M. Effros, Distortion-rate bounds for fixed- and variable-rate multiresolution source
codes, IEEE Trans. Inf. Theory 45, 1887–1910 (1999).
10. K.-H. Tsou, Progressive image transmission: a review and comparison of techniques,
Opt. Eng. 26(7), 581–589 (1987).
11. Y.-K. Chee, A survey of progressive image transmission methods, Int. J. Imaging
Syst. Technol. (2000).
12. B.-H. Juang and A. Gray, Multiple stage vector quantization for speech coding, Proc.
IEEE ICASSP 82 1, 597–600 (1982).
13. L. Wang and M. Goldberg, Progressive image transmission by transform coefficient
residual error quantization, IEEE Trans. Commun. 36, 75–87 (1988).
14. R. Blanford, Progressive refinement using local variance estimators, IEEE Trans.
Commun. 41(5), 749–759 (1993).
15. G. Sullivan and R. Baker, Efficient quadtree coding of images and video, IEEE Trans.
Image Process. 3, 327–331 (1994).
16. K. Knowlton, Progressive transmission of grey-scale and binary pictures by simple,
efficient, and lossless encoding schemes, Proc. IEEE 68(7), 885–896 (1980).
17. S. Tanimoto, Image transmission with gross information first, Technical Report 77-
10-06, Department of Computer Science, University of Washington, Seattle, 1977.
18. M. Goldberg and L. Wang, Comparative performance of pyramid data structures for
progressive image transmission, IEEE Trans. Commun. 39(4), 540–547 (1991).
19. M. Viergever and P. Roos, Hierarchical interpolation, IEEE Eng. Med. Bio. 12, 48–55
(1993).
20. P. Burt and E.H. Adelson, The Laplacian pyramid as a compact image code, IEEE
Trans. Commun. 31(4), 532–540 (1983).
21. S. Park and S. Lee, The coding gains of pyramid structures in progressive image
transmission, Proc. SPIE Vis. Commun. Image Process. '90 1360, 1702–1710 (1990).

22. P. Westerink, J. Biemond, and D. Boekee, Progressive transmission of images using
subband coding, Proc. IEEE ICASSP 89, 1811–1814 (1989).
23. E. Riskin, R. Ladner, R.-Y. Wang, and L. Atlas, Index assignment for progressive
transmission of full-search vector quantization, IEEE Trans. Image Process. 3(3),
307–311 (1994).
24. S. Efstratiadis, B. Rouchouze, and M. Kunt, Image compression using subband/
wavelet transform and adaptive multiple-distribution entropy coding, Proc. SPIE Vis.
Commun. Image Process. ’92 1818, 753–764 (1992).
25. N. Subramanian, A. Kalhan, and V. Udpikar, Sketch and texture coding approach to
monochrome image coding, Int. Conf. Image Process. Appl., 29–32 (1992).
26. S. Caron and J.-F. Rivest, Progressive image transmission by segmentation-based
coding, J. Vis. Commun. Image Represent. 7(3), 296–303 (1996).
