
Image Databases: Search and Retrieval of Digital Imagery
Edited by Vittorio Castelli, Lawrence D. Bergman
Copyright © 2002 John Wiley & Sons, Inc.
ISBNs: 0-471-32116-8 (Hardback); 0-471-22463-4 (Electronic)
9 Transmission of Digital Imagery
JEFFREY W. PERCIVAL
University of Wisconsin, Madison, Wisconsin
VITTORIO CASTELLI
IBM T.J. Watson Research Center, Yorktown Heights, New York
9.1 INTRODUCTION
The transmission of digital imagery can impose significant demands on the
resources of clients, servers, and the networks that connect them. With the explo-
sive growth of the Internet, the need for mechanisms that support rapid browsing
of on-line imagery has become critical.
This is particularly evident in scientific imagery. The increase in availability of
publicly accessible scientific data archives on the Internet has changed the typical
scientist’s expectations about access to data. In the past, scientific procedures
produced proprietary (PI-owned) data sets that, if exchanged at all, were usually
exchanged through magnetic tape. Now, the approach is to generate large on-
line data archives using standardized data formats and allow direct access by
researchers. For example, NASA’s Planetary Data System is a network of archive
sites serving data from a number of solar system missions. The Earth-Observing
System will use eight Distributed Active Archive Centers to provide access to
the very large volume of data to be acquired.
Although transmission bandwidth has been increasing with faster modems
for personal networking, upgrades to the Internet, and the introduction of new
networks such as Internet 2 and the Next Generation Internet, there are still
several reasons why bandwidth for image transmission will continue to be a
limited resource for the foreseeable future.
Perhaps the most striking of these factors is the rapid increase in the resolution of


digital imagery and hence the size of images to be transmitted. A case in point is the
rapid increase in both pixel count and intensity resolution provided by the solid-
state detectors in modern astronomical systems. For example, upgrading from a
640 × 480 8-bit detector to a 2,048 × 2,048 16-bit detector represents a data growth
of about a factor of 30. Now, even 2,048 × 2,048 detectors seem small; new astronomical
instruments use 2,048 × 4,096 detectors in mosaics to build units as large as
8,192 × 8,192 pixels. This is a factor of 400 larger than the previously mentioned
8-bit image. The Sloan Digital Sky Survey will use thirty 2,048 × 2,048 detectors
simultaneously and will produce a 40-terabyte data set during its lifetime.
This phenomenon is not limited to astronomy. For example, medical imaging
detectors are reaching the spatial and intensity resolution of photographic film
and are replacing it. Similarly, Earth-observing satellites with high-resolution
sensors managed by private companies produce images having 12,000 × 12,000 pixels,
which are commercialized for civilian use.
In addition to image volume (both image size and number of images), other
factors are competing for transmission bandwidth, including the continued growth
in demand for access and the competition for bandwidth from other applications
such as telephony and videoconferencing. Therefore, it does not seem unrea-
sonable to expect that the transmission of digital imagery will continue to be a
challenge requiring careful thought and a deft allocation of resources.

The transmission of digital imagery is usually handled through the exchange of
raw, uncompressed files, losslessly compressed files, or files compressed with some
degree of lossiness chosen in advance at the server. The files are first transmitted and
then some visualization program is invoked on the received file. Another type of
transmission, growing in popularity with archives of large digital images, is called
progressive transmission. When an image is progressively transmitted from server
to client, it can be displayed by the client as the data arrive, instead of having to wait
until the transmission is complete. This allows browsing in an archive even over
connections for which the transmission time of a single image may be prohibitive.
This chapter discusses each of these transmission schemes and its effects on the
allocation of resources among server, network, and client.
9.2 BULK TRANSMISSION OF RAW DATA
This is the simplest case to consider. Error-free transmission of raw digital images
is easily done using the file transfer protocol (FTP) on any Internet-style (TCP/IP)
connection. Image compression can be used to reduce the transmission time by
decreasing the total number of bytes to be transmitted.
When used to decrease the volume of transmitted data, the compression usually
needs to be lossless, as further data analysis is often performed on the received
data sets. A lossless compression is exactly reversible, in that the exact value
of each pixel can be recovered by reversing the compression. Many compres-
sion algorithms are lossy, that is, the original pixel values cannot be exactly
recovered from the compressed data. The Joint Photographic Experts Group
(JPEG) [1] and Graphics Interchange Format (GIF) (see Chapter 8) formats are
examples of such algorithms. Lossy compression is universally used for photographic images trans-
mitted across the World Wide Web. Interestingly, it is becoming increasingly
important in transmitting scientific data, especially in applications in which the
images are manually interpreted, rather than processed, and where the bandwidth
between server and client is limited.
Compression exploits redundancy in an image, which can be large for certain

kinds of graphics such as line drawings, vector graphics, and computer-generated
images. Lossless compression of raw digital imagery is far less efficient because
digital images from solid-state detectors contain electronic noise, temperature-
dependent dark counts, fixed-pattern noise, and other artifacts. These effects
reduce the redundancy, for example, by disrupting long runs of pixels that would
otherwise have the same value in the absence of noise. A rule of thumb is that
lossless compression can reduce the size of a digital image by a factor of 2 or 3.
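The effect of noise on compressibility is easy to demonstrate. The following sketch (our illustration, not from the text; it uses a synthetic ramp image, so the clean ratio is far better than the factor of 2 or 3 typical of real imagery) compares lossless compression of the same image with and without simulated sensor noise:

import zlib
import numpy as np

rng = np.random.default_rng(0)

# A smooth synthetic 16-bit "image": a 256 x 256 intensity ramp.
x = np.linspace(0, 1000, 256)
clean = np.add.outer(x, x).astype(np.uint16)

# The same image with simulated electronic noise added.
noisy = np.clip(clean + rng.normal(0, 5, clean.shape), 0, 65535).astype(np.uint16)

for name, img in (("clean", clean), ("noisy", noisy)):
    raw = img.tobytes()
    ratio = len(raw) / len(zlib.compress(raw, 9))
    print(f"{name}: lossless compression ratio {ratio:.1f}")

The noisy version compresses far worse, because the noise disrupts the long runs and smooth gradients that the entropy coder exploits.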
The cost of transmitting compressed images has three components: the cost of
compression, the cost of decompression, and the cost of transmission. The latter
decreases with the effectiveness of the compressed algorithm: the bigger the
achieved compression ratio, the smaller the transmission cost. However, the first
two costs increase with the achieved compression ratio: when comparing two
image-compression algorithms, one usually finds that the one that compresses
better complex is more
1
. Additionally, the computational costs of compressing
and decompressing are quite often similar (although asymmetric schemes exist
where compression is much more expensive than decompression). In an image
database, compression is usually performed once (when the image is ingested into
the database) and therefore its cost is divided over all the transmissions of the
image. Hence, the actual trade-off is between bandwidth and decompression cost,
which depends on the client characteristics. Therefore, the compression algorithm
should be selected with client capabilities in mind.
A number of high-performance, lossless image-compression algorithms exist.
Most use wavelet transforms of one kind or another, although simply using a
wavelet basis is no guarantee that an algorithm is exactly reversible. A well-tested,
fast, exactly reversible, wavelet-based compression program is HCOMPRESS[2].
Source code is available at www.stsci.edu.
Finally, it is easily forgotten in this highly networked world that sometimes
more primitive methods of bulk transmission of large data sets still hold some

sway. For example, the effective bandwidth of a shoebox full of Exabyte tapes
sent by overnight express easily exceeds 100 megabits per second for 24 hours.
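The arithmetic behind this claim is worth a moment. Assuming, for illustration, a box of twenty 60-GB tapes in transit for one day (capacities vary by tape generation, so these numbers are only indicative):

# Effective bandwidth of a shoebox of tapes shipped overnight.
tapes, gb_per_tape = 20, 60          # assumed contents of the shoebox
seconds = 24 * 3600                  # one day in transit
bits = tapes * gb_per_tape * 8e9     # total payload in bits
print(f"{bits / seconds / 1e6:.0f} Mbit/s")  # about 111 Mbit/s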
9.3 PROGRESSIVE TRANSMISSION
Progressive transmission is a scheme in which the image is transmitted from
server to client in such a way that the client can display the image as it arrives,
instead of waiting until all the data have been received.
Progressive transmission fills an important niche between the extremes of
transmitting either raw images in their entirety or irreversibly reduced graphic
products.
¹ This assertion needs to be carefully qualified: to compress well in practice, an algorithm must be
tailored to the characteristics of actual images. For example, a simple algorithm that treats each pixel
as an independent random variable does not perform as well on actual images as a complex method
that accounts for dependencies between neighboring pixels.
Progressive transmission allows users to browse an archive of large digital
images, perhaps searching for some particular features. It also has numerous
applications in scientific fields. Meteorologists often scan large image archives
looking for certain types or percentages of cloud cover. The ability to receive,
examine, and reject an image while receiving only the first 1 percent of its data
is very attractive. Astronomers are experimenting with progressive techniques
for remote observing. They want to check an image for target position, filter
selection, and telescope focus before beginning a long exposure, but the end-to-
end network bandwidth between remote mountain tops and an urban department
of astronomy can be very low. Doing a quality check on a 2,048 × 2,048 pixel, 16-
bit image over a dial-up transmission control protocol (TCP) connection seems
daunting, but it is easily done with progressive image-transmission techniques.
Some progressive transmission situations are forced on a system. The Galileo

probe to Jupiter was originally equipped with a 134,400 bit-per-second transmission
system, which would allow an image to be transmitted in about a minute. The high-
gain antenna failed to deploy, however, resulting in a Jupiter-to-Earth bandwidth
of only about 10 bits per second. Ten days per image was too much! Ground-system
engineers devised a makeshift image browsing system using spatially subsampled
images (called jail bars, typically one or two image rows spaced every 20 rows) to
select images for future transmission. Sending only every twentieth row improves
the transmission time, but the obvious risk of missing smaller-scale structure in
the images is severe. Figure 9.1 shows the discovery image of Dactyl, the small
moon orbiting the asteroid Ida. Had Dactyl been a little smaller, this form of image
browsing might have prevented its discovery.
Ideally, progressive transmission should have the following properties.
• It should present a rough approximation of the original image very quickly. It
should improve the approximation rapidly at first and eventually reconstruct
the original image.
Figure 9.1. Discovery image of Dactyl, a small moon orbiting the asteroid Ida. Had the
moon been a little smaller, it could have been missed in the transmitted data.
• It should capture features at all spatial and intensity scales early in the
transmission, that is, broad, faint features should be captured in the early
stages of transmission as easily as bright, localized features.
• It should support interactive transmission, in which the client can use the
first approximations to select “regions of interest,” which are then scheduled
for transmission at a priority higher than that of the original image. By
“bootstrapping” into a particular region of an image based on an early view,
the client is effectively boosting bandwidth by discarding unneeded bits.
• No bits should be sent twice. As resolution improves from coarse to fine,
even with multiple overlapping regions of interest having been requested,
the server must not squander bandwidth by sending the client information
that it already has.

• It should allow interruption or cancellation by the client, both of which are
likely to occur while browsing images.
• It should be well behaved numerically, approximately preserving the image
statistics (e.g., the pixel-intensity histogram) throughout the transmission.
This allows numerical analysis of a partially transmitted image.
Progressive image transmission is not really about compression. Rather, it is
better viewed as a scheduling problem, in which one wants to know which bits
to send first and which bits can wait until later. Progressive transmission uses the
same algorithms as compression, simply because an algorithm that can tell a
compressor which bits to throw away can also be used to decide which bits
are most important.
9.3.1 Theoretical Considerations
In this section, we show that progressive transmission need not require much
more bandwidth than nonprogressive schemes.
To formulate the problem precisely, we compare a simple nonprogressive
scheme to a simple progressive scheme using Figure 9.2. Let the nonprogressive
scheme be ideal, in the sense that, in order to send an image with distortion (e.g.,
mean-square error, MSE, or Hamming distance) no larger than D_N, it needs to
send R_N bits per pixel, and that the point (R_N, D_N) lies on the rate-distortion
curve [3] defined in Chapter 8, Section 8.3. No other scheme can send fewer bits
and produce an image of the same quality.
The progressive scheme has two stages. In the first, it produces an image
having distortion no larger than D_1 by sending R_1 bits per pixel, and in the
second it improves the quality of the image to D_2 = D_N by sending a further
R_2 bits per pixel. Therefore, both schemes produce an image having the same
quality.
Our wish list for the progressive scheme contains two items: first, we would
like to produce the best possible image during the initial transmission, that is, we
wish the point (R_1, D_1) to lie on the rate-distortion curve, so that R_1 = R(D_1)
(constraint 1); second, we wish to transmit overall the same number of bits R_N
as the nonprogressive scheme, that is, R_1 + R_2 = R_N = R(D_N) = R(D_2)
(constraint 2).
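As a concrete illustration (our example): a memoryless Gaussian source with variance σ² has R(D) = (1/2) log₂(σ²/D) under MSE and is known to be successively refinable, so both constraints can be met at once:

from math import log2

def rate(var, dist):
    # Rate-distortion function of a memoryless Gaussian source under MSE.
    return 0.5 * log2(var / dist)

var = 1.0                  # source variance
d1, d2 = 0.25, 0.0625      # first-stage and final distortions
r1 = rate(var, d1)         # 1.0 bit/pixel buys the coarse image
r_total = rate(var, d2)    # 2.0 bits/pixel buys the final image
r2 = r_total - r1          # the refinement costs exactly the difference
print(r1, r2, r_total)     # 1.0 1.0 2.0 -- no rate penalty for staging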
Figure 9.2. Ideal behavior of a successive refinement system: the points describing the
first and second stages lie on the rate-distortion curve R(D).
In this section, it is shown that it is not always possible to satisfy both
constraints, and that recent results show that constraints 1 and 2 can be relaxed
to R_1 < R(D_1) + 1/2 and R_1 + R_2 ≤ R(D_2) + 1/2, respectively.
The first results are due to Equitz and Cover [4], who showed that the answer
depends on the source (namely, on the statistical characteristics of the image)
and that, although in general the two-stage scheme requires a higher rate than the
one-stage approach, there exist necessary and sufficient conditions under which
R_1 = R(D_1) and R_1 + R_2 = R(D_2) (the sources for which the two equalities
hold for every D_1 and D_2 are called successively refinable²). An interesting
question is whether there are indeed sources that do not satisfy Equitz and
Cover's conditions. Unfortunately, sources that are not successively refinable
do exist: an example of such a source over a discrete alphabet is described in
Ref. [3], whereas Ref. [5] contains an example of a continuous source that is
not successively refinable. This result is somewhat problematic: it seems to
indicate that the rate-distortion curve can be used to measure the performance
of a progressive transmission scheme only for certain types of sources.
² More specifically, a sufficient condition for a source to be successively refinable is that the
original data I, the better approximation I_2, and the coarse approximation I_1 form a Markov
chain in this order, that is, that I_1 is conditionally independent of I given I_2. In simpler terms,
this conditional independence means that, given the finer approximation of the original image,
our uncertainty about the coarser approximation is the same whether or not we are also given
the original image.
The question of which rates are achievable was addressed by Rimoldi [6], who
refined the result of Equitz and Cover: by relaxing the conditions R_1 = R(D_1)
and R_1 + R_2 = R(D_2), he provided conditions under which pairs of rates
R_1 and R_2 can be achieved, given fixed distortions D_1 and D_2. Interestingly,
in Ref. [6], D_1 and D_2 need not be obtained with the same distortion measure.
This is practically relevant: progressive transmission methods in which the early
stages produce high-quality approximations of small portions of the image and
poor-quality renditions of the rest are discussed later. (For example, in
telemedicine, a radiographic image could be transmitted with a method that
quickly provides a high-quality image of the area of interest and a blurry version
of the rest of the image [7], to be improved in later stages. Here, the first
distortion measure could concentrate on the region of interest, whereas
subsequent measures could take into account the image as a whole.)
Although Rimoldi's regions can be used to evaluate the performance of a
progressive transmission scheme, a more recent result [8] provides a simpler
answer. Lastras and Berger showed that, for any source producing independent
and identically distributed samples³, under squared-error distortion, and for any
fixed set of m distortion values D_1 > D_2 > ... > D_m of an m-step progressive
transmission scheme, there exists an m-step code that operates within 1/2 bit
of the rate-distortion curve at all of its steps (Fig. 9.3). This is a powerful result,
which essentially states that the rate-distortion curve can indeed be used to
evaluate a progressive transmission scheme: an algorithm that at some step
achieves a rate that is not within 1/2 bit of the rate-distortion curve is by no
means optimal and can be improved upon.
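Stated compactly (our paraphrase; see Ref. [8] for the precise statement), the guarantee is that the cumulative rate after each step stays within half a bit of the rate-distortion function:

\sum_{j=1}^{k} R_j \;\le\; R(D_k) + \frac{1}{2}, \qquad k = 1, \ldots, m.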
The theory of successive refinements plays the same role for progressive trans-
mission as rate-distortion theory plays for lossy compression.
In particular, it confirms that it is possible, in principle, to construct progressive
transmission schemes that achieve transmission rates comparable to those of
nonprogressive methods.
This theory also provides fundamental limits to what is achievable and guide-
lines to evaluate how well a specific algorithm performs under given assumptions.

Such guidelines are sometimes very general and hence difficult to specialize to
actual algorithms. However, the field is relatively young and very active; some of
the more recent results can be applied to specific categories of progressive trans-
mission schemes, such as the bounds provided in Ref. [9] for multiresolution
coding.
³ Lastras's thesis also provides directions on how to extend the result to stationary ergodic sources.
Figure 9.3. Attainable behavior of a progressive transmission scheme: each stage is
described by a point that lies within 1/2 bit of the rate-distortion curve.
Note, finally, that there is a limitation to the applicability of the theory of
successive refinements to image transmission: a good distortion measure that is
well matched to the human visual system is not known. The implications are
discussed in Section 9.3.3.
9.3.2 Taxonomies of Progressive Transmission Algorithms
Over the years, a large number of progressive transmission schemes have
appeared in the literature. Numerous progressive transmission methods have been
developed starting from a compression scheme, selecting parts of the compressed
data for transmission at each stage, and devising algorithms to reconstruct images
from the information available at the receiver after each transmission stage. The
following taxonomy, proposed by Tsou [10] and widely used in the field, is
well suited to characterizing this class of algorithms because it focuses on the
characteristics of the compression scheme. Tsou's taxonomy divides progressive
transmission approaches into three classes:
Spatial-Domain Techniques. Algorithms belonging to this class compress
images without transforming them. A simple example consists of dividing the
image into bit planes, compressing the bit planes separately, and scheduling the
transmission to send the compressed bit planes in order of significance. Dividing
an 8-bit image I into bit planes consists of creating eight images with pixel
values equal to 0 or 1 — the first image contains the most significant bit of the
pixels of I , the last image contains the least significant bit, and the intermediate
images contain the intermediate bits.
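A minimal sketch of bit-plane decomposition in NumPy (our illustration; a real coder would compress each plane before sending it):

import numpy as np

def bit_planes(img):
    # Split an 8-bit image into binary planes, most significant bit first.
    return [(img >> b) & 1 for b in range(7, -1, -1)]

def reassemble(planes):
    # Reconstruct from however many planes have arrived so far (MSB first).
    img = np.zeros(planes[0].shape, dtype=np.int32)
    for b, plane in zip(range(7, 7 - len(planes), -1), planes):
        img += plane.astype(np.int32) << b
    return img.astype(np.uint8)

Receiving only the first three planes already fixes each pixel to within 32 gray levels of its true value.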
A more interesting example is progressive vector quantization (VQ): the image
I is first vector-quantized (Chapter 8, Section 8.6.2) at a low rate, namely, with
a small codebook, which produces an approximation I_1. The difference I − I_1
between the original image and the first coarse approximation is further quantized,
possibly with a different codebook, to produce I_2, and the process is repeated.
The encoded images I_1, I_2, ... are transmitted in order and progressively
reconstructed at the receiver.
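The following sketch shows the two-stage residual idea, with coarse uniform quantizers standing in for the VQ codebooks to keep it short (our simplification, not the method of Ref. [12]):

import numpy as np

def quantize(x, step):
    # A crude stand-in for one VQ stage: uniform quantization with this step.
    return np.round(x / step) * step

img = np.random.default_rng(1).uniform(0, 255, (64, 64))

i1 = quantize(img, 64)              # stage 1: coarse approximation, sent first
i2 = i1 + quantize(img - i1, 8)     # stage 2: quantized residual refines it

print(np.abs(img - i1).mean())      # error after stage 1
print(np.abs(img - i2).mean())      # much smaller error after stage 2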
Transform-Domain Techniques. Algorithms belonging to this category trans-
form the image to a specific space (such as the frequency domain), and compress
the transform. Examples of this category include progressive JPEG; they are
discussed in Section 9.4.1.
Pyramid-Structured Techniques. This category contains methods that rely on
a multiresolution pyramid, which is a sequence of approximations of the orig-
inal image I at progressively coarser resolution and larger scale (i.e., having
smaller size). The coarsest approximation is losslessly compressed and trans-
mitted; subsequent steps consist of transmitting only the information necessary
to reconstruct the next finer approximation from the received data. Schemes
derived from compression algorithms that rely on subband coding or on the
wavelet transform (Chapter 8, Sections 8.5.3 and 8.8.2) belong to this category,
and in this sense, the current category overlaps the transform-domain techniques.
Numerous progressive transmission algorithms developed in recent years are
not well categorized by Tsou’s taxonomy.
Chee [11] recently proposed a different classification system, which
uses Tsou’s taxonomy as a secondary categorization. Chee’s taxonomy
specifically addresses the transmission mechanism, rather than the compression
characteristics, and contains four classes:
Multistage Residual Methods. This class contains algorithms that progressively
reduce the distortion of the reconstructed image. Chee assigns to this class only
methods that operate on the full-resolution image at each stage: multiresolution-
based algorithms are assigned to the next category. This category includes multi-
stage VQ [12] and the transform-coding method proposed in Ref. [13].

Our discussion of successive refinements in Section 9.3.1 is directly relevant
to this class of methods.
Hierarchical Methods. These algorithms analyze the images at different scales
to process them in a hierarchical fashion. Chee divides this class into nonresidual
coders, residual multiscale coders, and filter-bank coders.
• Nonresidual coders perform a multiscale decomposition of the image and
include quadtree-coders [14,15], binary-tree coders [16], spatial pyramid
coders [17,18], and subsampling pyramids [19].
• Residual coders differ from nonresidual coders in that they compute and
encode the residual image at each level of the decomposition (the difference
between the original image and what is received at that stage). The well-
known Laplacian pyramid can be used to construct a hierarchical residual
coder [20]. A theoretical analysis of this category of coders can be found
in Ref. [21].
• Filter-bank coders include wavelet-based coders and subband coders.
Wavelet-based coders send the lowest resolution version of the image
first and successively transmit the subbands required to produce the
approximation at the immediately higher resolution. A similar approach is
used in subband coders [22]. The theoretical analysis of Ref. [9] is directly
relevant to this group of methods.
Successive Approximation Methods. This class contains methods that progres-
sively refine the precision (e.g., the number of bits) of the reconstructed approx-
imations. Methods that transmit bit planes belong to this category: at each stage
the precision of the reconstructed image increases by 1 bit. Chee assigns to this
category bit-plane methods, tree-structured vector quantizers, full-search quan-
tizers with intermediate codebooks [23], the embedded zerotree wavelet coder
(Chapter 8, Section 8.8.2), and the successive approximation mode of the JPEG
standard (Section 9.4.1.1).
Note that these methods, and in particular, transform-domain methods, are not

guaranteed to monotonically improve the fidelity of the reconstructed image at
each stage, if the fidelity is measured with a single-letter distortion measure (i.e.,
a measure that averages the distortions of individual pixels) [1].
Methods Based on Transmission Sequences. In this category, Chee groups
methods that use a classifier to divide the data into portions, prioritize the order
in which different portions are transmitted, and include a protocol for specifying
transmission order to the receiver. The prioritization process can aim at different
goals, such as reducing the MSE or improving the visual appearance of the
reconstructed image at each step.
In Ref. [11] the author assigns to this class the spectral selection method of
the JPEG standard (Section 9.4.1.1), Efstratiadis's filter-bank coder [24], and
several block-based spatial domain coders [25,26].
9.3.3 Comparing Progressive Transmission Schemes
Although it is possible to compare different progressive transmission
schemes [11], no general guidelines exist. However, the following broad
statements can be made:
• The application field and the characteristics of the data might directly
determine the most appropriate transmission approach. In some cases,
starting from a thumbnail (which progressively increases in size) is
preferable to receiving, for example, the low-contrast full-size image
produced by a bit plane successive approximation algorithm (which becomes
progressively sharper) or the blocky full-size image produced by a multistage
vector quantizer. In other cases, the thumbnail is the least useful of the three
representations. There is no mathematical way of characterizing which of
the three images is more suitable for a specific application domain.
• The theory of successive refinements provides a reference point to assess
and compare bandwidth requirements. It can be very useful for comparing
similar approaches but might not be effective for comparing heterogeneous
methods, such as a multiresolution scheme and a successive approximation

algorithm. Additionally, different compression methods produce different
types of artifacts in intermediate images; the lack of a distortion measure
matched to the human visual system means that an intermediate image with
certain artifacts can be substantially preferable to another with different
artifacts, even though their measured distortion is the same.
• The specific application often imposes constraints on the characteristics of
the associated compression algorithm and consequently reduces the set of
possible candidate transmission schemes. For instance, some applications
might require lossless compression, whereas others can tolerate loss; more-
over, different lossy compression algorithms introduce different artifacts,
whose severity depends on the intended use of the data.
9.4 SOME EXAMPLES
9.4.1 Transform Coding
Because raw digital images are typically difficult to compress, one often resorts
to representing the image in another mathematical space (see Chapter 8). The
Fourier transform (FT, Chapter 16), the discrete cosine transform (DCT, Chap-
ters 8 and 16 ), and the wavelet transform (WT, Chapters 8 and 16) are examples
of transformations that map images into mathematical spaces having properties
that facilitate compression and transmission. They represent images as linear
combinations of appropriately selected functions called basis functions.
The DCT and WT in particular are at the heart of numerous progressive
transmission schemes. The next two sections analyze them in detail.
9.4.1.1 Discrete Cosine Transform Coding. The DCT uses cosines of varying
spatial frequencies as basis functions. Lossy JPEG uses a particular type of DCT
called block-DCT: the image is first divided into nonoverlapping square blocks of
size 8 × 8 pixels, and each tile is independently transformed with DCT. In JPEG,
the reversible DCT step is followed by irreversible quantization of the DCT coef-
ficients. The table used for quantization is not specified by the standard and, in
principle, could be selected to match the requirements of specific applications.

In practice, there is no algorithm to accomplish this task and even the quantiza-
tion tables recommended in the standard for use with photographic images were
constructed by hand, using essentially a trial-and-error approach.
DCT coding supports progressive transmission easily, if not optimally.
Progressive JPEG is now supported in many Web browsers. There are two main
progressive JPEG modes.
The first progressive transmission mode uses a successive approximation
scheme. During the first pass a low-precision approximation of all the DCT
coefficients is transmitted to the client. Using this information the client
can reconstruct a full-sized initial image, but one that will tend to have
numerous artifacts. Subsequent transmissions successively reduce errors in the
approximation of the coefficients and the artifacts in the image. The rationale
for this approach is that the largest coefficients capture a great deal of the image
content even when transmitted at reduced precision.
The second method is a spectral selection approach, which belongs to the
transmission-sequence category. First the server sends the DC coefficient (corre-
sponding to spatial frequency equal to zero) of each image block to the client.
The client can then reconstruct and display a scaled-down version of the original
image, containing one pixel per 8 × 8 block; each pixel holds the average value
of its block. In each subsequent transmission stage the server sends the AC
coefficients required to double the size of the received image, until all the data
are received by the client. This scheme is an effective way of generating
thumbnail images without redundant transmission of bits. The thumbnail can be
used to select, say, the desired image, and at the beginning of the transmission
the client can use the thumbnail to populate the DC locations of an otherwise
empty DCT buffer. The server then begins the full transmission not by repeating
the DC values, which the client already has, but starts right in with the first scan
of low-frequency coefficients. The main problem with using DCTs in a tiled
image for progressive transmission is that the tile size imposes a minimum
resolution on the image as a whole. That is, every 8 × 8 tile contributes to the
first scan regardless of its similarity to its neighbors. Bandwidth is used to
express the DC level of each tile, even when that tile's DC level may be shared
by many of its neighbors: the basis functions of the DCT do not exploit coherence
on scales larger than 8 × 8 pixels.
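The DC-only first pass is easy to emulate, because the DC coefficient of each 8 × 8 block is just (a scaled version of) the block average. A sketch, assuming image dimensions that are multiples of 8:

import numpy as np

def dc_thumbnail(img):
    # One pixel per 8 x 8 block: the block average, i.e., the scaled DC term.
    h, w = img.shape
    blocks = img.reshape(h // 8, 8, w // 8, 8)
    return blocks.mean(axis=(1, 3))

For a 2,048 × 2,048 image this yields a 256 × 256 thumbnail from well under 2 percent of the coefficients.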
9.4.1.2 Integer Wavelet Transform Coding. Using wavelet transforms for
image processing and compression is a subject that is both vast and deep. We will
not delve deeply into the mathematics or algorithms of wavelet transformations
but refer the reader to Chapter 8 for an introduction and for references.
In this section, we restrict ourselves to fast, exactly reversible, integer-based
transforms. It turns out that a number of different wavelet bases meet these
criteria. Therefore, we will discuss in generic terms what such transforms look
like, what properties they lend to progressive transmission, and how they optimize
the use of transmission bandwidth.
In a generic integer wavelet transform, the first pass scans through the image
and forms nearest-neighbor sums and differences of pixels. On a second pass, it
scans through what is essentially a 2 × 2 block average of the original image, one
whose overall dimensions have been reduced by a factor of 2. After forming the
nearest-neighbor sums and differences, the third pass then operates on a 4 × 4
block averaged version of the original. This hierarchical decomposition of the
image lends the wavelet transform its special powers. As the data are transmitted
by bit plane, all wavelet coefficients that capture details too small or faint to be
of interest at a certain stage are equal to zero.
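One level of a simple member of this family, the S transform (an integer Haar variant), illustrates the sums-and-differences structure; this is a minimal sketch for even-length rows, not the exact transform of any particular coder:

import numpy as np

def s_transform_1d(x):
    # Pairwise integer differences and truncated averages; exactly reversible.
    a, b = x[0::2].astype(np.int32), x[1::2].astype(np.int32)
    diff = a - b               # detail: nearest-neighbor difference
    avg = b + (diff >> 1)      # approximation: floor of the pair average
    return avg, diff

def s_inverse_1d(avg, diff):
    # Undo the transform exactly, recovering the original integers.
    b = avg - (diff >> 1)
    a = diff + b
    out = np.empty(avg.size * 2, dtype=np.int32)
    out[0::2], out[1::2] = a, b
    return out

Applying s_transform_1d to rows and then columns, and recursing on the average band, produces the 2 × 2 and 4 × 4 block-averaged passes described above.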

A characteristic appearance of an image that has been wavelet-inverted using
only partially received coefficients is the variation in pixel size: pixels are big
where the image is not changing very much and they are small near edges and
small-scale features where the image is changing rapidly. This is truly dynamic
bandwidth management. Bandwidth is used just where it is needed (expressing
small pixels) and is conserved where not needed (the large, fixed areas in an
image). It is also automatic, in that it falls naturally out of the transform. No
special image processing, such as edge detection or content determination, is
required.
There is an endless variety of integer-based wavelet transforms. They vary
according to small differences in how they calculate their nearest-neighbor sums
and differences. It turns out that there is not a large difference in performance
between them. Figure 9.4 shows the performance of different transform coders,
measured as mean-square error as a function of transmitted data volume. For our
purposes, the horizontal axis is proportional to elapsed transmission time for a
constant bandwidth.
We notice several facts. First, the DCT coder is obviously worse during the
early stages of the transmission. Second, the integer-based wavelet coders
perform pretty much alike. The differences between them are always
much smaller than the difference between their envelope and the performance
of the DCT coder. Note that the DCT coder used for Figure 9.4 is a bit-plane
coder, not the pixel-ordering coders that are officially sanctioned by the JPEG
standard.

Figure 9.4. Mean-squared error as a function of the fraction of the image received, for
a DCT coder, several integer wavelet coders (Haar, S + P, TS, W22S, and W3M), and
no transform.
9.4.2 Color Images
Color images provide an extra point of attack on bandwidth usage: we can exploit
the correlation in content between the separate red, green, and blue color compo-
nents. Any good file compressor, like JPEG, exploits this correlation between
colors as well, but it has particular relevance in progressive transmission.
Color images are typically expressed using three component images, one each
in the colors of red, green, and blue (RGB). Often, there is a strong correlation

in image content between each of the color bands. A classic method of dealing
with this correlation, used for decades in color TV broadcasting, is to convert
the image into a different color space using the transformation
Y = 0.299R + 0.587G + 0.114B
Cb = 128 − 0.168736R − 0.331264G + 0.5B
Cr = 128 + 0.5R − 0.418688G − 0.081312B
and reverse the transformation when displaying the image. (There are many
variants of this transformation; we use this one to illustrate the idea.) The Y
component is called the luminance and can be used as a gray-scale representation
of the original image. This signal is used by black and white TV sets when
receiving a color signal. The other two components represent color differences
and are much less correlated in image content. These “chroma” channels are
often very compressible.
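In code, the forward transform is one matrix product per pixel; a sketch using the coefficients quoted above:

import numpy as np

# Rows give the Y, Cb, and Cr weights for [R, G, B].
M = np.array([[ 0.299,     0.587,     0.114   ],
              [-0.168736, -0.331264,  0.5     ],
              [ 0.5,      -0.418688, -0.081312]])

def rgb_to_ycbcr(rgb):
    # rgb: float array of shape (..., 3); adds the 128 offset to the chroma.
    out = rgb @ M.T
    out[..., 1:] += 128.0
    return out

On typical photographs the Cb and Cr planes vary slowly and compress much better than any of the original R, G, or B planes.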
A good progressive image-transmission system should exploit such correla-
tions between the components of multiband imaging. Such correlation is not
restricted to RGB systems, for example, multiband spectral imaging detectors
in Earth-observing systems often produce highly correlated image components.
There may be multiple groups of correlated components, with the number of
correlated spectral bands in each group being much larger than three. A transfor-
mation similar to that mentioned earlier can be designed for multiband imagery
with any number of components.
9.4.3 Extraction, Coding, and Transmission of Data
We will now step through an actual progressive transmission and discuss the
details of the process. Our intent is to outline the steps and considerations any
good progressive image-transmission system should have, instead of exhaustively
documenting any specific process or step.
The server first ingests the image. The image can be transformed either on-the-
fly or in advance. The choice depends on whether the server has been supplied
with preprocessed, previously transformed images. Preprocessing is used to avoid
the CPU processing required to perform the transformation in real time.
Wavelet transforms produce negative numbers for some pixels, and even if
the absolute value of the coefficient is small, sign extension in 32-bit or 64-
bit integers can generate many unneeded 1-bits that can foil the entropy coders
(Chapter 8). Signs must be handled separately, usually by separating them from
their associated values and dealing with a completely nonnegative transform.
Signs must be carefully reunited with their associated coefficients at the client in
time for them to express the desired effect before the client inverts the transform.
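A minimal sketch of the sign-separation step (the variable names are ours):

import numpy as np

coeffs = np.array([5, -3, 0, -12, 7], dtype=np.int32)

signs = (coeffs < 0).astype(np.uint8)  # one bit each, coded separately
mags = np.abs(coeffs)                  # nonnegative values for the entropy coder

# Client side: reunite signs and magnitudes before inverting the transform.
restored = np.where(signs == 1, -mags, mags)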

The server then extracts bit planes, or blocks of pixels, and performs entropy
coding on them to reduce their size. Entropy coding may be done in advance as
well, although this complicates the serving of regions of interest. If the progres-
sive transmission is interactive, the server may need to reverse the entropy coding
and rebuild the image in order to extract the requested region of interest.
Bit clipping is an important part of the interactive mode of progressive trans-
mission. The client may request overlapping regions of interest and the degree of
overlap may be quite high. For example, the client may be cropping an image and
the crop may be slightly wrong; recropping with just a small change in size or
position can result in a great deal of overlap. Because the goal of serving regions
of interest is to represent the requested region exactly (losslessly), there can be a
large transmission penalty in mismanaging such regions. A well-behaved inter-
active server will keep track of the bit plane sections and pixel blocks already
sent and will clip (zero) out duplicate bits before sending.
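One way to implement this bookkeeping (a sketch under our own naming; a real server would track this per bit plane and per coefficient block, and planes are assumed to be sent from most to least significant):

import numpy as np

class BitClipper:
    # Remembers, per coefficient, the lowest bit plane already sent,
    # so overlapping region-of-interest requests never resend a bit.
    def __init__(self, shape, top_plane=15):
        self.lowest_sent = np.full(shape, top_plane + 1, dtype=np.int16)

    def take_plane(self, coeffs, plane, mask):
        # coeffs: nonnegative integer coefficients (signs handled separately).
        # mask: boolean array selecting the requested region of interest.
        fresh = mask & (plane < self.lowest_sent)         # not yet sent here
        bits = np.where(fresh, (coeffs >> plane) & 1, 0)  # clip duplicates to 0
        self.lowest_sent[fresh] = plane
        return bits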
In an interactive session, it is important to quantize the transmitted data
messages into reasonably timed units. That is, when sending a bit plane that
may take, for example, a minute to complete, the client may be blocked from
requesting and receiving regions of interest before that time. The transactions
must be quantized such that the server can respond to interactive requests and
interpolate the transmission of the newly requested areas before falling back to
the originally scheduled bits.
9.4.4 Examples
A progressive image-transmission system that meets each of the ideal capabilities
listed in Section 9.3 has been implemented at the University of Wisconsin. Here we
show some examples using this system.
Figure 9.5 shows a true-color (24-bit) image progressively transmitted to a
byte count that is only 0.1 percent of the original. This represents a compression
ratio of nearly 1,000. Note the effect of the dynamic spatial resolution: the pixels
are small and dense around the edge and face of the child but are large and

sparse in the background, where the file cabinets present a visually uninteresting
background. Although a detailed inspection indicates that there is much to be
improved, holding the picture at arm’s length nevertheless shows that this point
in the progression serves nicely as a thumbnail image.
Figure 9.6 shows the same image as in Figure 9.5, but captured at a factor of
300 in lossiness. Considerable improvement is seen in fingers, forehead, and hair.
In Figure 9.7 we jump all the way to a factor of 80 in lossiness. Remember
that this is still only about 1 percent of the total image volume. We use examples
at such aggressive levels of lossiness because of the high performance of the
algorithm; at the 10 percent point, the image is visually indistinguishable from
the original and no further improvement is apparent.
Finally, in Figure 9.8 we show the effects of selecting regions of interest.
In this case, the approximation of Figure 9.5 was used to outline a box using
the mouse in our graphical user interface. Upon selection, the region of interest
was immediately extracted from the wavelet transform, previously sent bits were
clipped (zeroed out) to avoid retransmission, and the region was progressively
sent to the client.
Figure 9.5. Result of sending 0.1 percent of the coefficients. A color version of this figure
can be downloaded from />tech med/image databases.
Figure 9.6. Result of sending 0.3 percent of the coefficients. A color version of this figure
can be downloaded from />tech med/image databases.
Figure 9.7. Result of sending 1 percent of the coefficients. A color version of this figure
can be downloaded from />tech med/image databases.
Figure 9.8. Selecting a region of interest in an image: the region of interest is extracted
from the wavelet transform, previously sent bits are clipped to avoid retransmission,
and the region is progressively sent to the client. A color version of this figure can be
downloaded from />tech med/image databases.
9.5 SUMMARY

Progressive transmission of large digital images offers the powerful new capa-
bility of interactively browsing large image archives over relatively slow network
connections. Browsers can enjoy an effective bandwidth that appears as much as
one hundred times greater than the actual bandwidth delivered by the network
infrastructure.
Interactivity allows users to examine an image very quickly and then either
reject it as a candidate for complete transmission or outline regions of interest
that quickly follow at a higher priority.
Progressive transmission is a form of image compression in which lossiness
is played out as a function of elapsed time rather than of file size, as in normal
compression. No degree of lossiness need be chosen in advance, and different
clients can customize the quality of the received image simply by letting more
time pass during its receipt.
Even when fully transmitting a lossless image, progressive transmission is still
more efficient than transmitting the raw image, because the transform and entropy
coding at its core are the same as the ones used in normal file compression.
REFERENCES
1. W. Pennebaker and J.L. Mitchell, JPEG Still Image Data Compression Standard, Van
Nostrand Reinhold, New York, 1993.
2. R. White, High performance compression of astronomical images, Technical Report,
Space Telescope Science Institute, 1992.
3. T. Berger, Rate Distortion Theory: A Mathematical Basis for Data Compression,
Prentice Hall, Englewood Cliffs, N.J., 1971.
4. W. Equitz and T. Cover, Successive refinements of information, IEEE Trans. Infor-
mation Theory 37, 269–275 (1991).
5. J. Chow and T. Berger, Failure of successive refinement for symmetric Gaussian
mixtures, IEEE Trans. Information Theory 43, 350–352 (1997).
6. B. Rimoldi, Successive refinement of information: characterization of the achievable
rates, IEEE Trans. Information Theory 40, 253–259 (1994).

7. J. Chow, Interactive selective decompression of medical images, Proc. IEEE Nucl.
Sci. Symp. 3, 1855–1858 (1996).
8. L. Lastras and T. Berger, All sources are nearly successively refinable, IEEE Trans.
Inf. Theory (2001), in press.
9. M. Effros, Distortion-rate bounds for fixed- and variable-rate multiresolution source
codes, IEEE Trans. Inf. Theory 45, 1887–1910 (1999).
10. K.-H. Tsou, Progressive image transmission: a review and comparison of techniques,
Opt. Eng. 26(7), 581–589 (1987).
11. Y.-K. Chee, A survey of progressive image transmission methods, Int. J. Imaging
Syst. Technol. (2000).
12. B.-H. Juang and A. Gray, Multiple stage vector quantization for speech coding, Proc.
IEEE ICASSP 82 1, 597–600 (1982).
13. L. Wang and M. Goldberg, Progressive image transmission by transform coefficient
residual error quantization, IEEE Trans. Commun. 36, 75–87 (1988).
14. R. Blanford, Progressive refinement using local variance estimators, IEEE Trans.
Commun. 41(5), 749–759 (1993).
15. G. Sullivan and R. Baker, Efficient quadtree coding of images and video, IEEE Trans.
Image Process. 3, 327–331 (1994).
16. K. Knowlton, Progressive transmission of grey-scale and binary pictures by simple,
efficient, and lossless encoding schemes, Proc. IEEE 68(7), 885–896 (1980).
17. S. Tanimoto, Image transmission with gross information first, Technical Report 77-
10-06, Department of Computer Science, University of Washington, Seattle, 1977.
18. M. Goldberg and L. Wang, Comparative performance of pyramid data structures for
progressive image transmission, IEEE Trans. Commun. 39(4), 540–547 (1991).
19. M. Viergever and P. Roos, Hierarchical interpolation, IEEE Eng. Med. Bio. 12, 48–55
(1993).
20. P. Burt and E.H. Adelson, The Laplacian pyramid as a compact image code, IEEE
Trans. Commun. 31(4), 532–540 (1983).
21. S. Park and S. Lee, The coding gains of pyramid structures in progressive image
transmission, Proc. SPIE Vis. Commun. Image Process. '90 1360, 1702–1710 (1990).

22. P. Westerink, J. Biemond, and D. Boekee, Progressive transmission of images using
subband coding, Proc. IEEE ICASSP 89, 1811–1814 (1989).
23. E. Riskin, R. Ladner, R.-Y. Wang, and L. Atlas, Index assignment for progressive
transmission of full-search vector quantization, IEEE Trans. Image Process. 3(3),
307–311 (1994).
24. S. Efstratiadis, B. Rouchouze, and M. Kunt, Image compression using subband/
wavelet transform and adaptive multiple-distribution entropy coding, Proc. SPIE Vis.
Commun. Image Process. ’92 1818, 753–764 (1992).
25. N. Subramanian, A. Kalhan, and V. Udpikar, Sketch and texture coding approach to
monochrome image coding, Int. Conf. Image Process. Appl., 29–32 (1992).
26. S. Caron and J.-F. Rivest, Progressive image transmission by segmentation-based
coding, J. Vis. Commun. Image Represent. 7(3), 296–303 (1996).
