Topological Algorithms for Digital Image Processing

PREFACE
Objects in three dimensions, and their two-dimensional images, are approximated dig-
itally by sets of voxels ("volume elements") or pixels ("picture elements"), respectively.
Digital geometry is the study of geometric properties of digitized objects (or digitized
images of objects); it deals both with the definitions of such properties and with algo-
rithms for their computation. In particular, digital topology deals with properties of a
"topological" nature (particularly, properties that involve the concepts of connectedness
or adjacency, but do not depend on size or shape), and with algorithms that compute
or preserve such properties. Topological properties and algorithms play a fundamental
role in the analysis of two- and three-dimensional digital images. This book deals with
basic topological algorithms; it presents their underlying theory and also discusses their
applications.
An object is always understood to be (arcwise) connected, and the same is therefore
true for images of the object obtained from any viewpoint. Thus if a three- (or two-)
dimensional digital image can be segmented into "object" and "background" voxels (or
pixels), the connected components of the object voxels or pixels are the individual objects
(or their images). Connected component labeling is the process of assigning a distinct label
to the voxels (pixels) that belong to each distinct object. The first chapter, by Shapiro,
defines the problem of connected component labeling and gives sequential and parallel
solutions, including efficient sequential algorithms (due to Lumia et al.) for labeling
connected components in both two- and three-dimensional digital images. An algorithm
for constructing the graph representing the pairwise adjacencies of the components is also
presented. An appendix to this chapter, coauthored by the editors, provides a simple
proof of the correctness of Lumia's algorithms.
The second chapter, coauthored by Hall and the editors, discusses shrinking algorithms,
which reduce the sizes of the components in an image. Shrinking to a topological equivalent
reduces the number of object pixels while preserving the topology of the image (i.e., not
changing the connectivity properties of the objects or the background). In shrinking
to a residue, holes in objects are not preserved, but each object is shrunk to a single
isolated pixel (called a residue) which may then be deleted. The chapter discusses parallel
algorithms for both kinds of shrinking, but focuses on shrinking to a residue.
Many important classes of two-dimensional "objects" are composed mostly or entirely
of elongated parts; for example, alphanumeric characters are composed of strokes. The
representation of such an object by a set of pixels can be simplified by a process known
as thinning, which reduces the elongated parts to one-pixel-thick arcs or closed curves,
without changing the connectivity properties of the object or of its background. (Note that
elongatedness is not a topological property, but thinning is a topology-preserving process.)
The result of thinning a two-dimensional object is usually called the skeleton of the object.
The third chapter, by Arcelli and Sanniti di Baja, reviews a variety of skeletonization
methods, with emphasis on the adequacy with which the branches (constituent arcs or
curves) of a skeleton represent the elongated parts of the original object. It is easy to
ensure that a skeletonization process preserves topology if the process is sequential (e.g.,
it deletes pixels from an object one at a time), but more difficult if the process is highly
parallel; the fourth chapter, by Hall, discusses parallel thinning algorithms and methods
of proving that they preserve topology. Neither of these chapters treats thinning of three-
dimensional objects, which is a more complicated subject; note that three-dimensional
objects can have two kinds of elongated parts, "stick-like" (which can be thinned to arcs
or curves) and "plate-like" (which can be thinned to one-voxel-thick "sheets").
The use of thin ("sheet-like") connected sets of voxels to represent surfaces in three-
dimensional Euclidean space, such as planes and spheres, is considered in the fifth chapter,
by Cohen-Or, Kaufman and Kong. They state precise conditions (some of which are due
to Morgenthaler and Rosenfeld) under which a set of voxels might be regarded as an
adequate representation of a mathematically defined surface, such as a plane specified by
an equation.
Assuming that voxels are defined as unit cubes, surfaces can also be represented by
sets of voxel faces (rather than by sets of voxels). This representation is quite natural for
surfaces that arise as boundaries of three-dimensional objects, and is readily generalized
to boundaries of n-dimensional "hyperobjects". The sixth chapter, by Udupa, describes
algorithms for extracting, labeling, and tracking boundaries represented in this way, in
any number of dimensions. The seventh chapter, by Herman, develops a general theory of
boundaries in abstract digital spaces, and shows that basic properties of connectedness and
separatedness of the interiors and exteriors of boundaries can be established in this general
framework. Fundamental soundness properties of algorithms such as those described in
Udupa's chapter can be deduced from special cases of results in Herman's chapter.
Because of the wide variety of topics treated in the seven chapters, we have not at-
tempted to standardize the notation and terminology used by their authors. However,
each chapter is self-contained and can be read independently of the others. Some of
the basic terminology and fundamental concepts of digital topology are reviewed in the
appendix, which also briefly describes important areas of the field and provides a bibliog-
raphy of over 360 references. The notations and terminologies used in this book will serve
to introduce its readers to the even wider variety that exists in the voluminous literature
dealing with topological algorithms.
T. Yung Kong
Queens, New York
Azriel Rosenfeld
College Park, Maryland
Topological Algorithms for Digital Image Processing
T.Y. Kong and A. Rosenfeld (Editors)
© 1996 Elsevier Science B.V. All rights reserved.
Connected Component Labeling and Adjacency Graph
Construction
Linda G. Shapiro
Department of Computer Science and Engineering, University of Washington, Seattle,
Washington 98195
Abstract
In machine vision, an original gray tone image is processed to produce features that
can be used by higher-level processes, such as recognition and inspection procedures.
Thresholding the image results in a binary image whose pixels are labeled as foreground or
background. Segmenting the image results in a symbolic image whose pixels are assigned
labels representing various classifications. In both cases, an important next step in the
analysis of the image is an operation called connected component labeling that groups the
pixels into regions, such that adjacent pixels have the same label, and pixels belonging to
distinct regions have different labels. Properties of the regions and relationships among
them may then be calculated. The most common relationship, spatial adjacency, can be
represented by a region adjacency graph. This chapter describes algorithms for connected
component labeling and region adjacency graph construction. In addition to giving several
sequential algorithms for two-dimensional connected component labeling, it also discusses
several parallel algorithms and an algorithm for three-dimensional connected component
labeling.
1. INTRODUCTION
Decomposition of an image into regions is a widely used technique in machine vision.
In many cases, the original gray tone image can be thresholded to produce a binary im-
age whose pixels are labeled as foreground or background. Objects of interest are the
connected regions composed of the foreground pixels. For example, in character recogni-
tion, the image pixels comprising the characters form the foreground and the remaining
pixels the background. The general procedure is to produce the binary image, determine
the connected regions of its foreground pixels, calculate properties of these regions, and
apply a decision procedure to classify regions or select regions of interest. In character
recognition, the goal is to classify each of the characters.
In other applications, such as aerial image analysis, the image may be segmented into
many regions representing many different types of entities, such as buildings, streets,
grassy areas and forests. Each separate type of entity is represented by a different label
and the points of that entity are all assigned that label by the segmentation process. The
result is called a symbolic image. In this case, the general processing paradigm includes
segmenting the original image, determining the connected regions of pixels having each
label, calculating properties of each region, calculating the spatial relationships among
the regions, and applying a decision procedure to classify regions or sets of regions of
interest.
In both the binary case and the more general case, the analysis begins with an operation
called connected component labeling that groups adjacent image pixels that have the same
label into regions. In the general case, many different relationships among the regions
can be calculated. The most common relationship is spatial adjacency, which can be
represented by a graph structure called a region adjacency graph. This chapter describes
the algorithms required to perform connected component labeling and to construct the
region adjacency graph.
2. CONNECTED COMPONENT LABELING
Let us call the foreground pixels of a binary image the black pixels and the background
pixels the white pixels. Connected component analysis consists of connected component
labeling of the black pixels followed by property measurement of the component regions
and decision making. The connected component labeling operation changes the unit of
analysis from pixel to region or segment. All black pixels that are connected to each
other by a path of black pixels are given the same identifying label. The label is a unique
name or index of the region to which the pixels belong. A region has shape and position
properties, statistical properties of the gray levels of the pixels in the region, and spatial
relationships to other regions.
2.1. STATEMENT OF THE PROBLEM
Once a gray level image has been thresholded to produce a binary image, a connected
component labeling operator can be employed to group the black pixels into maximal
connected regions. These regions are called the (connected) components of the binary
image, and the associated operator is called the connected components operator. Its input
is a binary image and its output is a symbolic image in which the label assigned to each
pixel is an integer uniquely identifying the connected component to which that pixel
belongs. Figure 1 illustrates the connected components operator as applied to the black
pixels of a binary image.

Two black pixels $p$ and $q$ belong to the same connected component $C$ if there is a
sequence of black pixels $(p_0, p_1, \ldots, p_n)$ of $C$ where $p_0 = p$, $p_n = q$, and $p_i$ is a neighbor of
$p_{i-1}$ (see Figure 2) for $i = 1, \ldots, n$. Thus the definition of a connected component depends
on the definition of neighbor. When only the north, south, east, and west neighbors of
a pixel are considered to be in its neighborhood, then the resulting regions are called
4-connected. When the north, south, east, west, northeast, northwest, southeast, and
southwest neighbors of a pixel are considered to be in its neighborhood, the resulting
regions are called 8-connected. Whichever definition is used, the neighbors of a pixel are
said to be adjacent to that pixel. The border of a connected component of black pixels
is the subset of pixels belonging to the component that are adjacent to white pixels (by
whichever definition of adjacency is being used for the black pixels).
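The two neighborhood definitions and the border of a component can be made concrete with a minimal Python sketch. The names `N4`, `N8`, `neighbors`, and `border` are illustrative and do not come from the chapter; the sketch also assumes that pixels outside the image are simply ignored, so a black pixel on the image edge counts as a border pixel only if it has a white neighbor inside the image.

```python
# Offsets are (row, column) steps. All names here are illustrative assumptions.
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]            # north, south, west, east
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]     # plus the four diagonals


def neighbors(r, c, nrows, ncols, offsets):
    """Return the in-bounds neighbors of pixel (r, c) under the given offsets."""
    return [(r + dr, c + dc) for dr, dc in offsets
            if 0 <= r + dr < nrows and 0 <= c + dc < ncols]


def border(image, offsets):
    """Return the black pixels (value 1) adjacent to at least one white pixel (value 0)."""
    nrows, ncols = len(image), len(image[0])
    return {(r, c)
            for r in range(nrows) for c in range(ncols)
            if image[r][c] == 1
            and any(image[nr][nc] == 0
                    for nr, nc in neighbors(r, c, nrows, ncols, offsets))}
```

For a 3 x 3 black square surrounded by white pixels, `border(image, N4)` contains every pixel of the square except its center.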
Figure 1. Application of the connected components operator to a binary image: (a) binary
image; (b) symbolic image produced from (a) by the connected components operator.
Figure 2. (a) Pixels, •, that are 4-neighbors of the center pixel x; (b) pixels, •, that are
8-neighbors of the center pixel x.
Rosenfeld (1970) has shown that if C is a component of black pixels and D is an
adjacent component of white pixels, and if 4-connectedness is used for black pixels and
8-connectedness is used for white pixels, then either C surrounds D (D is a hole in C)
or D surrounds C (C is a hole in D). This is also true when 8-connectedness is used for
black pixels and 4-connectedness for white pixels, but not when 4-connectedness is used
for both black pixels and white pixels and not when 8-connectedness is used for both
black pixels and white pixels. Figure 3 illustrates this phenomenon. The surroundedness
property is desirable because it allows borders to be treated as closed curves. Because
of this, it is common to use one type of connectedness for black pixels and the other for
white pixels.
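The effect of the choice of adjacency can be checked numerically. The following Python sketch counts components by breadth-first search; this is not one of the labeling algorithms discussed in this chapter, only a simple reference method. The function name `count_components` and the diamond-shaped test image are assumptions chosen for illustration.

```python
from collections import deque

N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_components(image, value, offsets):
    """Count connected components of pixels equal to `value` (breadth-first search)."""
    nrows, ncols = len(image), len(image[0])
    seen = set()
    count = 0
    for r in range(nrows):
        for c in range(ncols):
            if image[r][c] != value or (r, c) in seen:
                continue
            count += 1                      # found an unvisited component
            seen.add((r, c))
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < nrows and 0 <= nc < ncols
                            and image[nr][nc] == value
                            and (nr, nc) not in seen):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
    return count


# Illustrative test image: four black pixels arranged around one white pixel.
DIAMOND = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
```

With 4-adjacency for both colors, this image has four black components and two white ones, so the central white pixel is cut off from the outside even though no single black component surrounds it. With 8-adjacency for both colors there is one black component and one white component, so the black ring fails to separate inside from outside. Using 8-adjacency for black and 4-adjacency for white gives one black component and two white components, which restores the surroundedness property.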
    (a)           (b)           (c)           (d)
    0 0 0 0 0     a a a a a     a a a a a     a a a a a
    0 0 1 0 0     a a 1 a a     a a 1 a a     a a 1 a a
    0 1 0 1 0     a 2 b 3 a     a 1 a 1 a     a 1 b 1 a
    0 0 1 0 0     a a 4 a a     a a 1 a a     a a 1 a a
    0 0 0 0 0     a a a a a     a a a a a     a a a a a
Figure 3. Phenomena associated with using 4- and 8-adjacency in connected component
analyses. Numeric labels are used for components of black pixels and letter labels for
white pixels. (a) Binary image; (b) connected component labeling with 4-adjacency used
for both white and black pixels; (c) connected component labeling with 8-adjacency used
for both white and black pixels; and (d) connected component labeling with 8-adjacency
used for black pixels and 4-adjacency used for white pixels.
The connected components operator is widely used in industrial applications where
an image often consists of a small number of objects against a contrasting background;
see Section 4 for examples. The speed of the algorithm that performs the connected
components operation is often critical to the feasibility of the application. In the next
three sections we discuss several algorithms that label connected components using two
sequential passes over the image.
All the algorithms process a row of the image at a time. Modifications to process a
rectangular window subimage at a time are straightforward. All the algorithms assign
new labels to the first pixel of each component and attempt to propagate the label of a
black pixel to its black neighbors to the right or below it (we assume 4-adjacency with a
left-to-right, top-to-bottom scan order). Consider the image shown in Figure 4. In the
first row, two black pixels separated by three white pixels are encountered. The first is
assigned label 1; the second is assigned label 2. In row 2 the first black pixel is assigned
label 1 because it is a 4-neighbor of the already-labeled pixel above it. The second black
pixel of row 2 is also assigned label 1 because it is a 4-neighbor of the already-labeled pixel
on its left. This process continues until the black pixel marked A in row 4 is encountered.
Pixel A has a pixel labeled 2 above it and a pixel labeled 1 on its left, so that it connects
regions 1 and 2. Thus all the pixels labeled 1 and all the pixels labeled 2 really belong
to the same component; in other words, labels 1 and 2 are equivalent. The differences
among the algorithms are of three types, as reflected in the following questions.
1. What label should be assigned to pixel A?
2. How does the algorithm keep track of the equivalence of two (or more) labels?
3. How does the algorithm use the equivalence information to complete the processing?
Figure 4. Propagation process. Label 1 has been propagated from the left to reach
point A. Label 2 has been propagated down to reach point A. The connected components
algorithm must assign a label to A and make labels 1 and 2 equivalent. Part (a) shows
the original binary image, and (b) the partially processed image.
2.2. THE CLASSICAL ALGORITHM
The classical algorithm, so called because it is based on the classical connected compo-
nents algorithm for graphs, was described in Rosenfeld and Pfaltz (1966). This algorithm
makes only two passes through the image but requires a large global table for recording
equivalences. The first pass performs label propagation, as described above. Whenever a
situation arises in which two different labels can propagate to the same pixel, the smaller
label propagates and each such equivalence found is entered in an equivalence table. Each
entry in the equivalence table consists of an ordered pair, the values of its components
being the labels found to be equivalent. After the first pass, the equivalence classes are
found by taking the transitive closure of the set of equivalences recorded in the equiva-
lence table. Each equivalence class is assigned a unique label, usually the minimum (or
oldest) label in the class. Finally, a second pass through the image performs a translation,
assigning to each pixel the label of the equivalence class of its pass-1 label. This process
is illustrated in Figure 5, and an implementation of the algorithm is given below.
Figure 5. Classical connected component labeling algorithm: Part (a) shows the initial
binary image, and (b) the labeling after the first top-down pass of the algorithm. The
equivalence classes found are 1: { 1,12,7,8,9,10,5 } and 2: { 2,3,4,6,11,13 }.
procedure CLASSICAL
  "Initialize global equivalence table."
  EQTABLE := CREATE( );
  "Top-down pass 1"
  for L := 1 to NROWS do
    "Initialize all labels on row L to zero."
    for P := 1 to NCOLS do
      LABEL(L,P) := 0
    end for;
    "Process the row."
    for P := 1 to NCOLS do
      if I(L,P) = 1 then
        begin
          A := NEIGHBORS((L,P));
          if ISEMPTY(A)
            then M := NEWLABEL( )
            else
              begin
                M := MIN(LABELS(A));
                for X in LABELS(A) and X <> M do
                  ADD(X, M, EQTABLE)
                end for
              end;
          LABEL(L,P) := M
        end
    end for
  end for;
  "Find equivalence classes."
  EQCLASSES := RESOLVE(EQTABLE);
  for E in EQCLASSES do
    EQLABEL(E) := MIN(LABELS(E))
  end for;
  "Top-down pass 2"
  for L := 1 to NROWS do
    for P := 1 to NCOLS do
      if I(L,P) = 1
        then LABEL(L,P) := EQLABEL(CLASS(LABEL(L,P)))
    end for
  end for
end CLASSICAL
The notation NEIGHBORS((L,P)) refers to the already-encountered 1-valued neighbors
of pixel (L,P). The algorithm referred to as RESOLVE is simply the algorithm for finding
the connected components of the graph defined by the set of equivalences (EQTABLE)
defined in pass 1. The nodes of the graph are region labels, and the edges are pairs
of labels that have been found to be equivalent. The procedure, which uses a standard
depth-first search algorithm, can be stated as follows:
procedure RESOLVE(EQTABLE);
  list_of_components := nil;
  for each node N in EQTABLE
    if N is unmarked then
      begin
        current_component := DFS(N,EQTABLE);
        add_to_list(list_of_components,current_component)
      end
  end for;
  return list_of_components
end RESOLVE
In this procedure, list_of_components is a list that will contain the final resultant
equivalence classes. The function DFS performs a depth-first search of the graph beginning
at the given node N and returns a list of all the nodes it has visited in the process. It
also marks each node as it is visited. A standard depth-first search algorithm is given in
Horowitz and Sahni (1982) and in most other data structures texts.
In procedure CLASSICAL, determination of the smallest equivalents of the labels in-
volves examining the labels in each equivalence class after all the equivalence classes have
been explicitly constructed. However, the smallest equivalent labels can be found a little
more efficiently by using the fact that if the labels in EQTABLE are processed in ascend-
ing order, then N is the smallest equivalent of each label visited by DFS(N,EQTABLE).
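The ascending-order observation can be illustrated with a short Python sketch of the resolution step. It assumes the equivalence table is available as an iterable of label pairs; the function name `resolve` and the dictionary it returns are illustrative choices, not the chapter's data structures.

```python
def resolve(eq_pairs):
    """Map every label occurring in eq_pairs to the smallest label of its class.

    The labels are visited in ascending order, so the label that starts each
    depth-first search is the smallest member of the class it discovers.
    """
    adjacency = {}
    for a, b in eq_pairs:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    smallest = {}                          # doubles as the "marked" set
    for start in sorted(adjacency):        # ascending label order
        if start in smallest:
            continue                       # already reached from a smaller label
        smallest[start] = start
        stack = [start]
        while stack:                       # iterative depth-first search
            node = stack.pop()
            for nxt in adjacency[node]:
                if nxt not in smallest:
                    smallest[nxt] = start
                    stack.append(nxt)
    return smallest
```

For example, the pairs (3, 2), (2, 1), and (5, 4) yield two classes, and the result maps labels 1, 2, 3 to 1 and labels 4, 5 to 4. Labels that never appear in any pair are not included, since they are already their own smallest equivalents.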
The complexity of the classical algorithm for an n x n image is $O(n^2)$ for the two passes
through the image, plus the complexity of RESOLVE which depends on the number of
equivalent pairs in EQTABLE. The main problem with the classical algorithm is the
global table. For large images with many regions, the table can become very large. On
some machines there is not enough memory to hold the table. On other machines that
use paging, the table gets paged in and out of memory frequently. For example, on a VAX
11/780 system with 8 megabytes of memory, the classical algorithm ran (including I/O)
in 8.4 seconds with 1791 page faults on one 6000-pixel image, but took 5021 seconds with
23,674 page faults on one 920,000-pixel image. This motivates algorithms that avoid the
use of the large global equivalence table for computers employing virtual memory.
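The classical two-pass procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a tuned implementation: the image is a list of rows of 0/1 values, 4-adjacency is used for black pixels, the global equivalence table is a Python set of label pairs, and the name `classical_label` is hypothetical.

```python
def classical_label(image):
    """Two-pass labeling with a global equivalence table (4-adjacency)."""
    nrows, ncols = len(image), len(image[0])
    labels = [[0] * ncols for _ in range(nrows)]
    eq_table = set()                       # global table of (larger, smaller) pairs
    next_label = 1

    # Pass 1: propagate labels from the pixel above and the pixel to the left.
    for r in range(nrows):
        for c in range(ncols):
            if image[r][c] != 1:
                continue
            prior = set()
            if r > 0 and labels[r - 1][c]:
                prior.add(labels[r - 1][c])
            if c > 0 and labels[r][c - 1]:
                prior.add(labels[r][c - 1])
            if not prior:
                m = next_label             # first pixel of a new region
                next_label += 1
            else:
                m = min(prior)             # the smaller label propagates
                for x in prior - {m}:
                    eq_table.add((x, m))   # record the equivalence
            labels[r][c] = m

    # Resolve: depth-first search over the label graph, in ascending order.
    adjacency = {lab: set() for lab in range(1, next_label)}
    for x, m in eq_table:
        adjacency[x].add(m)
        adjacency[m].add(x)
    eq_label = {}
    for start in range(1, next_label):
        if start in eq_label:
            continue
        eq_label[start] = start
        stack = [start]
        while stack:
            for nxt in adjacency[stack.pop()]:
                if nxt not in eq_label:
                    eq_label[nxt] = start
                    stack.append(nxt)

    # Pass 2: translate every pass-1 label to its class label.
    for r in range(nrows):
        for c in range(ncols):
            if labels[r][c]:
                labels[r][c] = eq_label[labels[r][c]]
    return labels
```

Because each class keeps its minimum pass-1 label, the final labels are distinct but not necessarily consecutive; a renumbering step could be added if consecutive labels are needed.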
2.3. THE LOCAL TABLE METHOD: A SPACE EFFICIENT TWO-PASS ALGORITHM
One solution to the space problem is the use of a small local equivalence table that
stores only the equivalences detected from the current row of the image and the row that
precedes it. Thus the maximum number of equivalences is the number of pixels per row.
These equivalences are then resolved, and the pixels are relabeled in a second scan of
the row; the new labels are then propagated to the next row. In this case not all the
equivalencing is completed by the end of the first (top-down) pass, and a second pass is
required for both finding the remainder of the equivalences and assigning the final labels.
The algorithm is illustrated in Figure 6. Note that the second pass is bottom-up. We
will give the general algorithm (Lumia, Shapiro, and Zuniga, 1983) and then describe, in
more detail, an efficient run-length implementation.
Figure 6. Results after the top-down pass of the local table method on the binary image
of Figure 5. Note that on the rows where equivalences were detected, the pixels have
different labels from those they had after pass 1 of the classical algorithm. For example,
on row 5 the four leading 3s were changed to 2s on the second scan of that row, after the
equivalence of labels 2 and 3 was detected. The bottom-up pass will now propagate the
label 1 to all pixels of the single connected component.
procedure LOCAL_TABLE_METHOD
  "Top-down pass"
  for L := 1 to NROWS do
    "Initialize local equivalence table for row L."
    EQTABLE := CREATE( );
    "Initialize all labels on row L to zero."
    for P := 1 to NCOLS do
      LABEL(L,P) := 0
    end for;
    "Process the row."
    for P := 1 to NCOLS do
      if I(L,P) = 1 then
        begin
          A := NEIGHBORS((L,P));
          if ISEMPTY(A)
            then M := NEWLABEL( )
            else
              begin
                M := MIN(LABELS(A));
                for X in LABELS(A) and X <> M do
                  ADD(X, M, EQTABLE)
                end for
              end;
          LABEL(L,P) := M
        end
    end for;
    "Find equivalence classes detected on this row."
    EQCLASSES := RESOLVE(EQTABLE);
    for E in EQCLASSES do
      EQLABEL(E) := MIN(LABELS(E))
    end for;
    "Relabel the parts of row L with their equivalence class labels."
    for P := 1 to NCOLS do
      if I(L,P) = 1
        then LABEL(L,P) := EQLABEL(CLASS(LABEL(L,P)))
    end for
  end for;
  "Bottom-up pass"
  for L := NROWS-1 to 1 by -1 do
    "Initialize local equivalence table for row L."
    EQTABLE := CREATE( );
    "Process the row."
    for P := 1 to NCOLS do
      if LABEL(L,P) <> 0 then
        begin
          LA := LABELS(NEIGHBORS((L,P)));
          for X in LA and X <> LABEL(L,P) do
            ADD(X, LABEL(L,P), EQTABLE)
          end for
        end
    end for;
    "Find equivalence classes."
    EQCLASSES := RESOLVE(EQTABLE);
    for E in EQCLASSES do
      EQLABEL(E) := MIN(LABELS(E))
    end for;
    "Relabel the pixels of row L one last time."
    for P := 1 to NCOLS do
      if LABEL(L,P) <> 0
        then LABEL(L,P) := EQLABEL(CLASS(LABEL(L,P)))
    end for
  end for
end LOCAL_TABLE_METHOD
Note that the set NEIGHBORS((L,P)) of already-processed 1-valued neighbors of pixel
(L,P) is different in the top-down and bottom-up passes.
The complexity of the local table method for an n x n image is also $O(n^2)$ plus n times
the complexity of RESOLVE for one row of equivalences. However, in comparison with
the classical algorithm, the local table method took 8.8 seconds with 1763 page faults on
the 6000-pixel image, but only 627 seconds with 15,391 page faults on the 920,000-pixel
image, which is 8 times faster. For an even larger 5,120,000-pixel image, the local table
method ran 31 times faster than the classical method.
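The two passes of the local table method can be sketched in Python as shown below. Several choices here are assumptions made for illustration: 4-adjacency is used, the per-row table is a plain list of label pairs, the row is resolved with a small parent map instead of the depth-first RESOLVE procedure, and in the bottom-up pass only the pixel directly below is examined, since adjacent black pixels on the same row already share a label after the top-down pass. The function names are hypothetical.

```python
def _min_label_map(pairs):
    """Resolve one row's equivalences: map each label to the smallest in its class."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)   # the smaller label becomes the root
    return {x: find(x) for x in list(parent)}


def local_table_label(image):
    """Top-down then bottom-up labeling with a table local to each row."""
    nrows, ncols = len(image), len(image[0])
    labels = [[0] * ncols for _ in range(nrows)]
    next_label = 1

    def relabel(row, pairs):
        mapping = _min_label_map(pairs)
        for c in range(ncols):
            if labels[row][c]:
                labels[row][c] = mapping.get(labels[row][c], labels[row][c])

    # Top-down pass: neighbors are the pixel above and the pixel to the left.
    for r in range(nrows):
        pairs = []                               # local equivalence table for row r
        for c in range(ncols):
            if image[r][c] != 1:
                continue
            prior = set()
            if r > 0 and labels[r - 1][c]:
                prior.add(labels[r - 1][c])
            if c > 0 and labels[r][c - 1]:
                prior.add(labels[r][c - 1])
            if prior:
                m = min(prior)
                pairs.extend((x, m) for x in prior if x != m)
            else:
                m = next_label
                next_label += 1
            labels[r][c] = m
        relabel(r, pairs)                        # second scan of the same row

    # Bottom-up pass: the already-processed neighbor is the pixel below.
    for r in range(nrows - 2, -1, -1):
        pairs = [(labels[r + 1][c], labels[r][c])
                 for c in range(ncols)
                 if labels[r][c] and labels[r + 1][c]
                 and labels[r + 1][c] != labels[r][c]]
        relabel(r, pairs)
    return labels
```

At no point does the sketch hold more than one row's worth of equivalences, which is the property that keeps the working set small on a paged machine.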
2.4. AN EFFICIENT RUN LENGTH IMPLEMENTATION OF THE LOCAL TABLE METHOD

In many industrial applications the image used is from a television camera and thus
is roughly 512 x 512 pixels, or 256K, in size. On an image half this size, the local table
method as implemented on the VAX 11/780 took 116 seconds to execute, including I/O
time.
But industrial applications often require times of less than one second. To achieve
this kind of efficiency, the algorithm can be implemented on a machine with some special
hardware capabilities. The hardware is used to rapidly extract a run-length encoding of
the image, and the software implementation can then work on the more compact run-
length data. Ronse and Devijver (1984) advocate this approach.
A run-length encoding of a binary image is a list of contiguous (typically, horizontal)
runs of black pixels. For each run, the location of the starting pixel of the run and either
its length or the location of its ending pixel must be recorded. Figure 7 shows the run-
length data structure used in our implementation. Each run in the image is encoded by
the locations of its starting and ending pixels. (ROW, START_COL) is the location of the
starting pixel and (ROW, END_COL) is the location of the ending pixel. PERM_LABEL
is the field in which the label of the connected component to which this run belongs will be
stored. It is initialized to zero and assigned temporary values in pass 1 of the algorithm.
At the end of pass 2, PERM_LABEL contains the final, permanent label of the run. This
structure can then be used to output the labels back to the corresponding pixels of the
output image.
(a)
    1 1 0 1 1
    1 1 0 0 1
    1 1 1 0 1
    0 0 0 0 0
    0 1 1 1 1
(b)
    Row   ROW_START   ROW_END
    1     1           2
    2     3           4
    3     5           6
    4     0           0
    5     7           7
(c)
    Run   ROW   START_COL   END_COL   PERM_LABEL
    1     1     1           2         0
    2     1     4           5         0
    3     2     1           2         0
    4     2     5           5         0
    5     3     1           3         0
    6     3     5           5         0
    7     5     2           5         0
Figure 7. Binary image (a) and its run-length encoding (b) and (c). Each run of black
pixels is encoded by its row (ROW) and the columns of its starting and ending pixels
(START_COL and END_COL). In addition, for each row of the image, ROW_START
points to the first run of the row and ROW_END points to the last run of the row. The
PERM_LABEL field will hold the component label of the run; it is initialized to zero.
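A software version of the run-length encoding step can be sketched in Python as follows. The chapter assumes special hardware for this step, so the sketch only illustrates the resulting data structure. The field names follow Figure 7; the function name `run_length_encode`, the use of dictionaries for runs, and the 5 x 5 test image in the accompanying check (chosen so that its encoding agrees with the values listed for Figure 7) are assumptions.

```python
def run_length_encode(image):
    """Encode horizontal runs of 1-pixels.

    Returns (runs, row_start, row_end). Rows, columns, and run indices are
    1-based, and a row with no runs gets ROW_START = ROW_END = 0.
    """
    runs = []                 # run k is runs[k - 1]
    row_start, row_end = [], []
    for r, row in enumerate(image, start=1):
        first = len(runs) + 1             # index the first run of this row would get
        c = 0
        while c < len(row):
            if row[c] == 1:
                start = c
                while c < len(row) and row[c] == 1:
                    c += 1                # advance to one past the end of the run
                runs.append({"ROW": r, "START_COL": start + 1,
                             "END_COL": c, "PERM_LABEL": 0})
            else:
                c += 1
        if len(runs) >= first:            # at least one run was found on this row
            row_start.append(first)
            row_end.append(len(runs))
        else:
            row_start.append(0)
            row_end.append(0)
    return runs, row_start, row_end
```

Because the runs of a row occupy consecutive indices, ROW_START and ROW_END are enough to walk the runs of any row without searching.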
Consider a run P of black pixels. During pass 1, when P has not yet been fully
processed, PERM_LABEL(P) will be zero. After P has been processed and found to be
adjacent to some other run Q on the previous row, it will be assigned the current label
of Q, PERM_LABEL(Q). If it is found to be adjacent to other runs $Q_1, Q_2, \ldots, Q_K$ also
on the previous row, then the equivalence of PERM_LABEL(Q), PERM_LABEL($Q_1$),
PERM_LABEL($Q_2$), ..., PERM_LABEL($Q_K$) must be recorded. The data structures
used for recording the equivalences are shown in Figure 8. The use of simple linked lists
to store equivalences makes the assumption that there will be very few elements in an
equivalence class that is based on only two rows of the image. For storing, accessing, and
dynamically updating large equivalence classes, the union-find algorithm (Tarjan, 1975)
may be preferable.
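For comparison, a union-find structure of the kind mentioned above can be sketched in Python as follows. This is a generic textbook formulation with union by size and path compression, not the structure used in our implementation; the class name `UnionFind` and its methods are illustrative.

```python
class UnionFind:
    """Disjoint sets of labels with union by size and path compression."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def find(self, x):
        """Return the representative of x's class, creating the class if needed."""
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:          # path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        """Merge the classes of a and b and return the surviving representative."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        if self.size[ra] < self.size[rb]:      # attach the smaller tree to the larger
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return ra
```

Note that the representative returned by `find` is not necessarily the smallest label of the class; if the minimum label is wanted as the class label, it has to be tracked separately.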
    Run   ROW   START_COL   END_COL   PERM_LABEL
    1     1     4           10        1            (previous row)
    2     1     20          24        2            (previous row)
    3     1     30          38        3            (previous row)
    4     2     7           35        1            (current row)
Figure 8. Data structures used for keeping track of equivalence classes. In this example,
run 4 has PERM_LABEL 1; this is an index into the LABEL array that gives the equiva-
lence class label for each possible PERM_LABEL value. In the example, PERM_LABELs
1, 2, and 3 have all been determined to be equivalent, so LABEL(1), LABEL(2), and
LABEL(3) all contain the equivalence class label, which is 1. Furthermore, the equivalence
class label is an index into the EQ_CLASS array that contains pointers to the begin-
nings of the equivalence classes; these are linked lists in the LABEL/NEXT structure.
In this example, there is only one equivalence class, class 1, and three elements of the
LABEL/NEXT array are linked together to form this class.
In our algorithm, PERM_LABEL(P), for a given run P, may be zero or nonzero. If it
is nonzero, then LABEL(PERM_LABEL(P)) may be zero or nonzero. If LABEL(PERM_LABEL(P))
is zero, then PERM_LABEL(P) is the current label of the run and there is no equivalence
class. If it is nonzero, then there is an equivalence class and the value of
LABEL(PERM_LABEL(P)) is the label assigned to that class. All the labels that have been
merged to form this class will have the same class label; that is, if run P and run P' are
in the same class, we should have LABEL(PERM_LABEL(P)) = LABEL(PERM_LABEL(P')).
When such an equivalence is determined, if each run was already a member of a class and
the two classes were different, the two classes are merged. This is accomplished by linking
together each prior label belonging to a single class into a linked list pointed to by
EQ_CLASS(L) for class label L and linked together using the NEXT field of the LABEL/NEXT
structure. To merge two classes, the last cell of one is made to point to the first cell of
the other, and the LABEL field of each cell of the second class is changed to reflect the
new label of the class.
In this implementation the minimum label does not always become the label of the
equivalence class. Instead, a single-member class is always merged into and assigned the
label of a multimember class. This allows the algorithm to avoid traversing the linked list
of the larger class to change all its labels when only one element is being added. When both
classes are single-member or both are multimember the label of the first class is selected
as the equivalence class label. The equivalencing procedure, MAKE_EQUIVALENT, is
given below.
procedure MAKE_EQUIVALENT (I1, I2);
"I1 is the value of PERM_LABEL(R1) and I2 is the value of PERM_LABEL(R2) for
two different runs R1 and R2. They have been detected to be equivalent. The
purpose of this routine is to make them equivalent in the data structures."
case
  LABEL(I1) = 0 and LABEL(I2) = 0:
    "Both classes have only one member. Create a new class with
    I1 as the label."
    begin
      LABEL(I1) := I1;
      LABEL(I2) := I1;
      NEXT(I1) := I2;
      NEXT(I2) := 0;
      EQ_CLASS(I1) := I1
    end;
  LABEL(I1) = LABEL(I2):
    "Both labels already belong to the same class."
    return;
  LABEL(I1) <> 0 and LABEL(I2) = 0:
    "There is more than one member in the class with label I1,
    but only one in the class with label I2. So add the
    smaller class to the larger."
    begin
      BEGINNING := LABEL(I1);
      LABEL(I2) := BEGINNING;
      NEXT(I2) := EQ_CLASS(BEGINNING);
      EQ_CLASS(BEGINNING) := I2
    end;
  LABEL(I1) = 0 and LABEL(I2) <> 0:
    "There is more than one member in the class with label I2,
    but only one in the class with label I1. Add the smaller class
    to the larger."
    begin
      BEGINNING := LABEL(I2);
      LABEL(I1) := BEGINNING;
      NEXT(I1) := EQ_CLASS(BEGINNING);
      EQ_CLASS(BEGINNING) := I1
    end;
  LABEL(I1) <> 0 and LABEL(I2) <> 0:
    "Both classes are multimember. Merge them by linking the first
    onto the end of the second, and assign the class label of I1."
    begin
      BEGINNING := LABEL(I2);
      MEMBER := EQ_CLASS(BEGINNING);
      EQ_LABEL := LABEL(I1);
      while NEXT(MEMBER) <> 0 do
        LABEL(MEMBER) := EQ_LABEL;
        MEMBER := NEXT(MEMBER)
      end while;
      LABEL(MEMBER) := EQ_LABEL;
      NEXT(MEMBER) := EQ_CLASS(EQ_LABEL);
      EQ_CLASS(EQ_LABEL) := EQ_CLASS(BEGINNING);
      EQ_CLASS(BEGINNING) := 0
    end
end case;
end MAKE_EQUIVALENT
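The linked-list bookkeeping of MAKE_EQUIVALENT can be traced more easily in a small executable form. The following Python sketch is one possible transcription, not the authors' code: the class name EquivalenceTable and the helper members are hypothetical names introduced here, and the three arrays stand in for the LABEL and NEXT fields and the EQ_CLASS table, with 0 playing the role of "none" as in the pseudocode.

```python
class EquivalenceTable:
    """LABEL/NEXT cells plus EQ_CLASS list heads; 0 means 'none' (assumed layout)."""

    def __init__(self, max_label):
        self.label = [0] * (max_label + 1)     # class label of each prior label
        self.next = [0] * (max_label + 1)      # next cell in the class list
        self.eq_class = [0] * (max_label + 1)  # first cell of each class list

    def make_equivalent(self, i1, i2):
        label, nxt, eq_class = self.label, self.next, self.eq_class
        if label[i1] == 0 and label[i2] == 0:
            # Two single-member classes: start a new class labeled i1.
            label[i1] = label[i2] = i1
            nxt[i1], nxt[i2] = i2, 0
            eq_class[i1] = i1
        elif label[i1] == label[i2]:
            return  # both labels already belong to the same class
        elif label[i2] == 0:
            # Push the single label i2 onto the front of i1's class list.
            head = label[i1]
            label[i2] = head
            nxt[i2] = eq_class[head]
            eq_class[head] = i2
        elif label[i1] == 0:
            # Push the single label i1 onto the front of i2's class list.
            head = label[i2]
            label[i1] = head
            nxt[i1] = eq_class[head]
            eq_class[head] = i1
        else:
            # Two multimember classes: relabel the second, then splice the lists.
            old, new = label[i2], label[i1]
            member = eq_class[old]
            while nxt[member] != 0:
                label[member] = new
                member = nxt[member]
            label[member] = new
            nxt[member] = eq_class[new]
            eq_class[new] = eq_class[old]
            eq_class[old] = 0

    def members(self, class_label):
        """Walk the linked list of a class and return its labels in list order."""
        out, m = [], self.eq_class[class_label]
        while m != 0:
            out.append(m)
            m = self.next[m]
        return out
```

Only the multimember case walks a list, and it walks the list of the second class; the two single-member cases cost constant time because the new label is pushed onto the front of the existing list.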
Using this procedure and a utility procedure, INITIALIZE_EQUIV,
which reinitializes the equivalence table for the processing of a new line,
the run length implementation is as follows:
procedure RUN_LENGTH_IMPLEMENTATION
"Initialize PERM_LABEL array."
for R := 1 to NRUNS do
  PERM_LABEL(R) := 0
end for;
"Top-down pass"
for L := 1 to NROWS do
  P := ROW_START(L);
  PLAST := ROW_END(L);
  if L = 1
    then begin Q := 0; QLAST := 0 end
    else begin Q := ROW_START(L-1); QLAST := ROW_END(L-1) end;
  if P <> 0 and Q <> 0
  then
    begin
      INITIALIZE_EQUIV( );
      "SCAN 1"
      "Either a given run is connected to a run on the previous row or
      it is not. If it is, assign it the label of the first run to which it
      is connected. For each subsequent run of the previous row to which it
      is connected and whose label is different from its own, equivalence its
      label with that run's label."
      while P <= PLAST and Q <= QLAST do
        "Check whether runs P and Q overlap."
        case
          END_COL(P) < START_COL(Q):
            "Current run ends before start of run on previous row."
            P := P + 1;
          END_COL(Q) < START_COL(P):
            "Current run begins after end of run on previous row."
            Q := Q + 1;
          else :
            "There is some overlap between run P and run Q."
            begin
              PLABEL := PERM_LABEL(P);
              case
                PLABEL = 0:
                  "There is no permanent label yet; assign Q's label."
                  PERM_LABEL(P) := PERM_LABEL(Q);
                PLABEL <> 0 and PERM_LABEL(Q) <> PLABEL:
                  "There is a permanent label that is different from the
                  label of run Q; make them equivalent."
                  MAKE_EQUIVALENT(PLABEL, PERM_LABEL(Q));
              end case;
              "Increment P or Q or both as necessary."
              case
                END_COL(P) > END_COL(Q):
                  Q := Q + 1;
                END_COL(Q) > END_COL(P):
                  P := P + 1;
                END_COL(Q) = END_COL(P):
                  begin Q := Q + 1; P := P + 1 end;
              end case
            end
        end case
      end while;
      P := ROW_START(L)
    end;
  "SCAN 2"
  "Make a second scan through the runs of the current row.
  Assign new labels to isolated runs and the labels of their
  equivalence classes to all the rest."
  if P <> 0 then
    while P <= PLAST do
      begin
        PLABEL := PERM_LABEL(P);
        case
          PLABEL = 0:
            "No permanent label exists yet, so assign one."
            PERM_LABEL(P) := NEW_LABEL( );
          PLABEL <> 0 and LABEL(PLABEL) <> 0:
            "P has permanent label and equivalence class;
            assign the equivalence class label."
            PERM_LABEL(P) := LABEL(PLABEL);
        end case;
        P := P + 1
      end
    end while
end for;
"Bottom-up pass"
for L := NROWS-1 to 1 by -1 do
  P := ROW_START(L);
  PLAST := ROW_END(L);
  Q := ROW_START(L+1);
  QLAST := ROW_END(L+1);
  if P <> 0 and Q <> 0
  then
    begin
      INITIALIZE_EQUIV( );
      "SCAN 1"
      while P <= PLAST and Q <= QLAST do
        case
          END_COL(P) < START_COL(Q):
            P := P + 1;
          END_COL(Q) < START_COL(P):
            Q := Q + 1;
          else :
            "There is some overlap; if the two adjacent runs have different labels,
            then assign Q's label to run P."
            begin
              if PERM_LABEL(P) <> PERM_LABEL(Q) then
                begin
                  LABEL(PERM_LABEL(P)) := PERM_LABEL(Q);
                  PERM_LABEL(P) := PERM_LABEL(Q)
                end;
              "Increment P or Q or both as necessary."
              case
                END_COL(P) > END_COL(Q):
                  Q := Q + 1;
                END_COL(Q) > END_COL(P):
                  P := P + 1;
                END_COL(Q) = END_COL(P):
                  begin Q := Q + 1; P := P + 1 end
              end case
            end
        end case
      end while;
      "SCAN 2"
      P := ROW_START(L);
      while P <= PLAST do
        "Replace P's label by its class label."
        if LABEL(PERM_LABEL(P)) <> 0
          then PERM_LABEL(P) := LABEL(PERM_LABEL(P));
        P := P + 1
      end while
    end
end for
end RUN_LENGTH_IMPLEMENTATION
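The core of SCAN 1 in both passes is a two-pointer walk over the runs of two adjacent rows. The Python sketch below isolates just that walk, leaving out all label handling. It assumes that each run is an inclusive (start_col, end_col) pair and that runs are sorted left to right; the function name overlapping_run_pairs is hypothetical, and overlap here means sharing at least one column, as in the pseudocode.

```python
def overlapping_run_pairs(prev_runs, cur_runs):
    """Return index pairs (p, q) of overlapping runs on two adjacent rows.

    Each run is a (start_col, end_col) tuple with inclusive columns, and
    each list is sorted left to right, as in a run-length encoding.
    """
    pairs = []
    p = q = 0
    while p < len(cur_runs) and q < len(prev_runs):
        p_start, p_end = cur_runs[p]
        q_start, q_end = prev_runs[q]
        if p_end < q_start:
            # Current run ends before the previous-row run starts.
            p += 1
        elif q_end < p_start:
            # Previous-row run ends before the current run starts.
            q += 1
        else:
            # The two runs share at least one column.
            pairs.append((p, q))
            # Advance whichever run ends first; advance both on a tie.
            if p_end > q_end:
                q += 1
            elif q_end > p_end:
                p += 1
            else:
                p += 1
                q += 1
    return pairs
```

Because each step advances at least one of the two indices, the scan of a row pair takes time proportional to the number of runs on the two rows rather than to the number of pixels.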
There is another significant difference between procedure RUN_LENGTH_IMPLEMENTATION
and procedure LOCAL_TABLE_METHOD, in addition to their use of different data structures.
Procedure LOCAL_TABLE_METHOD computes equivalence classes using RESOLVE both in the
top-down pass and in the bottom-up pass. Procedure RUN_LENGTH_IMPLEMENTATION updates
equivalence classes in the top-down pass using MAKE_EQUIVALENT, but in the bottom-up
pass it only propagates and replaces labels. This gives correct results not only in procedure
RUN_LENGTH_IMPLEMENTATION but also in procedure LOCAL_TABLE_METHOD.
(This was proved in Lumia, Shapiro, and Zuniga, 1983.)
2.5. A 3D CONNECTED COMPONENTS ALGORITHM
The images handled by our algorithms so far are two-dimensional. Both the
definition of connected components and the algorithms can be generalized to
three-dimensional images, which are sequences of two-dimensional images called layers. The
generalization of the definition follows directly from the generalization of the
concept of a neighborhood. Suppose that a 3D image consists of NROWS rows by NCOLS
columns by NLAYERS layers. Then the neighborhood of a voxel (in 3D we use this term
instead of "pixel") consists of voxels from neighboring rows, neighboring columns, and
neighboring layers. Kong and Rosenfeld (1989) define three standard kinds of 3D
neighborhoods: the 6-neighborhood, the 18-neighborhood, and the 26-neighborhood. These
are illustrated in Figure 9. Using these 3D neighborhoods, the definition of a 3D
connected component is identical to the definition of a 2D connected component. That is,
two black voxels p and q belong to the same connected component C if there is a sequence
of black voxels $(p_0, p_1, \ldots, p_n)$ of C where $p_0 = p$, $p_n = q$, and $p_i$ is a
neighbor of $p_{i-1}$ for $i = 1, \ldots, n$.
Figure 9. (a) Voxels, •, that are 6-neighbors of voxel p; (b) voxels, •, that are 18-neighbors
of voxel p; (c) voxels, •, that are 26-neighbors of voxel p.
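The three neighborhoods can be generated from one rule: a neighbor differs from p by at most 1 in each coordinate, and the neighborhood type limits how many coordinates may differ (one for the 6-neighborhood, up to two for the 18-neighborhood, up to three for the 26-neighborhood). The Python sketch below illustrates this under the assumption that a voxel is addressed as a (layer, row, column) triple; the function names are hypothetical.

```python
from itertools import product


def neighborhood_offsets(kind):
    """Return the (dlayer, drow, dcol) offsets of the 6-, 18-, or 26-neighborhood."""
    if kind not in (6, 18, 26):
        raise ValueError("kind must be 6, 18, or 26")
    # How many coordinates may change: 1 (faces), 2 (plus edges), 3 (plus corners).
    max_changed = {6: 1, 18: 2, 26: 3}[kind]
    offsets = []
    for delta in product((-1, 0, 1), repeat=3):
        changed = sum(1 for d in delta if d != 0)
        if 1 <= changed <= max_changed:
            offsets.append(delta)
    return offsets


def neighbors(voxel, shape, kind=6):
    """List the neighbors of a voxel that lie inside an image of the given shape."""
    result = []
    for delta in neighborhood_offsets(kind):
        candidate = tuple(v + d for v, d in zip(voxel, delta))
        if all(0 <= c < s for c, s in zip(candidate, shape)):
            result.append(candidate)
    return result
```

A corner voxel of a 2 x 2 x 2 image, for instance, has three 6-neighbors and seven 26-neighbors, since the remaining offsets fall outside the image.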
The local table method of computing connected components was generalized to 3D by
Lumia (1983). The 3D algorithm can be summarized as follows:
1. Label the 2D connected components in each layer in such a way that different labels
are used in different layers.
2. Propagate the label equivalences from the first to the last layer, using the same basic
process as in the two-dimensional local table algorithm, except that the propagation
is now between the layers rather than between the rows. Repeat this process going
from the last layer to the first layer.
Figure 10 illustrates the 3D connected components algorithm on a simple 3-layer image.
If the local table method is used in each layer, its complexity for an n × n × n image is
$O(n^3)$ + ($n^2$ times the complexity of RESOLVE for each row) + ($n$ times the complexity
of RESOLVE for each layer).
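The two-step structure (label each layer with its own labels, then reconcile labels between adjacent layers) can be sketched in Python as follows. This is an illustration under stated assumptions, not the Lumia algorithm itself: it uses flood fill within each layer and a union-find table in place of the top-down and bottom-up local table passes, it assumes 6-connectivity, and the names label_layer and label_3d are hypothetical.

```python
from collections import deque


def label_layer(layer, next_label):
    """Label the 4-connected components of one 2D binary layer by flood fill."""
    rows, cols = len(layer), len(layer[0])
    labels = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if layer[r][c] and not labels[r][c]:
                labels[r][c] = next_label
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and layer[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
                next_label += 1
    return labels, next_label


def label_3d(volume):
    """Label 6-connected components of a binary volume given as a list of layers."""
    # Step 1: label every layer separately, never reusing a label across layers.
    labeled, next_label = [], 1
    for layer in volume:
        layer_labels, next_label = label_layer(layer, next_label)
        labeled.append(layer_labels)

    # Step 2: merge the labels of voxels that touch across adjacent layers.
    parent = list(range(next_label))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for upper, lower in zip(labeled, labeled[1:]):
        for row_u, row_l in zip(upper, lower):
            for a, b in zip(row_u, row_l):
                if a and b:
                    ra, rb = find(a), find(b)
                    if ra != rb:
                        parent[max(ra, rb)] = min(ra, rb)

    # Step 3: replace every label by the representative of its class.
    return [[[find(v) if v else 0 for v in row] for row in layer]
            for layer in labeled]
```

The union-find table holds all label equivalences for the whole volume at once, which is exactly the global storage that the local table method is designed to avoid; the sketch trades that space saving for brevity.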
2.6. PARALLEL CONNECTED COMPONENTS ALGORITHMS
The use of parallel architectures can speed up the execution of most image processing
algorithms. For example, a simple point or neighborhood operation performed on a SIMD
(Single Instruction Multiple Data) architecture with one processor per pixel (or voxel)
can be completed in a constant amount of time on an image of any size. The connected
components operator cannot be executed in the same way, because it has to propagate
labels over entire components whose sizes are arbitrary.
Different parallel architectures lead to different algorithms. Danielsson and Tanimoto
(1983) described and analyzed a number of different algorithms for different architectures.
The most common algorithm is parallel propagation on a SIMD machine. The idea of the
algorithm is to start with a set of seed pixels, at least one per region of interest. Each
seed pixel gets a unique label. At each iteration of the algorithm, the label of a pixel is
propagated to its neighbors, using some suitable method for resolving conflicts as in the
sequential algorithms. When there is no further change to the image at some iteration,
the process is complete. A simple example of this procedure is an algorithm that begins
by assigning a unique label to each pixel of the image and then, at each iteration, replaces
the value of a pixel by the minimum of the values of itself and its neighbors. Manohar
and Ramapriyan (1989) present other, more advanced algorithms.
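The minimum-label example can be simulated on a sequential machine to see how many iterations it needs. The Python sketch below assumes 4-neighbor propagation, uses 0 for background pixels, and copies the label array at each step so that every pixel reads only values from the previous iteration, as a SIMD machine would; the function name propagate_min_labels is hypothetical.

```python
def propagate_min_labels(image):
    """Simulate synchronous min-label propagation with 4-neighbors.

    Returns (labels, iterations); background pixels keep label 0.
    """
    rows, cols = len(image), len(image[0])
    # Every black pixel starts with a unique positive label.
    labels = [[r * cols + c + 1 if image[r][c] else 0 for c in range(cols)]
              for r in range(rows)]
    iterations = 0
    while True:
        new = [row[:] for row in labels]  # all pixels update from the old state
        for r in range(rows):
            for c in range(cols):
                if not image[r][c]:
                    continue
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and image[nr][nc]:
                        new[r][c] = min(new[r][c], labels[nr][nc])
        if new == labels:
            # No pixel changed, so the labeling is complete.
            return labels, iterations
        labels = new
        iterations += 1
```

On a single row of five black pixels the smallest label needs four iterations to reach the far end, which shows directly why the running time depends on the extent of a component rather than on the image size alone.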
If there is one processor for each pixel of the image, the complexity of the parallel
propagation algorithm is O(D), where D is the number of iterations needed to propagate
a label from each pixel to every other pixel in the same component. Each iteration takes
a constant amount of time, since it involves a neighborhood of constant size. When the
components are convex we have D < n for an n × n image. If the components can be
nonconvex, D can be $O(n^2)$.
If the machine has only m × m processors, m ≪ n, the image can be divided into $n^2/m^2$
windows, each of size m × m, and the parallel algorithm can be applied to each window;
equivalences must also be propagated between adjacent windows. This approach can be
used either on a SIMD machine or on a distributed multiprocessor machine.

Figure 10. Illustration of the Lumia 3D connected components algorithm. (a) Input binary
3-layer image. (b) Image labels after the two-dimensional connected components algorithm
has been run on each layer separately, using labels that decrease from layer 1 to layer 3.
(c) Image labels after the top-down pass of the three-dimensional part of the algorithm.
(d) Image labels after the bottom-up pass of the three-dimensional part of the algorithm.
These are the final component labels.
Danielsson and Tanimoto showed that using a pyramid architecture can speed up parallel
algorithms, and presented a components algorithm for a SIMD pyramid machine. In
such a machine, the base is an array of $2^n \times 2^n$ nodes; the first (bottommost) level above
the base has one node (the parent) for each square block of four nodes in the base; the
second level above the base has one node for each square block of four nodes in the first
level; and so on, so that the topmost level has only a single node. The neighborhood of a
node in a pyramid includes itself, its 8-neighbors on its own level, its parent, and its
four children.
The operation ANDPYR(X) in a pyramid X with a binary image in its base and zeroes
everywhere else is defined to compute the new value of each node at each level as the
logical AND of the old values of its four children. The conditional dilation operation
[DIL | Y](X) is defined to dilate the black pixels in pyramid X, i.e. to propagate black
values (1's) to all their pyramidal neighbors, on the condition that these neighbors are
black pixels in pyramid Y. If the base of X initially contains a single black seed pixel,
the base of Y initially contains the binary image to be labeled, and the pixels in the
remaining nodes of X and Y are initially white, then the propagation algorithm for the
component of the binary image that contains the single seed pixel is given by
$\mathrm{PYRAMID\_FILL}(X, Y) = [\mathrm{DIL} \mid \mathrm{ANDPYR}^{l}(Y)]^{*}(X)$
where the exponents denote repetition and $l$ is the height of the pyramid ($n$, for a
$2^n \times 2^n$ base). Figure 11 illustrates the PYRAMID_FILL algorithm on a simple image.
The overall complexity of the algorithm depends on the size and shape of the component
of the binary image that contains the seed pixel. If the component is convex, the
complexity is proportional to the logarithm of its diameter.
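The effect of applying ANDPYR repeatedly can be sketched in Python. The code below computes, level by level from the base upward, the pyramid that results once the AND values have reached the top, on the assumption that the base itself is left unchanged by the operation. It does not implement the conditional dilation or PYRAMID_FILL, and the function name and_pyramid is hypothetical.

```python
def and_pyramid(base):
    """Build all levels above a 2^n x 2^n binary base by ANDing four children.

    Returns a list of levels, base first and the single top node last.
    """
    size = len(base)
    if size == 0 or size & (size - 1) or any(len(row) != size for row in base):
        raise ValueError("base must be a square array whose side is a power of two")
    levels = [[row[:] for row in base]]
    while size > 1:
        size //= 2
        below = levels[-1]
        # Each parent is black only if all four of its children are black.
        level = [[below[2 * r][2 * c] & below[2 * r][2 * c + 1]
                  & below[2 * r + 1][2 * c] & below[2 * r + 1][2 * c + 1]
                  for c in range(size)]
                 for r in range(size)]
        levels.append(level)
    return levels
```

A node at level k is black exactly when the whole $2^k \times 2^k$ block of base pixels beneath it is black, so the upper levels act as shortcuts across large uniformly black areas. This is a plausible reading of why the cost for a convex component grows with the logarithm of its diameter.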
3. ADJACENCY GRAPH CONSTRUCTION
3.1. STATEMENT OF THE PROBLEM
This section is concerned with the spatial relationships among regions in a symbolic
image. We assume that some kind of segmentation process has already assigned to each
pixel of the image a symbolic label that represents the name or category of the pixel. Thus
we are starting with a symbolic image. The goal is to determine the spatial adjacencies
among the connected regions that have different labels and construct a graph, the region
adjacency graph, representing these spatial adjacencies. The graph may then be used in
higher-level recognition/analysis algorithms. Figure 12 shows a simple symbolic image
representing a segmentation and the corresponding region adjacency graph.
3.2. A SPACE-EFFICIENT ALGORITHM
The algorithm for constructing a region adjacency graph is straightforward. It processes
the image, looking at the current row and the one above it. It detects horizontal and
vertical adjacencies (and if 8-adjacency is specified, diagonal adjacencies) between pixels
with different labels. As new adjacencies are detected, new edges are added to the region
adjacency graph data structure being constructed.
There are two issues related to the efficiency of this algorithm. The first involves space.
It is possible for an image to have tens of thousands of labels. In this case, it may not be
feasible, or at least not appropriate in a paging environment, to keep the entire structure
in internal memory at once. The second issue involves execution time. When scanning
an image, pixel by pixel, the same adjacency (i.e. the adjacency of the same two region
labels) will be detected over and over again. It is desirable to enter each adjacency into
the data structure as few times as possible.
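The basic row-pair scan can be sketched in Python as follows. This is only the straightforward version: it keeps every edge in an in-memory set, so it addresses neither the space issue nor the repeated-detection issue raised above, although the set does discard duplicate edges. The function name region_adjacency_graph and the representation of the graph as a set of label pairs are assumptions made for illustration.

```python
def region_adjacency_graph(image, eight_adjacency=False):
    """Return the set of adjacent label pairs (a, b) with a < b.

    The image is a list of rows of comparable labels (for example integers).
    """
    edges = set()
    for r, row in enumerate(image):
        above = image[r - 1] if r > 0 else None
        for c, label in enumerate(row):
            candidates = []
            if c > 0:
                candidates.append(row[c - 1])            # left neighbor
            if above is not None:
                candidates.append(above[c])              # upper neighbor
                if eight_adjacency:
                    if c > 0:
                        candidates.append(above[c - 1])  # upper-left neighbor
                    if c + 1 < len(above):
                        candidates.append(above[c + 1])  # upper-right neighbor
            for other in candidates:
                if other != label:
                    edges.add((min(label, other), max(label, other)))
    return edges
```

For the 3 x 3 image with rows (1, 1, 2), (1, 3, 2), and (3, 3, 2), the adjacency between labels 1 and 3 is detected three times during the scan but stored once, which is the redundancy the second efficiency issue refers to.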
