Information Processing Letters 59 (1996) 289-294
Parallel maximum independent set in convex bipartite graphs
Artur Czumaj a,*, Krzysztof Diks b,1, Teresa M. Przytycka c,2
a Heinz Nixdorf Institute and Department of Mathematics & Computer Science, University of Paderborn, D-33095 Paderborn, Germany
b Instytut Informatyki, Uniwersytet Warszawski, PL-02-097 Warszawa, Poland
c Department of Computer Science, University of Maryland, A.V. Williams Bldg., College Park, MD 20742, USA
Received 20 January 1995; revised 19 August 1996
Communicated by M.J. Atallah
Abstract
A bipartite graph $G = (V, W, E)$ is called convex if the vertices in $W$ can be ordered in such a way that the elements of $W$ adjacent to any vertex $v \in V$ form an interval (i.e. a sequence of consecutively numbered vertices). Such a graph can be represented in a compact form that requires $O(n)$ space, where $n = \max\{|V|, |W|\}$. Given a convex bipartite graph $G$ in the compact form, Dekel and Sahni designed an $O(\log^2 n)$-time, $n$-processor EREW PRAM algorithm to compute a maximum matching in $G$. We show that the matching produced by their algorithm can be used to construct optimally in parallel a maximum set of independent vertices. Our algorithm runs in $O(\log n)$ time with $n/\log n$ processors on an Arbitrary CRCW PRAM.
Keywords: Bipartite graphs; Convex graphs; Independent set; PRAM algorithms
1. Introduction
An independent set of a graph is a subset of its vertices such that no two vertices in the subset are adjacent. The problem of finding a maximum cardinality independent set (or shortly, the MIS problem) is one of the most fundamental problems in graph theory. If there are no restrictions on the input graph the MIS problem is known to be NP-complete. However, in the case of bipartite graphs the MIS problem is closely related to a maximum matching problem and hence it can be solved in polynomial time [6].

* Corresponding author. Supported in part by DFG-Graduiertenkolleg “Parallele Rechnernetzwerke in der Produktionstechnik”, ME 872/4-1.
1 Partly supported by EC Cooperative Action IC-1000 (project ALTEC: Algorithms for Future Technologies).
0020-0190/96/$12.00 Copyright © 1996 Published by Elsevier Science B.V. All rights reserved.
PII S0020-0190(96)00131-7

A subset $M$ of edges of a graph $G = (V, E)$ is a matching if no two edges in $M$ are incident to the same vertex; $M$ is of maximum cardinality (or simply, a maximum matching) if it contains the maximum number of edges. The problem of finding a maximum cardinality matching is called the maximum matching problem.
In this paper we address the problem of finding in
parallel a maximum independent set in a special class
of graphs - convex bipartite graphs.
Let $G = (V, W, E)$ be an undirected bipartite graph, where $V$, $W$ are sets of vertices and $E$ is a set of edges of the form $(v, w)$, with $v \in V$ and $w \in W$. The graph $G$ is convex if there is an ordering “<” of the elements of $W$ such that the vertices of $W$ connected to any $v \in V$ form an interval in this ordering, i.e. for any $v \in V$ and $w_1, w, w_2 \in W$ such that $w_1 < w < w_2$,

$$(v, w_1) \in E \ \text{and} \ (v, w_2) \in E \ \Rightarrow \ (v, w) \in E.$$

Let $n = \max\{|V|, |W|\}$. The number $n$ is called the size of $G$. Without loss of generality we will consider only graphs without isolated vertices.

Convex bipartite graphs were originally discussed by Glover [7] and next by Lipski and Preparata [9]. It is typical for applications involving convex bipartite graphs that the graph $G = (V, W, E)$ is given by specifying the ordering “<” and by specifying the endpoints $\mathrm{beg}(v)$ and $\mathrm{end}(v)$ of the interval of the elements of $W$ connected to $v$, for every $v \in V$. Observe that the size of such a representation does not depend on the number of edges in the graph. We call this representation compact. If additionally the vertices of $V$ are ordered with respect to the end values then such a representation is called the sorted compact representation. Note that the sorted compact representation can be obtained from the compact representation by integer sorting.

An example of a convex graph and its sorted compact representation is given in Fig. 1.

[Fig. 1. A convex bipartite graph and its compact representation.]

Given the sorted compact representation of a convex graph $G = (V, W, E)$, Dekel and Sahni [4] designed an $O(\log^2 n)$-time, $n$-processor EREW PRAM algorithm to compute a maximum matching in $G$. The algorithm of Dekel and Sahni is an example of a parallel greedy algorithm. It produces the (greedy) matching $M$ which has the following properties (recall that both sets $V$, $W$ are ordered):
• the smallest vertex in $V$ is matched with its smallest neighbor in $W$;
• if $v \in V$ is not the smallest vertex and it is matched in $M$ with a vertex $w \in W$ (i.e. $(v, w) \in M$) then $w$ is the smallest vertex among neighbors of $v$ in $W$ not matched by any vertex $v' \in V$ smaller than $v$.
In the case of convex bipartite graphs such a matching is a maximum one.
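
To make the sorted compact representation and the two greedy properties concrete, the following is a minimal sequential sketch in Python. It is not the parallel algorithm of Dekel and Sahni; it only builds a matching with the stated properties by scanning $V$ in order. The function name `greedy_matching`, the list-of-pairs encoding and the sample instance are assumptions made for illustration.

```python
def greedy_matching(intervals, num_w):
    """Sequential sketch of the greedy matching described above.

    intervals[v] = (beg, end) for the v-th vertex of V, with V already sorted
    by end value (sorted compact representation); W is the set 1..num_w.
    Returns a list mate where mate[v] is the matched W-vertex or None.
    """
    taken = [False] * (num_w + 1)          # taken[w]: w already matched
    mate = []
    for beg, end in intervals:
        # smallest neighbour of v not matched by any smaller vertex of V
        w = next((x for x in range(beg, end + 1) if not taken[x]), None)
        if w is not None:
            taken[w] = True
        mate.append(w)
    return mate


# Hypothetical instance: |V| = 7, |W| = 6, intervals sorted by end value.
intervals = [(1, 2), (1, 2), (2, 3), (3, 3), (4, 4), (4, 4), (5, 6)]
print(greedy_matching(intervals, 6))       # [1, 2, 3, None, 4, None, 5]
```

The naive scan inside each interval costs time proportional to the interval length, so this sketch is not linear-time; it is meant only to show what the representation stores and which matching the properties single out.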
In the sequential setting both a maximum matching and a maximum independent set in a convex bipartite graph can be computed in linear time [5,9].
In this paper, we show that given the greedy matching in a convex bipartite graph $G$ one can compute a maximum independent set in $G$ in $O(\log n)$ time with $n/\log n$ CRCW processors, where $n$ is the size of the input graph.
To this end we give a parallel implementation of the following well known algorithm for computing a maximum independent set in a bipartite graph $G = (V, W, E)$ given a maximum matching $M$ [6].
Algorithm MIS.
Direct every edge $e \in M$ from $W$ to $V$ and every edge $e \in E \setminus M$ from $V$ to $W$.
Let $V_0$ be the set of unmatched vertices in $V$; find the sets $V_1 \subseteq V$ and $W_1 \subseteq W$ of vertices reachable from $V_0$ ($V_1$ includes $V_0$).
Construct the maximum independent set as $I = V_1 \cup (W \setminus W_1)$.
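
The three steps of Algorithm MIS can be sketched sequentially for a general bipartite graph as a breadth-first search along the directed edges. This is only an illustrative sketch: the adjacency-list encoding, the function name `mis_from_matching` and the tiny example are assumptions, and the supplied matching is assumed to be maximum.

```python
from collections import deque


def mis_from_matching(adj, num_w, mate_of_v):
    """Sketch of Algorithm MIS for a bipartite graph with a maximum matching.

    adj[v]       : list of W-neighbours of V-vertex v (W = 0..num_w-1)
    mate_of_v[v] : W-vertex matched with v, or None
    Returns (V1, W minus W1); their union is the independent set I.
    """
    mate_of_w = [None] * num_w
    for v, w in enumerate(mate_of_v):
        if w is not None:
            mate_of_w[w] = v
    v1 = {v for v, w in enumerate(mate_of_v) if w is None}   # V0
    w1 = set()
    queue = deque(v1)
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w != mate_of_v[v] and w not in w1:   # edge of E \ M: V -> W
                w1.add(w)
                u = mate_of_w[w]                    # edge of M: W -> V
                if u is not None and u not in v1:
                    v1.add(u)
                    queue.append(u)
    return v1, set(range(num_w)) - w1


# Hypothetical example: V = {0, 1, 2}, W = {0, 1}, M = {(0, 0), (2, 1)}.
print(mis_from_matching([[0], [0], [0, 1]], 2, [0, None, 1]))  # ({0, 1}, {1})
```

In the example the result has $|V| + |W| - |M| = 3$ vertices, as expected for a maximum independent set in a bipartite graph.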
Thus the problem is reduced to finding all the vertices in $G$ which are reachable from the unmatched vertices in $V$. In the parallel setting, this problem (for general bipartite graphs) falls into the group of problems with the so-called “transitive closure bottleneck” [8]. However, if the input graph is convex and represented in the sorted compact form and if the matching $M$ is the greedy one then, as we show, our reachability problem has an interesting structure allowing us to design an optimal parallel algorithm for the maximum independent set problem.
2. The algorithm
Consider a convex bipartite graph $G = (V, W, E)$ given in the sorted compact form. For simplicity of further considerations we assume that both $V$ and $W$ are given as the sequences of integers 1 to $|V|$ and 1 to $|W|$, respectively. Let $M$ be the greedy matching in $G$.

Direct every edge $e \in M$ from $W$ to $V$ and every edge $e \in E \setminus M$ from $V$ to $W$. (We do not assign the orientation to every single edge, as this would require $O(|E|)$ work, but simply add the corresponding information to the compact representation of the graph.) A directed edge going from a vertex $u$ to a vertex $w$ will be denoted by $u \to w$.

For a set $X \subseteq V \cup W$ let $N(X)$ denote the set of “outgoing” neighbors of the vertices in $X$, i.e.

$$N(X) = \{ w \mid u \in X \text{ and } u \to w \text{ is a directed edge in } G \}.$$

For every integer $k \ge 1$ let $R^k(X)$ be the set of vertices defined as follows:

$$R^k(X) = \begin{cases} N(X), & k = 1, \\ N(N(R^{k-1}(X))) \cup R^{k-1}(X), & k > 1. \end{cases}$$

Let $R(X)$ denote the set of all vertices reachable from $X$ in an odd number of steps. Observe that if $u$ is reachable from $X$ then it is reachable in at most $n - 1$ steps. Thus $R(X) = \bigcup_{k=1}^{\lceil n/2 \rceil} R^k(X)$.

For simplicity we will write $N(i)$, $R^k(i)$, $R(i)$ instead of $N(\{i\})$, $R^k(\{i\})$, $R(\{i\})$, respectively.

Let $V_0$ be the set of all unmatched vertices in $V$. Our goal, as pointed out in the Introduction, is to compute the sets $W_1 = R(V_0)$ and $V_1 = V_0 \cup N(W_1)$. It is sufficient to show how to compute $R(V_0)$, the set of all vertices in $W$ reachable from the unmatched vertices in $V$. (Since $V_1 = V_0 \cup \{ v \in V \mid (v, u) \in M \text{ and } u \in W_1 \}$, one can easily compute $V_1$ from $W_1$ and $M$ in constant time with $n$ processors.)

The basic idea of our approach is as follows. First, we show that for each $i \in V_0$, $R(i)$ is an interval whose second endpoint is $\mathrm{end}(i)$. The first endpoints for all $i$ can be computed in $O(1)$ time with $n$ processors. Then, given a sequence of intervals sorted with respect to end values, we show how to compute the representation of the union of these intervals as a union of disjoint intervals. With this representation, we can decide if $i \in R(V_0)$, for every vertex $i \in W$, in constant time with a linear number of processors.

In order to present our algorithm precisely we need the following lemmas. (Recall that the graph is given in the sorted compact form.)

Lemma 1. For every $i \in V_0$ and every integer $k \ge 1$ the elements of $R^k(i)$ form an interval $[b^k(i), e^k(i)]$ in $W$.

Proof. The proof is by induction on $k$.
If $k = 1$ then $R^1(i) = [b^1(i), e^1(i)]$, where $b^1(i) = \mathrm{beg}(i)$ and $e^1(i) = \mathrm{end}(i)$. Assume that $k > 1$ and that the lemma holds for $k - 1$.
Let $R^{k-1}(i) = [b^{k-1}(i), e^{k-1}(i)]$. For $j \in N(R^{k-1}(i))$ let $m(j)$ be the vertex for which $(j, m(j)) \in M$. Then $N(j) = [\mathrm{beg}(j), \mathrm{end}(j)] \setminus \{m(j)\}$. Thus by the inductive hypothesis, it follows that $R^k(i) = [b^k(i), e^k(i)]$, where

$$b^k(i) = \min(\{b^{k-1}(i)\} \cup \{\mathrm{beg}(j) \mid j \in N(R^{k-1}(i))\})$$

and

$$e^k(i) = \max(\{e^{k-1}(i)\} \cup \{\mathrm{end}(j) \mid j \in N(R^{k-1}(i))\}).$$ □

Lemma 2. Let $i \in V_0$ and $k \ge 1$. If $j \in N(R^k(i))$ then $j < i$.

Proof. Induction on $k$.
$k = 1$: If $j \in N(R^1(i))$ then there is a unique vertex $w \in R^1(i)$ such that $(j, w) \in M$. Since $i$ is unmatched and $M$ is the greedy matching, $j < i$.
$k > 1$: Assume that the lemma holds for $k - 1$. Consider the set $N(R^k(i) \setminus R^{k-1}(i))$. If this set is empty then the lemma obviously holds for $k$. Suppose that it is non-empty and let $j \in N(R^k(i) \setminus R^{k-1}(i))$. Then there are $p \in R^{k-1}(i)$, $q \in N(R^{k-1}(i))$ and $r \in R^k(i) \setminus R^{k-1}(i)$ such that $(q, p) \in M$, $(q, r) \in E \setminus M$ and $(j, r) \in M$. By the induction hypothesis, $q < i$ and therefore $\mathrm{end}(q) \le \mathrm{end}(i)$. Since $R^{k-1}(i)$ is an interval in $W$, it follows that $r < p$. This observation and the fact that $M$ is the greedy matching imply $j < q$, otherwise the edge $(q, r)$ would belong to $M$ instead of $(q, p)$ and $(j, r)$. □

Lemma 3. If $i \in V_0$ then $R(i) = [b(i), e(i)]$, where $b(i) = b^{\lceil n/2 \rceil}(i)$ and $e(i) = \mathrm{end}(i)$.

Proof. The lemma follows immediately from Lemmas 1, 2 and the fact that $\mathrm{end}(j) \le \mathrm{end}(i)$ for every $j < i$. □
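
As a sanity check of Lemma 3 on a small instance, the set $R(i)$ can be computed by brute force, following directed edges from an unmatched vertex $i$. The sketch below is an illustration only: the instance, the hand-computed greedy matching `mate` and the function name `reachable_w` are assumptions, and the search works on the compact representation by expanding each interval explicitly.

```python
def reachable_w(intervals, mate, i):
    """Brute-force R(i): W-vertices reachable from the unmatched V-vertex i.

    intervals[v] = (beg, end) in sorted compact form, W = 1..|W|;
    mate[v] = W-vertex matched with v in the greedy matching, or None.
    """
    owner = {w: v for v, w in enumerate(mate) if w is not None}
    seen_v, seen_w, stack = {i}, set(), [i]
    while stack:
        v = stack.pop()
        beg, end = intervals[v]
        for w in range(beg, end + 1):
            if w == mate[v] or w in seen_w:   # matched edge points W -> V
                continue
            seen_w.add(w)                     # edge v -> w of E \ M
            u = owner.get(w)                  # edge w -> u of M
            if u is not None and u not in seen_v:
                seen_v.add(u)
                stack.append(u)
    return seen_w


# Hypothetical instance; mate is its greedy matching, worked out by hand.
intervals = [(1, 2), (1, 2), (2, 3), (3, 3), (4, 4), (4, 4), (5, 6)]
mate = [1, 2, 3, None, 4, None, 5]
print(sorted(reachable_w(intervals, mate, 3)))   # [1, 2, 3]
print(sorted(reachable_w(intervals, mate, 5)))   # [4]
```

For both unmatched vertices the result is an interval whose right endpoint is $\mathrm{end}(i)$, in line with the lemma; for vertex 3 the left endpoint lies strictly below $\mathrm{beg}(3) = 3$.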
Our next step is to show how to compute $b(i)$ for every $i \in V_0$ (recall that $e(i) = \mathrm{end}(i)$).

For every $i \in V$ let $Q(i)$ be the set of all its neighbors (not only the outgoing ones) matched in $M$ and let $q(i)$ be a vertex in $N(Q(i))$ such that $\mathrm{beg}(q(i)) = \min(\{\mathrm{beg}(j) \mid j \in N(Q(i))\})$. Define a function $\mathrm{next} : V \to V$ as follows:

$$\mathrm{next}(i) = \begin{cases} i, & \mathrm{beg}(i) \le \mathrm{beg}(q(i)), \\ q(i), & \text{otherwise.} \end{cases}$$

Informally, $\mathrm{next}(i)$ is the vertex of $V$ with the smallest beg value that can be reached from the vertex $i$ in at most 2 steps via its matched neighbors in $W$.

Let $\mathrm{next}^0(i) = i$, $\mathrm{next}^k(i) = \mathrm{next}(\mathrm{next}^{k-1}(i))$ for every $k \ge 1$, and let $\mathrm{next}^*(i) = j$ be such that $\mathrm{next}(j) = j$ and $\mathrm{next}^k(i) = j$ for some $k \ge 0$. Observe that if $i \ne \mathrm{next}(i)$ then $\mathrm{beg}(\mathrm{next}(i)) < \mathrm{beg}(i)$. Thus pointers next are the parent pointers of a rooted forest. Furthermore, the function next has the following important property:

Lemma 4. For every $i \in V_0$ and every integer $k \ge 1$, $R^k(i) = [\mathrm{beg}(\mathrm{next}^{k-1}(i)), \mathrm{end}(i)]$.

Proof. The proof is by induction on $k$.
Since $\mathrm{next}^0(i) = i$ and $R^1(i) = [\mathrm{beg}(i), \mathrm{end}(i)]$, the lemma holds for $k = 1$.
Assume that $k > 1$ and the lemma holds for every positive integer $l < k$. By the induction hypothesis, $R^{k-1}(i) = [\mathrm{beg}(\mathrm{next}^{k-2}(i)), \mathrm{end}(i)]$. It follows from the definition of the function next that $R^k(i) \supseteq [\mathrm{beg}(\mathrm{next}^{k-1}(i)), \mathrm{end}(i)]$. Suppose that there is $j \in R^k(i) \setminus [\mathrm{beg}(\mathrm{next}^{k-1}(i)), \mathrm{end}(i)]$. Then there are $p \in R^{k-1}(i)$, $t \in N(R^{k-1}(i))$ such that $(t, p) \in M$ and $(t, j) \in E \setminus M$. Notice that $\mathrm{beg}(t) \le j < \mathrm{beg}(\mathrm{next}^{k-1}(i))$. Let $k_0 \le k - 1$ be the smallest positive integer such that $p \in R^{k_0}(i)$. By the induction hypothesis, $p \in [\mathrm{beg}(\mathrm{next}^{k_0-1}(i)), \mathrm{end}(i)]$ and $p \notin [\mathrm{beg}(\mathrm{next}^{k_0-2}(i)), \mathrm{end}(i)]$. Then

$$\mathrm{beg}(\mathrm{next}^{k_0}(i)) \le \mathrm{beg}(t) \le j < \mathrm{beg}(\mathrm{next}^{k-1}(i)).$$

Since $\mathrm{beg}(\mathrm{next}^{k+1}(i)) \le \mathrm{beg}(\mathrm{next}^{k}(i))$ for every $k \ge 0$, we get a contradiction. □

Corollary 5. For every $i \in V_0$, $R(i) = [\mathrm{beg}(\mathrm{next}^*(i)), \mathrm{end}(i)]$.

If all $\mathrm{next}(j)$ are known then one can compute $\mathrm{next}^*(j)$ in $O(\log n)$ time with $n/\log n$ processors using the tree contraction technique. We must be careful here, as tree contraction algorithms are usually presented in the context of a tree where each internal node has an associated list of its children. Unfortunately in our application this is not the case. For completeness of the presentation we show, in the Appendix, a technique that avoids this restriction.

We conclude that computing the intervals $R(i)$ reduces to computing the function next. Function next can be computed in $O(\log n)$ time with $n/\log n$ processors as follows.

Let $\overline{W} = (w_1, \ldots, w_{|M|})$ be the increasing sequence of the vertices of $W$ matched in $M$. $\overline{W}$ is easy to compute using the prefix computation. Moreover, using the prefix computations one can easily compute two tables $A[1..|W|]$ and $C[1..|W|]$ such that for every $w \in W$, $A[w]$ is the largest index $j$ such that $w_j \le w$ and $C[w]$ is the smallest index $j$ such that $w_j \ge w$. For every $i \in V$ let $a(i)$ and $c(i)$ be the smallest and largest indices such that $w_{a(i)} \ge \mathrm{beg}(i)$ and $w_{c(i)} \le \mathrm{end}(i)$. Indices $a(i)$ and $c(i)$ can be computed with linear work using tables $A$ and $C$. Observe now that $q(i)$ is a vertex with the minimum beg value among all vertices in $V$ matched with vertices $w_{a(i)}, w_{a(i)+1}, \ldots, w_{c(i)}$. In order to compute $q(i)$ (and hence $\mathrm{next}(i)$), for all $i \in V$, one can apply the algorithm for the range minimum searching problem [2]. It takes $O(\log n)$ time on an $n/\log n$-processor CREW PRAM.

Thus, we know how to compute a representation of $R(V_0)$ as a union of $n$ intervals sorted with respect to the second endpoint. Our final step is to simplify this representation to the union of non-intersecting intervals.

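
The chain of reductions above (from $q(i)$ to next, from next to $\mathrm{next}^*$, from $\mathrm{next}^*$ to the intervals $R(i)$, and finally to a disjoint union) can be traced sequentially on a small instance. The sketch below is not the parallel procedure: it scans each interval directly instead of using the tables $A$, $C$ and range minima, follows parent pointers one at a time instead of using tree contraction, and merges intervals with a simple stack sweep. The function names, the instance and the hand-computed matching are assumptions for illustration.

```python
def next_pointers(intervals, mate):
    """next(i) for every i in V, by a direct scan of each interval."""
    owner = {w: v for v, w in enumerate(mate) if w is not None}  # W -> V via M
    nxt = []
    for i, (beg, end) in enumerate(intervals):
        # N(Q(i)): V-vertices matched with W-vertices inside [beg(i), end(i)]
        partners = [owner[w] for w in range(beg, end + 1) if w in owner]
        q = min(partners, key=lambda v: intervals[v][0], default=i)
        nxt.append(i if beg <= intervals[q][0] else q)
    return nxt


def reach_intervals(intervals, mate):
    """R(i) = [beg(next*(i)), end(i)] for every unmatched i (Corollary 5)."""
    nxt = next_pointers(intervals, mate)
    result = []
    for i, w in enumerate(mate):
        if w is None:
            root = i
            while nxt[root] != root:       # walk parent pointers to next*(i)
                root = nxt[root]
            result.append((intervals[root][0], intervals[i][1]))
    return result


def merge_sorted_by_end(ivs):
    """Disjoint union of closed integer intervals given in order of end value."""
    out = []
    for b, e in ivs:
        while out and out[-1][1] >= b:     # previous interval meets [b, e]
            b = min(b, out.pop()[0])
        out.append((b, e))
    return out


# Hypothetical instance; mate is its greedy matching, worked out by hand.
intervals = [(1, 2), (1, 2), (2, 3), (3, 3), (4, 4), (4, 4), (5, 6)]
mate = [1, 2, 3, None, 4, None, 5]
print(next_pointers(intervals, mate))                          # [0, 1, 1, 2, 4, 5, 6]
print(merge_sorted_by_end(reach_intervals(intervals, mate)))   # [(1, 3), (4, 4)]
```

With $W_1 = \{1, 2, 3, 4\}$ read off from the merged intervals, the independent set $I = V_1 \cup (W \setminus W_1)$ for this instance consists of the six vertices of $V$ that are unmatched or matched into $W_1$, together with the $W$-vertices 5 and 6, giving $|V| + |W| - |M| = 8$ vertices.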
Lemma 6. Let $I_1, \ldots, I_p$, where $I_i = [b_i, e_i]$, be a sequence of intervals such that for $i < j$, $e_i \le e_j$. Then the set of intervals $I'_1, \ldots, I'_q$ such that for any $i \ne j$, $I'_i \cap I'_j = \emptyset$ and $I_1 \cup I_2 \cup \cdots \cup I_p = I'_1 \cup I'_2 \cup \cdots \cup I'_q$ can be computed in $O(\log n)$ time with $n/\log n$ processors.

Proof. First, we eliminate every interval $I_j$ such that $I_j$ is contained in an interval $I_{j'}$ with $j' > j$. To find these intervals we consider the sequence of the first endpoints of the intervals $I_1, \ldots, I_p$. For each element in this sequence we find its closest dominating (i.e. not larger than the given element) successor. If such a successor exists, then the interval is eliminated. This can be done in $O(\log n)$ time with $n/\log n$ processors [2]. In this way we obtain a sequence of intervals sorted with respect to both endpoints. Now, the intervals $I'_1, \ldots, I'_q$ can be computed using the list ranking technique. □

Thus we can conclude the paper with the following theorem.

Theorem 7. Given a (sorted, compact representation of a) convex bipartite graph $G = (V, W, E)$ of size $n$ and the greedy matching $M$, one can compute a maximum independent set of vertices in $G$ in $O(\log n)$ time using $n/\log n$ processors of a CRCW PRAM.

Appendix A. Tree contraction in the absence of an Euler tour

In this Appendix, we show how to solve optimally the rooting problem in a forest. We are given a forest $F$ with nodes $\{1, \ldots, n\}$ defined by the parent relation (i.e., each node $v$ has a pointer $p(v)$ to its parent). A node $v$ is a root if $p(v) = v$. The rooting problem is to find for each node $v$ the root $r(v)$ of the tree it belongs to.

Given an Euler tour of each tree (or given for every node the list of its children), the rooting problem can be solved in $O(\log n)$ time with $n/\log n$ processors using standard tree-contraction algorithms [10,1]. However this technique cannot deal with unbounded degree trees when no Euler tour is given. We show an approach that circumvents this assumption and design an $O(\log n)$-time algorithm for the rooting problem that employs $O(n)$ operations.

Our algorithm consists of three phases. First, we reduce the problem of size $n$ to the problem of size $n/\log n$. Then we solve the smaller problem using the non-optimal algorithm of Miller and Reif [10], and finally combine the information from the smaller problem to all other nodes in the forest.

A.1. The first phase

The first phase reduces the number of vertices in the forest to at most $n/\log n$. Let $P = n/\log n$, let $A$ be the array containing all the nodes of $F$, $B$ an empty array of length $n$, and for each node $v$, $\mathrm{succ}(v) = p(v)$. We repeat the following process until the size of $A$ is smaller than $n/\log n$.

Finding all leaves and chains. Split the nodes in $A$ into three groups: (i) the leaves $L$, that is, all nodes $v \in A$ such that for no $u \in A$, $\mathrm{succ}(u) = v$; (ii) the nodes on chains $C$, that is, the nodes $v \in A - L$ such that for exactly one $u \in A$, $\mathrm{succ}(u) = v$; and (iii) the other nodes. One can easily verify which node belongs to which of these sets in $O(|A|/P)$ time with $P$ processors, and then, using the prefix sums algorithm of Cole and Vishkin [3], rearrange them to store them in consecutive parts of $A$ in time $O(|A|/P + \log|A| / \log\log|A|)$ with $P$ processors.

Remove all leaves. If $v \in L$ then we set the pointer to the node which will find the root of $v$, $PT(v) = \mathrm{succ}(v)$, and remove $v$ from the array $A$. All leaves are stored at the first free entries of the array $B$.

Halve the chains. $C$ is a collection of lists. Using the algorithm of Cole and Vishkin [3], find a maximal independent set $MIS$ in $C$ in $O(|C|/P + \log|C| / \log\log|C|)$ time with $P$ processors. Additionally we require that the last element from each list belongs to $MIS$.
• For each node $v \in C - MIS$, if $\mathrm{succ}(v) \in MIS$, then $PT(v) = \mathrm{succ}(v)$, and otherwise $PT(v) = \mathrm{succ}(\mathrm{succ}(v))$. Remove $v$ from $A$ and store all nodes from $C - MIS$ at the first free entries of $B$.
• For each node $v$ from $MIS$ that is not the last vertex on a chain, if $\mathrm{succ}(\mathrm{succ}(v)) \in MIS$ then set $\mathrm{succ}(v) = \mathrm{succ}(\mathrm{succ}(v))$, and otherwise set $\mathrm{succ}(v) = \mathrm{succ}(\mathrm{succ}(\mathrm{succ}(v)))$.
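
The two pointer-update rules for halving a chain can be traced on a single list. The sketch below is a sequential illustration under stated assumptions: the maximal independent set is supplied by hand rather than computed with the algorithm of Cole and Vishkin, the arrays $A$ and $B$ are replaced by Python dictionaries, and the name `halve_chain` and the example forest are hypothetical.

```python
def halve_chain(succ, chain, mis):
    """One 'halve the chains' step on a single chain.

    succ  : dict node -> successor (parent pointer) before the step
    chain : nodes of the chain in list order, chain[k + 1] == succ[chain[k]]
    mis   : maximal independent set of the chain that contains chain[-1]
    Returns (pt, new_succ): the PT pointers of the removed chain nodes and
    the successor map of the nodes that remain.
    """
    last = chain[-1]
    pt = {}
    new_succ = {v: s for v, s in succ.items() if v not in chain or v in mis}
    for v in chain:
        if v not in mis:                   # v is removed and stored in B
            s = succ[v]
            pt[v] = s if s in mis else succ[s]
        elif v != last:                    # v stays; jump over removed nodes
            s2 = succ[succ[v]]
            new_succ[v] = s2 if s2 in mis else succ[s2]
    return pt, new_succ


# Hypothetical path 0 -> 1 -> ... -> 6 -> 7 with root 7 (succ(7) = 7).
# Node 0 is a leaf (handled by the leaf-removal step), nodes 1..6 form a chain.
succ = {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 7}
pt, new_succ = halve_chain(succ, [1, 2, 3, 4, 5, 6], {1, 3, 6})
print(pt)         # {2: 3, 4: 6, 5: 6}
print(new_succ)   # {0: 1, 1: 3, 3: 6, 6: 7, 7: 7}
```

Every removed node points, via $PT$, to a node that remains in the list, so its root can be recovered once the roots of the remaining nodes are known; all reads use the successor values from before the step, mirroring a synchronous parallel update.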
Fact 8. Phase 1 can be performed in $O(\log n)$ time with $P = n/\log n$ processors.

Proof. Let $N_t$ denote the size of $A$ before iteration $t$. Standard arguments (see e.g. [10,1]) can be applied to show that $N_{t+1} \le c\,N_t$ for a constant $c < 1$. Hence $N_{t+1} \le n c^{t}$ and the loop is iterated $T = O(\log\log n)$ times.
The running time of iteration $t$ is $O(N_t/P + \log N_t / \log\log N_t)$ with $P$ processors. Summing this together we get the running time bounded by

$$\sum_{t=1}^{T} O\left(\frac{N_t}{P} + \frac{\log N_t}{\log\log N_t}\right) \le \sum_{t=1}^{T} O\left(\frac{N_t}{P}\right) + T \cdot O\left(\frac{\log N_1}{\log\log N_1}\right) = O(\log n).$$ □

A.2. The second phase

Now we perform a standard tree-contraction algorithm (e.g. [10]) for the forest defined by the relation succ in the array $A$. It runs in $O(\log N)$ time and uses $N$ processors, where $N$ is the number of vertices in the forest. Since in our case $N = n/\log n$, this yields an $O(\log n)$-time, $n/\log n$-processor algorithm.

A.3. The third phase

In this step we have to combine the information computed for the nodes that were left after Phase 1 to obtain the pointers to the root for all the nodes in $F$. Observe that the nodes stored in $B$ are ordered with respect to the time when they were removed from $A$. This gives us a partition of $B$ into blocks of nodes that were removed at the same iteration. Since there are at most $T = O(\log\log n)$ blocks, we can analyze them successively, one by one, in the reverse order of the time when the nodes from a given block were removed. Then, using the information about the root of all the nodes in the already analyzed part of $B$, we can compute $r(v)$ using the pointer $PT(v)$. If $B_i$ denotes the size of the $i$th block, then each block can be analyzed in constant time using $B_i$ processors, or in $O(B_i/P)$ time using $P$ processors. Summing this over all blocks we get the $O(\log n)$ running time of the third phase with $P = n/\log n$ processors.

This finally leads to the following theorem.

Theorem 9. The rooting problem can be solved in $O(\log n)$ time with $n/\log n$ processors on a CRCW PRAM.

References

[1] K. Abrahamson, N. Dadoun, D.G. Kirkpatrick and T. Przytycka, A simple parallel tree contraction algorithm, J. Algorithms 10 (1989) 287-302.
[2] O. Berkman, B. Schieber and U. Vishkin, Optimal doubly logarithmic parallel algorithms based on finding all nearest smaller values, J. Algorithms 14 (1993).
[3] R. Cole and U. Vishkin, Faster optimal parallel prefix sums and list ranking, Inform. and Comput. 81 (3) (1989) 334-352.
[4] E. Dekel and S. Sahni, A parallel matching algorithm for convex bipartite graphs and applications to scheduling, J. Parallel Distributed Comput. 1 (1984) 185-205.
[5] H.N. Gabow and R.E. Tarjan, A linear-time algorithm for a special case of disjoint set union, J. Comput. System Sci. 30 (1985) 209-221.
[6] F. Gavril, Testing for equality between maximum matching and minimum node covering, Inform. Process. Lett. 6 (1977) 199-202.
[7] F. Glover, Maximum matching in a convex bipartite graph, Naval Res. Logist. Quart. 14 (1967) 313-316.
[8] R.M. Karp and V. Ramachandran, A survey of parallel algorithms for shared-memory machines, in: J. van Leeuwen, ed., Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity (Elsevier, Amsterdam, 1990) Chapter 17, pp. 869-941.
[9] W. Lipski and F.P. Preparata, Efficient algorithms for finding maximum matchings in convex bipartite graphs and related problems, Acta Inform. 15 (1981) 329-346.
[10] G.L. Miller and J.H. Reif, Parallel tree contraction, in: Proc. 26th IEEE Symp. on Foundations of Computer Science (1985) 478-489.