Lecture Notes for Algorithm Analysis and Design
Sandeep Sen
March 16, 2008

Department of Computer Science and Engineering, IIT Delhi, New Delhi 110016, India.
E-mail:
Contents

1 Model and Analysis
  1.1 Computing Fibonacci numbers
  1.2 Fast Multiplication
  1.3 Model of Computation
  1.4 Other models
    1.4.1 External memory model
    1.4.2 Parallel Model

2 Warm up problems
  2.1 Euclid's algorithm for GCD
    2.1.1 Extended Euclid's algorithm
  2.2 Finding the k-th element
    2.2.1 Choosing a random splitter
    2.2.2 Median of medians
  2.3 Sorting words
  2.4 Mergeable heaps
    2.4.1 Merging Binomial Heaps

3 Optimization I: Brute force and Greedy strategy
  3.1 Heuristic search approaches
    3.1.1 Game Trees *
  3.2 A framework for Greedy Algorithms
    3.2.1 Maximal Spanning Tree
    3.2.2 A Scheduling Problem
  3.3 Efficient data structures for MST algorithms
    3.3.1 A simple data structure for union-find
    3.3.2 A faster scheme
    3.3.3 The slowest growing function?
    3.3.4 Putting things together
    3.3.5 Path compression only

4 Optimization II: Dynamic Programming
  4.1 A generic dynamic programming formulation
  4.2 Illustrative examples
    4.2.1 Context Free Parsing
    4.2.2 Longest monotonic subsequence
    4.2.3 Function approximation
    4.2.4 Viterbi's algorithm for Expectation Maximization

5 Searching
  5.1 Skip Lists - a simple dictionary
    5.1.1 Construction of Skip-lists
    5.1.2 Analysis
  5.2 Treaps: Randomized Search Trees
  5.3 Universal Hashing
    5.3.1 Example of a Universal Hash function
  5.4 Perfect Hash function
    5.4.1 Converting expected bound to worst case bound
  5.5 A log log N priority queue

6 Multidimensional Searching and Geometric algorithms
  6.1 Interval Trees and Range Trees
    6.1.1 Two Dimensional Range Queries
  6.2 k-d trees
  6.3 Priority Search Trees
  6.4 Planar Convex Hull
    6.4.1 Jarvis March
    6.4.2 Graham's Scan
    6.4.3 Sorting and Convex hulls
  6.5 A Quickhull Algorithm
    6.5.1 Analysis
    6.5.2 Expected running time *
  6.6 Point location using persistent data structure

7 Fast Fourier Transform and Applications
  7.1 Polynomial evaluation and interpolation
  7.2 Cooley-Tukey algorithm
  7.3 The butterfly network
  7.4 Schönhage and Strassen's fast multiplication

8 String matching and finger printing
  8.1 Rabin Karp fingerprinting
  8.2 KMP algorithm
    8.2.1 Potential method and amortized analysis
    8.2.2 Analysis of the KMP algorithm
    8.2.3 Pattern Analysis
  8.3 Generalized String matching
    8.3.1 Convolution based approach

9 Graph Algorithms
  9.1 Applications of DFS
    9.1.1 Strongly Connected Components (SCC)
    9.1.2 Finding Biconnected Components (BCC)
  9.2 Path problems
    9.2.1 Bellman Ford SSSP Algorithm
    9.2.2 Dijkstra's SSSP algorithm
    9.2.3 Floyd-Warshall APSP algorithm
  9.3 Maximum flows in graphs
    9.3.1 Max Flow Min Cut
    9.3.2 Ford and Fulkerson method
    9.3.3 Edmond Karp augmentation strategy
    9.3.4 Monotonicity Lemma and bounding the iterations
  9.4 Global Mincut
    9.4.1 The contraction algorithm
    9.4.2 Probability of mincut

10 NP Completeness and Approximation Algorithms
  10.1 Classes and reducibility
  10.2 Cook Levin theorem
  10.3 Common NP complete problems
    10.3.1 Other important complexity classes
  10.4 Combating hardness with approximation
    10.4.1 Equal partition
    10.4.2 Greedy set cover
    10.4.3 The metric TSP problem
    10.4.4 Three colouring
    10.4.5 Maxcut

A Recurrences and generating functions
  A.1 An iterative method - summation
  A.2 Linear recurrence equations
    A.2.1 Homogeneous equations
    A.2.2 Inhomogeneous equations
  A.3 Generating functions
    A.3.1 Binomial theorem
  A.4 Exponential generating functions
  A.5 Recurrences with two variables

B Refresher in discrete probability and probabilistic inequalities
  B.1 Probability generating functions
    B.1.1 Probabilistic inequalities
If we have knowledge of the second moment, then the following gives a stronger result.

Chebyshev's inequality

    Pr[(X − E[X])^2 ≥ t] ≤ σ^2/t    (B.1.2)

where σ^2 is the variance, i.e. E[X^2] − E[X]^2.
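As a quick empirical illustration (a minimal sketch, not part of the original notes; the distribution and parameter values are arbitrary choices), one can compare the bound with the actual tail probability for a sum of fair coin flips:

    import random

    # Empirical check of Chebyshev's inequality for X = number of heads in
    # n fair coin flips, so E[X] = n/2 and sigma^2 = Var[X] = n/4.
    n, trials, t = 100, 20000, 100   # t bounds the squared deviation
    mean, var = n / 2, n / 4

    hits = 0
    for _ in range(trials):
        x = sum(random.randint(0, 1) for _ in range(n))
        if (x - mean) ** 2 >= t:
            hits += 1

    # Pr[(X - E[X])^2 >= t] <= sigma^2/t = 0.25 here; the empirical tail
    # comes out around 0.06, so Chebyshev holds but is not tight.
    print(hits / trials, "<=", var / t)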
With knowledge of higher moments, we have the following inequality. If X = Σ_i x_i is the sum of n mutually independent random variables where each x_i is uniformly distributed in {−1, +1}, then for any ∆ > 0,

Chernoff bounds

    Pr[X ≥ ∆] ≤ e^{−λ∆} E[e^{λX}]    (B.1.3)

If we choose λ = ∆/n, the RHS becomes e^{−∆^2/2n}, using the fact that (e^λ + e^{−λ})/2 = cosh(λ) ≤ e^{λ^2/2}.
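For completeness, the calculation behind this choice of λ is (every step follows from the definitions above, using independence to factor the moment generating function):

    E[e^{λX}] = Π_{i=1}^{n} E[e^{λx_i}] = ((e^λ + e^{−λ})/2)^n = cosh^n(λ) ≤ e^{nλ^2/2}

so that

    Pr[X ≥ ∆] ≤ e^{−λ∆} · e^{nλ^2/2} = e^{−∆^2/2n}    when λ = ∆/n.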
A more useful form of the above inequality is for a situation where a random variable X is the sum of n independent 0-1 valued Poisson trials with a success probability of p_i in each trial. If Σ_i p_i = np, the following equations give us concentration bounds on the deviation of X from the expected value np. The first equation is more useful for large deviations whereas the other two are useful for small deviations from a large expected value.

    Pr(X ≥ m) ≤ (np/m)^m e^{m−np}    (B.1.4)

    Pr(X ≤ (1 − ε)np) ≤ exp(−ε^2 np/2)    (B.1.5)

    Pr(X ≥ (1 + ε)np) ≤ exp(−ε^2 np/3)    (B.1.6)

for all 0 < ε < 1.
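These bounds are easy to check numerically. The following sketch (illustrative only, not from the notes; the values of n, p, m and ε are arbitrary) compares (B.1.4)-(B.1.6) against exact binomial tail probabilities:

    from math import comb, exp

    # X ~ Binomial(n, p): the sum of n independent 0-1 trials, each with
    # success probability p, so np is the expected value.
    n, p = 100, 0.5
    np_ = n * p

    def upper_tail(m):   # exact Pr[X >= m]
        return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(m, n + 1))

    def lower_tail(m):   # exact Pr[X <= m]
        return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(m + 1))

    m, eps = 70, 0.2
    print(upper_tail(m), "<=", (np_ / m)**m * exp(m - np_))               # (B.1.4)
    print(lower_tail(int((1 - eps) * np_)), "<=", exp(-eps**2 * np_ / 2)) # (B.1.5)
    print(upper_tail(int((1 + eps) * np_)), "<=", exp(-eps**2 * np_ / 3)) # (B.1.6)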
A special case of non-independent random variables

Consider n 0-1 random variables y_1, y_2, . . . , y_n such that Pr[y_i = 1] ≤ p_i and Σ_i p_i = np. The random variables are not known to be independent. In such a case we cannot directly invoke the previous Chernoff bounds, but we will show the following.

Lemma B.3 Let Y = Σ_i y_i and let X = Σ_i x_i, where the x_i are independent Poisson trials with Pr[x_i = 1] = p_i. Then

    Pr[Y ≥ k] ≤ Pr[X ≥ k] for all k, 0 ≤ k ≤ n.

In this case the random variable X is said to stochastically dominate Y.
Therefore we can invoke the Chernoff bounds on X to obtain a bound on Y. We will prove the above property by induction on the number of variables i. For i = 1 this is true (for all k) by definition. Suppose it is true for i < t (for all k) and let i = t. Let X_i = x_1 + x_2 + . . . + x_i and Y_i = y_1 + y_2 + . . . + y_i. Then

    Pr[X_t ≥ k] = Pr[X_{t−1} ≥ k] + Pr[X_{t−1} = k − 1 ∩ x_t = 1]

Since the x_i are independent, we can rewrite the above equation as

    Pr[X_t ≥ k] = (p_t + 1 − p_t) Pr[X_{t−1} ≥ k] + Pr[X_{t−1} = k − 1] · p_t
                = p_t (Pr[X_{t−1} ≥ k] + Pr[X_{t−1} = k − 1]) + (1 − p_t) · Pr[X_{t−1} ≥ k]
                = p_t Pr[X_{t−1} ≥ k − 1] + (1 − p_t) · Pr[X_{t−1} ≥ k]

Similarly

    Pr[Y_t ≥ k] ≤ p_t · Pr[Y_{t−1} ≥ k − 1] + (1 − p_t) · Pr[Y_{t−1} ≥ k]

where the inequality holds because Pr[Y_{t−1} = k − 1 ∩ y_t = 1] = Pr[y_t = 1 | Y_{t−1} = k − 1] · Pr[Y_{t−1} = k − 1] ≤ p_t · Pr[Y_{t−1} = k − 1]; note that the hypothesis Pr[y_i = 1] ≤ p_i is applied here in its conditional form. By comparing the two equations term by term and invoking the induction hypothesis, the result follows.
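A small simulation makes the lemma concrete. The sketch below (an illustration under assumed parameters, not part of the notes) generates dependent y_i whose conditional success probability never exceeds p, matching the conditional form of the hypothesis used in the proof, and checks that the tail of Y stays below that of the independent sum X:

    import random

    # Lemma B.3 illustrated: Y sums dependent 0-1 variables whose
    # conditional success probability never exceeds p; X sums independent
    # Poisson trials with Pr[x_i = 1] = p. Then Pr[Y >= k] <= Pr[X >= k].
    n, p, trials, k = 20, 0.3, 100000, 8

    def sample_Y():
        total, prev = 0, 0
        for _ in range(n):
            # dependence: a success makes the next success half as likely,
            # so the conditional probability is always <= p
            q = p / 2 if prev else p
            prev = 1 if random.random() < q else 0
            total += prev
        return total

    def sample_X():
        return sum(random.random() < p for _ in range(n))

    py = sum(sample_Y() >= k for _ in range(trials)) / trials
    px = sum(sample_X() >= k for _ in range(trials)) / trials
    print(py, "<=", px)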