Distributed Database
Management Systems
Lecture 20
In the Previous Lecture
• Continued with VF
–Computed CA
–Partitioning Algorithm
In this Lecture
• Continue with VF
–Hybrid Fragmentation
–Allocation Problem
–Replication
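Clustered affinity matrix CA (attribute order A1, A3, A2, A4)

      A1   A3   A2   A4
A1    45   45    0    0
A3    45   53    5    3
A2     0    5   80   75
A4     0    3   75   78

Attribute usage of the queries, refj(qi)

      A1   A2   A3   A4
q1     1    0    1    0
q2     0    1    1    0
q3     0    1    0    1
q4     0    0    1    1

Access frequencies at the sites, accj(qi)

      S1   S2   S3
q1    15   20   10
q2     5    0    0
q3    25   25   25
q4     3    0    0

Split points along the diagonal of CA
z1 = 0 - 45^2
z2 = 3311
z3 = 0 - 78^2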
A1= jNo A2= jName
A3= budget A4= loc
V1 = {jNo, budget}
V2 = {jNo, jName, loc}
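The three z values can be re-derived from the usage and access-frequency figures of this example. The sketch below assumes the usual split measure z = CTQ * CBQ - COQ^2, where CTQ, CBQ and COQ are the total accesses of the queries that touch only the top attributes, only the bottom attributes, or both. The names ORDER, USES, ACC and split_quality are illustrative and not part of the lecture.

```python
# Attribute order of the clustered affinity matrix CA.
ORDER = ["A1", "A3", "A2", "A4"]

# Attributes used by each query (from the usage table).
USES = {
    "q1": {"A1", "A3"},
    "q2": {"A2", "A3"},
    "q3": {"A2", "A4"},
    "q4": {"A3", "A4"},
}

# Access frequencies of each query at sites S1, S2, S3.
ACC = {
    "q1": [15, 20, 10],
    "q2": [5, 0, 0],
    "q3": [25, 25, 25],
    "q4": [3, 0, 0],
}


def split_quality(point):
    """Return z = CTQ * CBQ - COQ**2 for a split after ORDER[:point]."""
    top = set(ORDER[:point])
    ctq = cbq = coq = 0
    for query, attrs in USES.items():
        total = sum(ACC[query])
        if attrs <= top:               # query touches only the top fragment
            ctq += total
        elif attrs.isdisjoint(top):    # query touches only the bottom fragment
            cbq += total
        else:                          # query needs both fragments
            coq += total
    return ctq * cbq - coq ** 2


z = {point: split_quality(point) for point in range(1, len(ORDER))}
best = max(z, key=z.get)
print(z, "best split after", ORDER[:best])
```

The best split falls after A1, A3, which gives {A1, A3} and {A2, A4}. Adding the key A1 (jNo) to the second fragment yields V1 and V2 as listed above.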
VF - Two Problems
1- The clusters may form in the
middle of CA rather than in its
upper-left and lower-right corners
2- More than two fragments may be
needed (m-way partitioning)
VF Correctness
• A relation R, defined over
attribute set A and key K,
generates the vertical
partitioning
FR = {R1, R2, …, Rr}
• Completeness: The
following should be true
for A
A = ∪ Ri
• Reconstruction: can be
achieved by
R = ⋈K Ri, ∀Ri ∈ FR
• Disjointness: TIDs are not
considered to be
overlapping since they are
maintained by the system
• The PK is an exception: it is
repeated in every fragment and
is not considered overlapping
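A minimal sketch of how the completeness and disjointness rules could be checked on attribute sets, using V1 and V2 from the earlier example. The function names are hypothetical, and the disjointness check assumes that only key attributes may be shared between fragments.

```python
from itertools import combinations


def is_complete(attributes, fragments):
    """Completeness: the union of the fragments' attributes equals A."""
    return set().union(*fragments) == set(attributes)


def is_disjoint(fragments, key):
    """Disjointness: fragments may share key attributes only."""
    return all((f1 & f2) <= set(key) for f1, f2 in combinations(fragments, 2))


A = {"jNo", "jName", "budget", "loc"}
K = {"jNo"}
V1 = {"jNo", "budget"}
V2 = {"jNo", "jName", "loc"}

print(is_complete(A, [V1, V2]), is_disjoint([V1, V2], K))
```

Reconstruction is then a join of the fragments on K, which is possible because every fragment carries the key.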
Hybrid
Fragmentation
In practice,
applications require
both types of
fragmentation to be
combined
The fragmentations are
nested, i.e., one
follows the other, so
the result forms a
sort of tree
• Disjointness and
completeness have to
be assured at each
step, and
reconstruction can be
obtained by applying
Join and Union in
reverse order
CUST

A/C#    Name    Bal        Branch
AB101   Saeed   4535       MTN
AB202   Laeeq   45632.34   LHR
AB203   Salma   67839.87   LHR
AB109   Shaan   45.32      MTN

Beta = ΠA/C#, Bal (CUST)
Delta1 = σ Branch = “MTN” (ΠA/C#, Name, Branch (CUST))
Delta2 = σ Branch = “LHR” (ΠA/C#, Name, Branch (CUST))

Beta

A/C#    Bal
AB101   4535
AB202   45632.34
AB203   67839.87
AB109   45.32

Delta1

A/C#    Name    Branch
AB101   Saeed   MTN
AB109   Shaan   MTN

Delta2

A/C#    Name    Branch
AB202   Laeeq   LHR
AB203   Salma   LHR
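The CUST example can be replayed in a few lines to see that Union followed by Join restores the original relation. This is only a sketch: rows are plain dictionaries, the helpers project, select and join_on are hypothetical stand-ins for Π, σ and ⋈, and gamma is an illustrative name for the intermediate vertical fragment.

```python
CUST = [
    {"acc_no": "AB101", "name": "Saeed", "bal": 4535.00, "branch": "MTN"},
    {"acc_no": "AB202", "name": "Laeeq", "bal": 45632.34, "branch": "LHR"},
    {"acc_no": "AB203", "name": "Salma", "bal": 67839.87, "branch": "LHR"},
    {"acc_no": "AB109", "name": "Shaan", "bal": 45.32, "branch": "MTN"},
]


def project(rows, attrs):
    """Vertical fragmentation step: keep only the listed attributes."""
    return [{a: row[a] for a in attrs} for row in rows]


def select(rows, predicate):
    """Horizontal fragmentation step: keep only the matching rows."""
    return [row for row in rows if predicate(row)]


def join_on(left, right, key):
    """Join two row lists on a shared key attribute."""
    index = {row[key]: row for row in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


# Step 1: vertical fragmentation of CUST.
beta = project(CUST, ["acc_no", "bal"])
gamma = project(CUST, ["acc_no", "name", "branch"])

# Step 2: horizontal fragmentation of the second vertical fragment.
delta1 = select(gamma, lambda row: row["branch"] == "MTN")
delta2 = select(gamma, lambda row: row["branch"] == "LHR")

# Reconstruction in reverse order: Union first, then Join on the key.
rebuilt = join_on(delta1 + delta2, beta, "acc_no")


def by_key(row):
    return row["acc_no"]


print(sorted(rebuilt, key=by_key) == sorted(CUST, key=by_key))
```

The row order of the rebuilt relation differs from CUST, which is why both sides are sorted on the key before comparing.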
Allocation
• Given F = {F1, F2 , …, Fn}
fragments
• S ={S1 , S2 , …, Sm}
network sites
• Q = {q1, q2 ,…, qq }
applications
• Find the "optimal"
distribution of F to S.
• Optimality
–Minimize the processing
cost and maximize the
system throughput at
each site
It is a complex problem
to solve mathematically.
To keep things simple,
consider the allocation of
a single fragment Fk:
• ti: read-only
queries on Fk from Si;
T = {t1, t2, …, tm}
• ui: update queries
on Fk from Si;
U = {u1, u2, …, um}
Communication Cost
(cij for retrieval, c’ij for update,
between sites Si and Sj)
C(T) = {c1,2, c1,3, …, c1,m,
…, cm-1,m}
C’(U) = {c’1,2, c’1,3, …, c’1,m,
…, c’m-1,m}
Storage Cost
(dj for storing Fk at site Sj)
D = {d1, d2, …, dm}
The allocation problem is
to find the set of sites I,
out of the set of sites S,
where the copies of Fk
will be stored.
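$$x_j = \begin{cases} 1 & \text{if the fragment } F_k \text{ is assigned to site } S_j \\ 0 & \text{otherwise} \end{cases}$$

The specification of the
allocation problem will be

$$\min \left[ \sum_{i=1}^{m} \left( \sum_{j \mid S_j \in I} x_j u_j c'_{ij} + t_j \min_{j \mid S_j \in I} c_{ij} \right) + \sum_{j \mid S_j \in I} x_j d_j \right]$$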
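For a handful of sites, the cost expression can be evaluated by brute force over every non-empty set of candidate sites. The sketch below is one reading of the model, not the lecture's method. It assumes that reads issued at site i are served by the cheapest copy, that every update issued at site i is sent to all copies, and that storage is paid once per copy. The index conventions, the function names and all numbers are assumptions made for illustration.

```python
from itertools import combinations


def allocation_cost(sites, t, u, c, c_upd, d):
    """Total cost of keeping copies of one fragment at the given sites."""
    m = len(t)
    # Reads from site i go to the cheapest site that holds a copy.
    read_cost = sum(t[i] * min(c[i][j] for j in sites) for i in range(m))
    # Updates from site i must reach every copy.
    update_cost = sum(u[i] * c_upd[i][j] for i in range(m) for j in sites)
    # Each copy incurs its site's storage cost.
    storage_cost = sum(d[j] for j in sites)
    return read_cost + update_cost + storage_cost


def best_allocation(t, u, c, c_upd, d):
    """Try every non-empty subset of sites and return the cheapest one."""
    m = len(t)
    candidates = (
        subset
        for size in range(1, m + 1)
        for subset in combinations(range(m), size)
    )
    return min(candidates, key=lambda s: allocation_cost(s, t, u, c, c_upd, d))


# Hypothetical figures for three sites (indices 0, 1, 2).
t = [20, 0, 40]                          # read traffic per site
u = [5, 5, 5]                            # update traffic per site
c = [[0, 2, 4], [2, 0, 3], [4, 3, 0]]    # retrieval cost between sites
c_upd = c                                # update cost, assumed equal here
d = [10, 10, 10]                         # storage cost per site

best = best_allocation(t, u, c, c_upd, d)
print(best, allocation_cost(best, t, u, c, c_upd, d))
```

With these figures the cheapest choice replicates the fragment at the two sites that issue reads. The search visits 2^m - 1 subsets, so it is only practical for small m.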
• That concludes our
discussion on
Fragmentation
• Let's summarize it
• Fragmentation is
splitting a table into
smaller tables
• Alternatives
– Horizontal
–Vertical
–Hybrid
Horizontal
Fragmentation