Distributed Database Management Systems
Lecture 19
In the Previous Lecture
• Continued with vertical fragmentation (VF)
–Global Affinity Measure
–Bond Energy Algorithm
In this Lecture
• Continue with VF
–Example of VF
–Partitioning in CA
–Allocation Problem
AA
      A1   A2   A3   A4
A1    45    0   45    0
A2     0   80    5   75
A3    45    5   53    3
A4     0   75    3   78
CA
      A1   A2
A1    45    0
A2     0   80
A3    45    5
A4     0   75
Compute the contribution of placing the
3rd attribute (A3) at different positions in CA
• Ordering (0-3-1) – means A3 to the left of A1
• Ordering (1-3-2) – means A3 between A1 and A2
• Ordering (2-3-4) – means A3 to the right of A2
Ordering (0-3-1)
cont(A0, A3, A1) = 2 bond(A0, A3) + 2 bond(A3, A1) – 2 bond(A0, A1)
bond(A0, A3) = 0 and bond(A0, A1) = 0,
because A0 refers to the empty position to the left of the leftmost attribute
bond(A3, A1) = ∑_{z=1}^{4} aff(Az, A3) aff(Az, A1)
= aff(A1, A3) aff(A1, A1) + aff(A2, A3) aff(A2, A1) + aff(A3, A3) aff(A3, A1) + aff(A4, A3) aff(A4, A1)
= 45 * 45 + 5 * 0 + 53 * 45 + 3 * 0
= 4410
• Thus cont(A0, A3, A1)
= 2 bond(A0, A3) + 2 bond(A3, A1) – 2 bond(A0, A1)
= 2 * 0 + 2 * 4410 – 2 * 0 = 8820
Ordering (1-3-2)
cont(A1, A3, A2) = 2 bond(A1, A3) + 2 bond(A3, A2) – 2 bond(A1, A2)
bond(A1, A3) = bond(A3, A1) = 4410
bond(A3, A2) = 45 * 0 + 5 * 80 + 53 * 5 + 3 * 75 = 0 + 400 + 265 + 225 = 890
bond(A1, A2) = 45 * 0 + 0 * 80 + 45 * 5 + 0 * 75 = 0 + 0 + 225 + 0 = 225
cont(A1, A3, A2) = 2 * 4410 + 2 * 890 – 2 * 225
= 8820 + 1780 – 450 = 10150
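Ordering (2-3-4)
cont(A2, A3, A4) = 2 bond(A2, A3) + 2 bond(A3, A4) – 2 bond(A2, A4)
bond(A2, A3) = 890
bond(A3, A4) = 0
bond(A2, A4) = 0
(position 4 is the empty position to the right of A2, because A4 has not yet been placed in CA)
cont(A2, A3, A4) = 2 * 890 + 2 * 0 – 2 * 0 = 1780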
Ordering (0-3-1) = 8820
Ordering (1-3-2) = 10150
Ordering (2-3-4) = 1780
• Ordering (1-3-2) gives the largest contribution, so A3 is placed between A1 and A2
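The three contributions can be checked with a short script. This is a minimal sketch rather than the lecture's own code: the names AA, bond, cont and placed are chosen here for illustration, attributes are numbered 1 to 4, and any index that is not yet in CA (0 on the left, 4 on the right) is treated as an empty position whose bond is 0.

```python
# Attribute affinity matrix AA from the slides (rows and columns A1..A4).
AA = [
    [45, 0, 45, 0],
    [0, 80, 5, 75],
    [45, 5, 53, 3],
    [0, 75, 3, 78],
]

def bond(x, y, placed):
    """Sum over z of aff(Az, Ax) * aff(Az, Ay); 0 if x or y is an empty position."""
    if x not in placed or y not in placed:
        return 0
    return sum(AA[z][x - 1] * AA[z][y - 1] for z in range(len(AA)))

def cont(left, mid, right, placed):
    """Contribution of placing attribute `mid` between `left` and `right`."""
    return (2 * bond(left, mid, placed)
            + 2 * bond(mid, right, placed)
            - 2 * bond(left, right, placed))

# A1 and A2 are already in CA; A3 is the attribute being placed.
placed = {1, 2, 3}
orderings = [(0, 3, 1), (1, 3, 2), (2, 3, 4)]
contributions = {o: cont(*o, placed) for o in orderings}
best = max(contributions, key=contributions.get)

print(contributions)  # {(0, 3, 1): 8820, (1, 3, 2): 10150, (2, 3, 4): 1780}
print("best ordering:", best)  # best ordering: (1, 3, 2)
```

The largest value belongs to ordering (1-3-2), which is why A3 ends up between A1 and A2 in CA.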
AA
      A1   A2   A3   A4
A1    45    0   45    0
A2     0   80    5   75
A3    45    5   53    3
A4     0   75    3   78
CA
      A1   A3   A2
A1    45   45    0
A2     0    5   80
A3    45   53    5
A4     0    3   75
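Compute the contribution of placing the
4th attribute (A4) at different positions in CA
• We need to work out
–Ordering (0-4-1): A4 to the left of A1
–Ordering (1-4-3): A4 between A1 and A3
–Ordering (3-4-2): A4 between A3 and A2
–Ordering (2-4-5): A4 to the right of A2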
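The slides do not show the arithmetic for these four orderings, so the following sketch works them out from the AA matrix. It is an illustration under stated assumptions: the helper names are hypothetical, None stands for the empty position at either end of CA, and the numeric results are computed by this script rather than quoted from the lecture.

```python
# Attribute affinity matrix AA from the slides (rows and columns A1..A4).
AA = [
    [45, 0, 45, 0],
    [0, 80, 5, 75],
    [45, 5, 53, 3],
    [0, 75, 3, 78],
]

def bond(x, y):
    """Sum over z of aff(Az, Ax) * aff(Az, Ay); 0 when either side is empty (None)."""
    if x is None or y is None:
        return 0
    return sum(row[x - 1] * row[y - 1] for row in AA)

def cont(left, mid, right):
    """Contribution of placing attribute `mid` between `left` and `right`."""
    return 2 * bond(left, mid) + 2 * bond(mid, right) - 2 * bond(left, right)

order = [1, 3, 2]   # current column order of CA: A1, A3, A2
new = 4             # attribute being placed

# Try every gap, including the two ends, and record the contribution.
padded = [None] + order + [None]
scores = [cont(padded[pos], new, padded[pos + 1]) for pos in range(len(order) + 1)]

best_pos = scores.index(max(scores))
new_order = order[:best_pos] + [new] + order[best_pos:]

print(scores)     # [270, -7014, 23486, 23730]
print(new_order)  # [1, 3, 2, 4]
```

The scores are listed in the order (0-4-1), (1-4-3), (3-4-2), (2-4-5). The last one is the largest, so A4 goes to the right of A2, giving the column order A1, A3, A2, A4.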
      A1   A3   A2   A4
A1    45   45    0    0
A2     0    5   80   75
A3    45   53    5    3
A4     0    3   75   78
• The column order has changed
• The rows are still in the original order
• We reorder the rows to match the columns
• The result of BEA is the following CA
• Note the clusters
      A1   A3   A2   A4
A1    45   45    0    0
A3    45   53    5    3
A2     0    5   80   75
A4     0    3   75   78
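Reordering the rows to match the columns is a symmetric permutation of AA. The sketch below is illustrative only (the variable names are chosen here): it first applies the column order alone and then applies the same order to the rows to obtain the final CA.

```python
# Attribute affinity matrix AA from the slides (rows and columns A1..A4).
AA = [
    [45, 0, 45, 0],
    [0, 80, 5, 75],
    [45, 5, 53, 3],
    [0, 75, 3, 78],
]

order = [1, 3, 2, 4]  # column order produced by BEA: A1, A3, A2, A4

# Step 1: permute the columns only (rows stay A1, A2, A3, A4).
cols_only = [[row[j - 1] for j in order] for row in AA]

# Step 2: permute the rows with the same order to get the final CA.
CA = [[AA[i - 1][j - 1] for j in order] for i in order]

for label, row in zip(order, CA):
    print(f"A{label}", row)
```

In the result, the large values for A1 and A3 sit in the top-left block and those for A2 and A4 in the bottom-right block, which are the clusters the slide points to.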
Clustering Summary
• We need an AUM that reflects the query-attribute relationship
• AUM and FM are used to make AA
• Global Affinity Measure is used to establish the clusters of attributes
• Attributes with stronger affinities are grouped together in CA, as are those with weaker affinities
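As a reminder of how AUM and FM combine into AA, here is a small sketch. The usage matrix and the per-query access totals are not given in this section; they are assumed values, chosen so that the result reproduces the AA matrix used in this lecture. The affinity of two attributes is taken as the sum of the access totals of all queries that use both.

```python
# ASSUMED attribute usage matrix: use[q][j] == 1 if query q accesses attribute A(j+1).
use = [
    [1, 0, 1, 0],  # q1 uses A1, A3
    [0, 1, 1, 0],  # q2 uses A2, A3
    [0, 1, 0, 1],  # q3 uses A2, A4
    [0, 0, 1, 1],  # q4 uses A3, A4
]

# ASSUMED total number of accesses of each query, summed over all sites.
acc = [45, 5, 75, 3]

n = len(use[0])

# aff(Ai, Aj) = sum of acc[q] over every query q that uses both Ai and Aj.
AA = [
    [sum(acc[q] for q in range(len(use)) if use[q][i] and use[q][j])
     for j in range(n)]
    for i in range(n)
]

for row in AA:
    print(row)
```

With these assumed inputs the printed rows are [45, 0, 45, 0], [0, 80, 5, 75], [45, 5, 53, 3] and [0, 75, 3, 78], which is the AA matrix of the example.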
Partitioning
The objective is to find sets of attributes that are generally accessed together
The split yields two sets of attributes: TA (top) and BA (bottom)
Define
• TQ = set of applications that access only TA
• BQ = set of applications that access only BA
• OQ = set of applications that access both TA and BA
• CTQ = number of accesses to attributes by applications that access only TA
• CBQ = number of accesses to attributes by applications that access only BA
• COQ = number of accesses to attributes by applications that access both TA and BA
• Then find the point z along the diagonal that maximizes
z = CTQ * CBQ – COQ^2
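The search for the best split point can be sketched as follows. The query-to-attribute usage and the access totals are assumptions made for this illustration (they are consistent with the AA matrix of the example but are not stated in this section), and split_score is a hypothetical helper name. Each candidate split cuts the CA order A1, A3, A2, A4 into a top set TA and a bottom set BA.

```python
# ASSUMED usage: which attributes (by number) each query accesses.
use = {"q1": {1, 3}, "q2": {2, 3}, "q3": {2, 4}, "q4": {3, 4}}

# ASSUMED total number of accesses per query.
acc = {"q1": 45, "q2": 5, "q3": 75, "q4": 3}

order = [1, 3, 2, 4]  # attribute order in CA after BEA

def split_score(k):
    """z = CTQ * CBQ - COQ^2 when TA is the first k attributes and BA is the rest."""
    ta, ba = set(order[:k]), set(order[k:])
    ctq = sum(acc[q] for q, attrs in use.items() if attrs <= ta)  # only TA
    cbq = sum(acc[q] for q, attrs in use.items() if attrs <= ba)  # only BA
    coq = sum(acc[q] for q, attrs in use.items()
              if attrs & ta and attrs & ba)                       # both TA and BA
    return ctq * cbq - coq ** 2

scores = {k: split_score(k) for k in range(1, len(order))}
best_k = max(scores, key=scores.get)

print(scores)  # {1: -2025, 2: 3311, 3: -6084}
print("TA =", order[:best_k], "BA =", order[best_k:])  # TA = [1, 3] BA = [2, 4]
```

Under these assumptions the best split falls after the second attribute, giving the fragments {A1, A3} and {A2, A4}, which agree with the clusters visible in CA.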