
Distributed Database Management Systems: Lecture 20




In the Previous Lecture

• Continued with vertical fragmentation (VF)
– Computed the clustered affinity matrix (CA)
– Partitioning Algorithm


In this Lecture
• Continue with VF
– Hybrid Fragmentation
– Allocation Problem
– Replication


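Clustered affinity matrix (CA)

      A1   A3   A2   A4
A1    45   45    0    0
A3    45   53    5    3
A2     0    5   80   75
A4     0    3   75   78

Attribute usage, refj(qi)

      A1   A2   A3   A4
q1     1    0    1    0
q2     0    1    1    0
q3     0    1    0    1
q4     0    0    1    1

Access frequencies, accj(qi)

      S1   S2   S3
q1    15   20   10
q2     5    0    0
q3    25   25   25
q4     3    0    0

Split points along the diagonal of CA

z1 = 0 - 45² (split after A1)
z2 = 3311 (split after A3)
z3 = 0 - 78² (split after A2)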


A1= jNo A2= jName
A3= budget A4= loc
V1 = {jNo, budget}
V2 = {jNo, jName, loc}
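
The three z values can be checked with a short sketch. It assumes the split-quality measure z = CTQ × CBQ - COQ² from the partitioning algorithm of the previous lecture, which the values above are consistent with; the function and variable names are illustrative only.

```python
# Attributes referenced by each query (the use matrix, refj(qi)).
USE = {
    "q1": {"A1", "A3"},
    "q2": {"A2", "A3"},
    "q3": {"A2", "A4"},
    "q4": {"A3", "A4"},
}
# Access frequencies of each query at sites S1, S2, S3 (accj(qi)).
ACC = {"q1": (15, 20, 10), "q2": (5, 0, 0), "q3": (25, 25, 25), "q4": (3, 0, 0)}
# Attribute order of the clustered affinity matrix CA.
ORDER = ["A1", "A3", "A2", "A4"]


def split_quality(order, use, acc):
    """Return z for every split point k (top = first k attributes of CA)."""
    scores = {}
    for k in range(1, len(order)):
        top, bottom = set(order[:k]), set(order[k:])
        ctq = cbq = coq = 0
        for query, attrs in use.items():
            total = sum(acc[query])      # accesses summed over all sites
            if attrs <= top:             # query touches only the top fragment
                ctq += total
            elif attrs <= bottom:        # query touches only the bottom fragment
                cbq += total
            else:                        # query touches both fragments
                coq += total
        scores[k] = ctq * cbq - coq ** 2
    return scores


Z = split_quality(ORDER, USE, ACC)
BEST = max(Z, key=Z.get)                 # split point with the largest z
print(Z, BEST)
```

The largest z is at the split after A3, which gives {A1, A3} and {A2, A4}; adding the key A1 (jNo) to the second fragment yields V1 and V2 as listed above.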


VF- Two Problems
1- Clusters may form in the middle of CA rather than at its corners
2- m-way partitioning, i.e., splitting into more than two fragments


VF Correctness



• A relation R, defined over attribute set A and key K, generates the vertical partitioning FR = {R1, R2, …, Rr}
• Completeness: the following should be true for A, where A(Ri) is the attribute set of fragment Ri:
A = ∪ A(Ri)
• Reconstruction: can be achieved by
R = ⋈K Ri, ∀Ri ∈ FR
• Disjointness: TIDs are not considered to be overlapping, since they are maintained by the system
• The primary key is the exception: it is repeated in every fragment and is not counted as overlap
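
As a minimal sketch, the completeness and disjointness rules can be checked on attribute sets alone. The fragments V1 and V2 are the ones from the example earlier in this lecture; the helper names `is_complete` and `is_disjoint` are hypothetical.

```python
def is_complete(all_attrs, fragments):
    """Completeness: the union of the fragments' attributes equals A."""
    return set().union(*fragments) == set(all_attrs)


def is_disjoint(fragments, key):
    """Disjointness: no non-key attribute appears in more than one fragment."""
    seen = set()
    for frag in fragments:
        non_key = set(frag) - set(key)   # the key is allowed to repeat
        if seen & non_key:
            return False
        seen |= non_key
    return True


A = {"jNo", "jName", "budget", "loc"}
K = {"jNo"}
V1 = {"jNo", "budget"}
V2 = {"jNo", "jName", "loc"}

COMPLETE = is_complete(A, [V1, V2])
DISJOINT = is_disjoint([V1, V2], K)
print(COMPLETE, DISJOINT)
```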


Hybrid
Fragmentation


In practice, applications require both types of fragmentation to be combined.


Fragmentations are therefore nested, one following the other, so that the result forms a tree.


• Disjointness and completeness have to be assured at each step, and reconstruction can be obtained by applying Join and Union in reverse order


CUST

A/C#    Name    Bal        Branch
AB101   Saeed   4535       MTN
AB202   Laeeq   45632.34   LHR
AB203   Salma   67839.87   LHR
AB109   Shaan   45.32      MTN

Beta = Π A/C#, Bal (CUST)
Delta1 = σ Branch = "MTN" (Π A/C#, Name, Branch (CUST))
Delta2 = σ Branch = "LHR" (Π A/C#, Name, Branch (CUST))

Beta

A/C#    Bal
AB101   4535
AB202   45632.34
AB203   67839.87
AB109   45.32

Delta1

A/C#    Name    Branch
AB101   Saeed   MTN
AB109   Shaan   MTN

Delta2

A/C#    Name    Branch
AB202   Laeeq   LHR
AB203   Salma   LHR
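
The CUST example can be replayed in a few lines to see that Union followed by Join, the reverse of the order in which the fragments were produced, gives back the original relation. This is only a sketch using plain Python lists of dictionaries; `project`, `select`, and `join` are illustrative helpers, not a database API.

```python
CUST = [
    {"A/C#": "AB101", "Name": "Saeed", "Bal": 4535, "Branch": "MTN"},
    {"A/C#": "AB202", "Name": "Laeeq", "Bal": 45632.34, "Branch": "LHR"},
    {"A/C#": "AB203", "Name": "Salma", "Bal": 67839.87, "Branch": "LHR"},
    {"A/C#": "AB109", "Name": "Shaan", "Bal": 45.32, "Branch": "MTN"},
]


def project(rows, attrs):
    """Vertical fragmentation step: keep only the listed attributes."""
    return [{a: row[a] for a in attrs} for row in rows]


def select(rows, attr, value):
    """Horizontal fragmentation step: keep rows where attr equals value."""
    return [row for row in rows if row[attr] == value]


def join(left, right, key):
    """Join two fragments on their shared key attribute."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Fragmentation: vertical split first, then a horizontal split of one part.
beta = project(CUST, ["A/C#", "Bal"])
names = project(CUST, ["A/C#", "Name", "Branch"])
delta1 = select(names, "Branch", "MTN")
delta2 = select(names, "Branch", "LHR")

# Reconstruction in reverse order: Union of Delta1 and Delta2, then Join with Beta.
rebuilt = join(delta1 + delta2, beta, "A/C#")


def by_account(row):
    return row["A/C#"]


print(sorted(rebuilt, key=by_account) == sorted(CUST, key=by_account))
```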


Allocation


• Given F = {F1, F2, …, Fn} fragments
• S = {S1, S2, …, Sm} network sites
• Q = {q1, q2, …, qq} applications
• Find the "optimal" distribution of F to S.


• Optimality
– Minimize the processing cost and maximize the system throughput at each site


Solving the general problem mathematically is complex. To keep things simple, consider the allocation of a single fragment Fk:


• T = {t1, t2, …, tm}: read-only query traffic on Fk, where ti is the traffic from site Si
• U = {u1, u2, …, um}: update query traffic on Fk, where ui is the traffic from site Si



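Communication Cost
C(T) = {c1,2, c1,3, …, c1,m, …, cm-1,m}, where ci,j is the unit cost of a retrieval between sites Si and Sj
C'(T) = {c'1,2, c'1,3, …, c'1,m, …, c'm-1,m}, where c'i,j is the unit cost of an update between sites Si and Sj
Storage Cost
D = {d1, d2, …, dm}, where dj is the cost of storing Fk at site Sj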


The allocation problem is to find the set of sites I, out of the set of sites S, where copies of Fk will be stored.


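The decision variable is

\[
x_j =
\begin{cases}
1 & \text{if the fragment } F_k \text{ is assigned to site } S_j \\
0 & \text{otherwise}
\end{cases}
\]

The specification of the allocation problem will be

\[
\min \left[ \sum_{i=1}^{m} \left( \sum_{j \mid S_j \in I} x_j u_j c'_{ij} + t_j \min_{j \mid S_j \in I} c_{ij} \right) + \sum_{j \mid S_j \in I} x_j d_j \right]
\]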
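
For a handful of sites, the cost expression can be evaluated by brute force over every candidate set I. The sketch below is an illustration under stated assumptions: it reads the update and read traffic as indexed by the originating site i (an interpretation on our part, not something the slide states), and all numbers are made-up sample data, not values from the lecture.

```python
from itertools import combinations


def allocation_cost(stored_at, reads, updates, c, c_upd, storage):
    """Cost of keeping copies of one fragment at the sites in `stored_at`."""
    total = sum(storage[j] for j in stored_at)              # storage cost
    for i in range(len(reads)):
        # Every update from site i is sent to every copy.
        total += sum(updates[i] * c_upd[i][j] for j in stored_at)
        # Every read from site i goes to the cheapest copy.
        total += reads[i] * min(c[i][j] for j in stored_at)
    return total


def best_allocation(reads, updates, c, c_upd, storage):
    """Try every non-empty subset of sites and return the cheapest one."""
    sites = range(len(reads))
    candidates = (
        set(subset)
        for size in range(1, len(reads) + 1)
        for subset in combinations(sites, size)
    )
    return min(
        candidates,
        key=lambda s: allocation_cost(s, reads, updates, c, c_upd, storage),
    )


# Hypothetical data for three sites (indices 0, 1, 2).
READS = [10, 0, 5]                        # t_i: read traffic from each site
UPDATES = [1, 1, 1]                       # u_i: update traffic from each site
C = [[0, 2, 3], [2, 0, 1], [3, 1, 0]]     # c_ij: unit retrieval cost
C_UPD = [[0, 2, 3], [2, 0, 1], [3, 1, 0]] # c'_ij: unit update cost
STORAGE = [4, 4, 4]                       # d_j: storage cost per site

BEST = best_allocation(READS, UPDATES, C, C_UPD, STORAGE)
BEST_COST = allocation_cost(BEST, READS, UPDATES, C, C_UPD, STORAGE)
print(BEST, BEST_COST)
```

Enumerating subsets grows as 2^m, so this only serves to make the cost terms concrete; it is not a practical allocation method for many sites.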


• That concludes our discussion on Fragmentation
• Let's summarize it


• Fragmentation is splitting a table into smaller tables
• Alternatives
– Horizontal
– Vertical
– Hybrid


Horizontal
Fragmentation

