Distributed Database
Management Systems
Lecture 31
In the previous lecture
• Basic Concepts of Query
Optimization
• QP in centralized and
Distributed DBs.
In this Lecture
• Query Decomposition
• Its Phases
• ……..
Steps in Query Processing
Input: SQL Query on Distributed Relations
1. QUERY DECOMPOSITION (uses GLOBAL SCHEMA)
   Output: Algebraic Query on Distributed Relations
2. DATA LOCALIZATION (uses FRAGMENT SCHEMA)
   Output: Fragment Query
3. GLOBAL OPTIMIZATION (uses STATS ON FRAGMENTS)
   Output: Optimized Fragment Query with Communication Operations
4. LOCAL OPTIMIZATION (uses LOCAL SCHEMA)
   Output: Optimized Local Query
Query Decomposition
• Transforms an SQL (relational calculus) query into a relational algebra query
• Both the input and the output queries are on global relations in the DDBS
Steps in QD
1. Normalization
2. Analysis
3. Simplification
4. Rewriting
1- Normalization
• Input Query can be
complex
• Lexical and syntactic
Analysis (like compilers)
• The WHERE clause (the qualification) is transformed into a normal form
• Two possible forms:
• Conjunctive NF: (p11 v p12 v … v p1n) ^ … ^ (pm1 v pm2 v … v pmn)
• Disjunctive NF: (p11 ^ p12 ^ … ^ p1n) v … v (pm1 ^ pm2 ^ … ^ pmn)
• Transformation is based
on equivalence rules
Equivalence Rules
• p1 ^ p2 ⇔ p2 ^ p1
• p1 v p2 ⇔ p2 v p1
• (p1 ^ p2) ^ p3 ⇔ p1 ^ (p2 ^ p3)
• (p1 v p2) v p3 ⇔ p1 v (p2 v p3)
• p1 ^ (p2 v p3) ⇔ (p1 ^ p2) v (p1 ^ p3)
• p1 v (p2 ^ p3) ⇔ (p1 v p2) ^ (p1 v p3)
• In relational algebra, v maps to union (∪) and ^ maps to join (⋈) or selection (σ)
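The distribution rule p1 ^ (p2 v p3) ⇔ (p1 ^ p2) v (p1 ^ p3) is what drives the conversion to disjunctive NF. The following is a minimal sketch of that conversion, assuming a hypothetical representation in which a predicate is either an atom (a string) or a tuple ("and" | "or", left, right). Negation is left out because the rules listed above do not cover it. The names to_dnf and show are illustrative only.

```python
def to_dnf(expr):
    """Return expr in disjunctive NF as a list of conjunctions (lists of atoms)."""
    if isinstance(expr, str):          # a single atomic predicate
        return [[expr]]
    op, left, right = expr
    l, r = to_dnf(left), to_dnf(right)
    if op == "or":
        # A disjunction of two DNFs is just the union of their conjunctions.
        return l + r
    if op == "and":
        # Distribute ^ over v: every conjunction on the left is combined
        # with every conjunction on the right.
        return [lc + rc for lc in l for rc in r]
    raise ValueError(f"unknown operator: {op}")


def show(dnf):
    """Render a DNF in the notation used on the slides."""
    return " v ".join("(" + " ^ ".join(conj) + ")" for conj in dnf)


print(show(to_dnf(("and", "p1", ("or", "p2", "p3")))))
# prints: (p1 ^ p2) v (p1 ^ p3)
```

A list of lists is used for the result so that the final form is flat by construction: the outer list is the disjunction and each inner list is one conjunction.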
Example
• SELECT eName
  FROM EMP, ASG
  WHERE EMP.eNo = ASG.eNo
  AND ASG.pNo = 'P1'
  AND (dur = 12 OR dur = 24)
• Qualification in conjunctive NF
EMP.eNo = ASG.eNo ^
ASG.pNo = 'P1' ^
(dur = 12 v dur = 24)
• Qualification in disjunctive NF
(EMP.eNo = ASG.eNo ^ ASG.pNo = 'P1' ^ dur = 12)
v
(EMP.eNo = ASG.eNo ^ ASG.pNo = 'P1' ^ dur = 24)
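One way to convince yourself that the two normal forms of this qualification are equivalent is to compare them over every truth assignment. The sketch below treats the four atomic predicates as independent Boolean variables. The parameter names join, p1_proj, d12 and d24 are assumptions introduced here for illustration.

```python
from itertools import product


def conjunctive_nf(join, p1_proj, d12, d24):
    # EMP.eNo = ASG.eNo ^ ASG.pNo = 'P1' ^ (dur = 12 v dur = 24)
    return join and p1_proj and (d12 or d24)


def disjunctive_nf(join, p1_proj, d12, d24):
    # (join ^ pNo = 'P1' ^ dur = 12) v (join ^ pNo = 'P1' ^ dur = 24)
    return (join and p1_proj and d12) or (join and p1_proj and d24)


# Enumerate all 2^4 truth assignments and compare the two forms.
equivalent = all(
    conjunctive_nf(*values) == disjunctive_nf(*values)
    for values in product([False, True], repeat=4)
)
print(equivalent)  # prints: True
```

The check is exhaustive only at the propositional level. It ignores the fact that dur = 12 and dur = 24 cannot both hold for the same tuple, which does not affect the equivalence.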
2- Analysis
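• Rejects queries that cannot be processed or are not meaningful
• Type incorrect
–A relation or attribute is not defined in the global schema
–An operation is applied to an attribute of the wrong type
• Semantically incorrect
–Some components do not contribute to the generation of the result
–Detection is possible only in certain cases: queries that do not contain disjunction or negation
–Detection uses the query graph and the join graph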
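The type-correctness check can be pictured as a lookup against the global schema. The sketch below is only an illustration. The GLOBAL_SCHEMA dictionary, its attribute types, and the (relation, attribute, constant) predicate format are assumptions, not something the lecture defines.

```python
# Hypothetical global schema: relation -> {attribute: Python type}.
# The attribute types are assumed for illustration only.
GLOBAL_SCHEMA = {
    "EMP": {"eNo": str, "eName": str, "title": str},
    "ASG": {"eNo": str, "pNo": str, "resp": str, "dur": int},
}


def type_errors(relations, predicates):
    """Collect type errors for a query.

    relations:  relation names from the FROM clause
    predicates: (relation, attribute, constant) comparisons from WHERE
    """
    errors = []
    for rel in relations:
        if rel not in GLOBAL_SCHEMA:
            errors.append(f"unknown relation {rel}")
    for rel, attr, const in predicates:
        attrs = GLOBAL_SCHEMA.get(rel, {})
        if attr not in attrs:
            errors.append(f"unknown attribute {rel}.{attr}")
        elif not isinstance(const, attrs[attr]):
            # e.g. comparing a string attribute with a number
            errors.append(f"type mismatch on {rel}.{attr}")
    return errors


print(type_errors(
    ["EMP"],
    [("EMP", "eName", "Saleem"), ("EMP", "salary", 100), ("EMP", "title", 5)],
))
# prints: ['unknown attribute EMP.salary', 'type mismatch on EMP.title']
```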
Query Graph
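• Nodes represent the result relation or operand relations
• Edges between operand nodes represent joins; an edge into the result node represents a projection
• A select or self-join predicate is attached as a label to its operand node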
Example
• SELECT eName, resp
  FROM EMP, ASG, PROJ
  WHERE EMP.eNo = ASG.eNo
  AND ASG.pNo = PROJ.pNo
  AND pName = 'CAD/CAM'
  AND dur >= 36
  AND title = 'Programmer'
Query Graph
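Nodes:
• EMP (select label: title = 'Programmer')
• ASG (select label: dur ≥ 36)
• PROJ (select label: pName = 'CAD/CAM')
• RESULT
Edges:
• EMP to ASG: EMP.eNo = ASG.eNo (join)
• ASG to PROJ: ASG.pNo = PROJ.pNo (join)
• EMP to RESULT: eName (projection)
• ASG to RESULT: resp (projection)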
Join Graph
Nodes: EMP, ASG, PROJ
Edges:
• EMP to ASG: EMP.eNo = ASG.eNo
• ASG to PROJ: ASG.pNo = PROJ.pNo
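The join graph is the part of the query graph that keeps only the join edges between operand relations. A minimal sketch, assuming a hypothetical edge-list representation of the query graph for the example query, with each edge stored as (node_a, node_b, label):

```python
# Query graph of the example, as (node_a, node_b, label) triples.
# This edge-list representation is an assumption made for illustration.
query_graph_edges = [
    ("EMP", "ASG", "EMP.eNo = ASG.eNo"),    # join
    ("ASG", "PROJ", "ASG.pNo = PROJ.pNo"),  # join
    ("EMP", "RESULT", "eName"),             # projection
    ("ASG", "RESULT", "resp"),              # projection
]


def join_graph(edges, result_node="RESULT"):
    """Keep only edges between operand relations (drop projection edges)."""
    return [edge for edge in edges if result_node not in edge[:2]]


for a, b, label in join_graph(query_graph_edges):
    print(f"{a} -- {b}: {label}")
# prints:
# EMP -- ASG: EMP.eNo = ASG.eNo
# ASG -- PROJ: ASG.pNo = PROJ.pNo
```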
Semantically Incorrect
• SELECT eName, resp
  FROM EMP, ASG, PROJ
  WHERE EMP.eNo = ASG.eNo
  AND pName = 'CAD/CAM'
  AND dur >= 36
  AND title = 'Programmer'
Query Graph
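Nodes:
• EMP (select label: title = 'Programmer')
• ASG (select label: dur ≥ 36)
• PROJ (select label: pName = 'CAD/CAM')
• RESULT
Edges:
• EMP to ASG: EMP.eNo = ASG.eNo (join)
• EMP to RESULT: eName (projection)
• ASG to RESULT: resp (projection)
PROJ has no edge to any other node, so the query graph is disconnected and the query is semantically incorrect.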
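Detecting this kind of semantic error amounts to testing whether the query graph is connected. Below is a small sketch using a depth-first traversal. The function name is_connected and the edge lists are illustrative assumptions that mirror the two example queries.

```python
from collections import defaultdict


def is_connected(nodes, edges):
    """Return True if every node is reachable from the first node."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    nodes = list(nodes)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        # Visit all not-yet-seen neighbours of the current node.
        for neighbour in adjacency[stack.pop()] - seen:
            seen.add(neighbour)
            stack.append(neighbour)
    return seen == set(nodes)


nodes = ["EMP", "ASG", "PROJ", "RESULT"]
correct_edges = [("EMP", "ASG"), ("ASG", "PROJ"),
                 ("EMP", "RESULT"), ("ASG", "RESULT")]
# Same query without the join predicate ASG.pNo = PROJ.pNo.
incorrect_edges = [("EMP", "ASG"), ("EMP", "RESULT"), ("ASG", "RESULT")]

print(is_connected(nodes, correct_edges))    # prints: True
print(is_connected(nodes, incorrect_edges))  # prints: False
```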
3- Simplification (Elimination of Redundancy)
• A query on a view is replaced by the view definition, which adds predicates to the qualification
• This replacement, or a user mistake, may introduce redundant predicates
• The qualification is simplified using the idempotency rules
• p ^ p ⇔ p
• p v p ⇔ p
• p ^ true ⇔ p
• p v false ⇔ p
• p ^ false ⇔ false
• p v true ⇔ true
• p ^ ¬p ⇔ false
• p v ¬p ⇔ true
• p1 ^ (p1 v p2) ⇔ p1
• p1 v (p1 ^ p2) ⇔ p1
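The idempotency rules can be applied mechanically, bottom-up, to a predicate expression. The following is a sketch under assumptions. Expressions are atoms (strings), the constants True/False, or tuples ("not", x), ("and", a, b), ("or", a, b), and the function names are illustrative. It applies the rules only where the operands match syntactically, so it does not perform the distribution step that some qualifications need first.

```python
def neg(p):
    """Syntactic negation: ¬(¬x) collapses to x."""
    return p[1] if isinstance(p, tuple) and p[0] == "not" else ("not", p)


def absorbs(x, y, inner_op):
    """True if y has the form (x inner_op z) or (z inner_op x)."""
    return isinstance(y, tuple) and y[0] == inner_op and x in y[1:]


def simplify(e):
    if not isinstance(e, tuple):              # atom or constant
        return e
    if e[0] == "not":
        x = simplify(e[1])
        return (not x) if isinstance(x, bool) else neg(x)
    op, a, b = e[0], simplify(e[1]), simplify(e[2])
    if op == "and":
        if a is False or b is False or a == neg(b):   # p ^ false, p ^ ¬p
            return False
        if a is True:                                 # true ^ p
            return b
        if b is True or a == b:                       # p ^ true, p ^ p
            return a
        if absorbs(a, b, "or"):                       # p1 ^ (p1 v p2)
            return a
        if absorbs(b, a, "or"):
            return b
    else:  # op == "or"
        if a is True or b is True or a == neg(b):     # p v true, p v ¬p
            return True
        if a is False:                                # false v p
            return b
        if b is False or a == b:                      # p v false, p v p
            return a
        if absorbs(a, b, "and"):                      # p1 v (p1 ^ p2)
            return a
        if absorbs(b, a, "and"):
            return b
    return (op, a, b)


print(simplify(("and", "p", ("not", "p"))))        # prints: False
print(simplify(("or", "p1", ("and", "p1", "p2")))) # prints: p1
```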
Example
• SELECT title
  FROM EMP
  WHERE (title = 'Prog'
  AND ((NOT (title = 'Prog')) OR title = 'Elect. Engr')
  AND (NOT (title = 'Elect. Engr')))
  OR eName = 'Saleem'
p1: title = ‘Prog’
p2: title = ‘Elect. Engr’
p3: eName = ‘Saleem’
(p1 ^ (¬p1 v p2) ^ ¬p2) v p3
⇔ (p1 ^ ((¬p1 ^ ¬p2) v (p2 ^ ¬p2))) v p3
⇔ (p1 ^ ¬p1 ^ ¬p2) v (p1 ^ p2 ^ ¬p2) v p3
⇔ (false ^ ¬p2) v (p1 ^ false) v p3
⇔ false v false v p3 ⇔ p3
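The result of this derivation can be cross-checked by brute force: evaluate the original qualification for all eight truth assignments of p1, p2 and p3 and compare it with p3. This is a sketch for verification only. The function name qualification is an assumption.

```python
from itertools import product


def qualification(p1, p2, p3):
    # (p1 ^ (¬p1 v p2) ^ ¬p2) v p3, as written in the WHERE clause
    return (p1 and ((not p1) or p2) and (not p2)) or p3


# The qualification should agree with p3 on every assignment.
reduces_to_p3 = all(
    qualification(p1, p2, p3) == p3
    for p1, p2, p3 in product([False, True], repeat=3)
)
print(reduces_to_p3)  # prints: True
```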
After Simplification
• SELECT title
  FROM EMP
  WHERE eName = 'Saleem'
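To see the effect on actual data, the original and the simplified query can be run side by side on an in-memory SQLite database. The table layout and the sample rows below are made up for illustration. Only the two queries come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (eNo TEXT, eName TEXT, title TEXT)")
# Hypothetical sample rows, chosen to exercise each predicate.
conn.executemany(
    "INSERT INTO EMP VALUES (?, ?, ?)",
    [
        ("E1", "Saleem", "Prog"),
        ("E2", "Ahmad", "Elect. Engr"),
        ("E3", "Saleem", "Elect. Engr"),
        ("E4", "Bilal", "Prog"),
    ],
)

original = """
    SELECT title FROM EMP
    WHERE (title = 'Prog'
           AND ((NOT (title = 'Prog')) OR title = 'Elect. Engr')
           AND (NOT (title = 'Elect. Engr')))
       OR eName = 'Saleem'
    ORDER BY eNo
"""
simplified = "SELECT title FROM EMP WHERE eName = 'Saleem' ORDER BY eNo"

rows_original = conn.execute(original).fetchall()
rows_simplified = conn.execute(simplified).fetchall()
print(rows_original == rows_simplified)  # prints: True
print(rows_simplified)                   # prints: [('Prog',), ('Elect. Engr',)]
```

The ORDER BY clause is added only so that the two result lists can be compared directly.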
4- Rewriting
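• First, a straightforward transformation of the SQL (calculus) query into relational algebra
• Then, restructuring of the relational algebra operators to improve efficiency
• The relational algebra query is represented as an operator tree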
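An operator tree can be sketched as a small recursive data structure. The example below uses the EMP/ASG query from the normalization step and shows two trees: a direct translation, and one restructuring in which the selection on ASG is applied before the join. Pushing selections down is a commonly used heuristic and is assumed here for illustration; the lecture does not specify which restructuring rules are applied. The class name Op and its render method are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    """A node of an operator tree: an operator (or relation) and its inputs."""
    name: str
    children: List["Op"] = field(default_factory=list)

    def render(self, depth=0):
        """Return the tree as indented lines, root first."""
        lines = ["  " * depth + self.name]
        for child in self.children:
            lines.extend(child.render(depth + 1))
        return lines


selection = "SELECT ASG.pNo = 'P1' AND (dur = 12 OR dur = 24)"

# Direct translation: join first, then select, then project.
direct = Op("PROJECT eName", [
    Op(selection, [
        Op("JOIN EMP.eNo = ASG.eNo", [Op("EMP"), Op("ASG")]),
    ]),
])

# Restructured: the selection is applied to ASG before the join.
restructured = Op("PROJECT eName", [
    Op("JOIN EMP.eNo = ASG.eNo", [
        Op("EMP"),
        Op(selection, [Op("ASG")]),
    ]),
])

print("\n".join(direct.render()))
print("\n".join(restructured.render()))
```

Both trees contain the same operators. Only their order differs, which is what the restructuring step changes.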