Query optimization
Lecturer:
Assoc. Prof. Dr. DANG Tran Khanh
Report:
13070243 Trần Duy Linh
13070263 Nguyễn Minh Thành
Outline
Introduction to Query Processing
Translating SQL Queries into Relational Algebra
Query Trees and Query Graphs
Rules for equivalent RAEs
Using Heuristics in Query Optimization
Cost-based query optimization
Summary
Introduction to Query Processing
Query processing:
The process by which a DBMS retrieves the results of a query written in a high-level language, such as SQL (relational DBMS) or OQL (object DBMS).
Query optimization:
The process of choosing a suitable execution
strategy for processing a query.
Two internal representations of a query:
Query Tree
Query Graph
Processing a high-level query
Each stage consumes the output of the previous one:
Query in a high-level language
→ Scanning, parsing, and validating
→ Intermediate form of query
→ Query optimizer
→ Execution plan
→ Query code generator
→ Code to execute the query
→ Runtime database processor
→ Result of query
Example
[Figure: example of query optimization]
Translating SQL Queries into Relational Algebra
SELECT * FROM R WHERE c  →  $\sigma_{c}(R)$
SELECT A1, A2, … FROM R  →  $\pi_{A_1, A_2, \ldots}(R)$
SELECT * FROM R, S WHERE c  →  $R \bowtie_{c} S$
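The three mappings above can be tried out on small in-memory relations. The following is a minimal sketch, not a DBMS implementation: the helper names `select`, `project`, and `theta_join`, the list-of-dicts representation, and the sample rows are all assumptions made for illustration. Note that `project` removes duplicates, as relational algebra does, whereas SQL keeps them unless DISTINCT is given.

```python
# Hypothetical in-memory relations: each relation is a list of dict rows.

def select(rows, pred):
    """sigma_c(R): keep the rows that satisfy the condition c."""
    return [row for row in rows if pred(row)]

def project(rows, attrs):
    """pi_{A1, A2, ...}(R): keep only the listed attributes, without duplicates."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out

def theta_join(r, s, pred):
    """R join_c S: pair every row of R with every row of S, keep pairs satisfying c.
    Assumes R and S have disjoint attribute names."""
    return [{**x, **y} for x in r for y in s if pred({**x, **y})]

# Illustrative sample data (not from the slides).
R = [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 3, "b": 10}]
S = [{"c": 1, "d": "x"}, {"c": 3, "d": "y"}]

sel = select(R, lambda t: t["b"] == 10)                  # SELECT * FROM R WHERE b = 10
proj = project(R, ["b"])                                 # SELECT DISTINCT b FROM R
joined = theta_join(R, S, lambda t: t["a"] == t["c"])    # SELECT * FROM R, S WHERE a = c
```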
Query block: the basic unit that can be translated into the algebraic operators and optimized.
A query block contains a single SELECT-FROM-WHERE expression, as well as GROUP BY and HAVING clauses if these are part of the block.
Nested queries within a query are identified as separate query blocks.
Aggregate operators (MAX, MIN, SUM, and COUNT) in SQL must be included in the extended algebra.
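Nested query:
SELECT LNAME, FNAME
FROM EMPLOYEE
WHERE SALARY > (SELECT MAX(SALARY)
                FROM EMPLOYEE
                WHERE DNO = 5);
Inner block:
SELECT MAX(SALARY)
FROM EMPLOYEE
WHERE DNO = 5
Relational algebra: $\mathcal{F}_{\text{MAX SALARY}}(\sigma_{\text{DNO}=5}(\text{EMPLOYEE}))$
Outer block, where C is the value returned by the inner block:
SELECT LNAME, FNAME
FROM EMPLOYEE
WHERE SALARY > C
Relational algebra: $\pi_{\text{LNAME}, \text{FNAME}}(\sigma_{\text{SALARY}>C}(\text{EMPLOYEE}))$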
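To make the two-block decomposition concrete, the sketch below evaluates the inner block first to obtain the constant C and then evaluates the outer block against it. The EMPLOYEE rows are hypothetical sample data, and plain Python comprehensions stand in for the algebra operators.

```python
# Hypothetical sample rows for EMPLOYEE (not taken from the slides).
EMPLOYEE = [
    {"LNAME": "Smith", "FNAME": "John", "SALARY": 30000, "DNO": 5},
    {"LNAME": "Wong", "FNAME": "Franklin", "SALARY": 40000, "DNO": 5},
    {"LNAME": "Wallace", "FNAME": "Jennifer", "SALARY": 43000, "DNO": 4},
    {"LNAME": "Borg", "FNAME": "James", "SALARY": 55000, "DNO": 1},
]

# Inner block: MAX(SALARY) over the employees with DNO = 5 yields one value C.
c = max(e["SALARY"] for e in EMPLOYEE if e["DNO"] == 5)

# Outer block: select on SALARY > C, then project onto LNAME and FNAME.
result = [(e["LNAME"], e["FNAME"]) for e in EMPLOYEE if e["SALARY"] > c]
```

Because the inner block does not refer to the outer query, it needs to be evaluated only once.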
Query Trees and Query Graphs
Query tree:
A tree data structure that corresponds to a relational algebra expression.
It represents the input relations of the query as leaf nodes of the tree, and the relational algebra operations as internal nodes.
Executing the query tree consists of executing an internal node operation whenever its operands are available, and then replacing that internal node by the relation that results from the operation.
Query graph:
A graph data structure that corresponds to a relational calculus expression.
It does not indicate an order in which to perform the operations. There is only a single graph corresponding to each query.
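One possible way to picture a query tree in code is sketched below. The node classes `Leaf`, `Select`, `Project`, and `Join`, the `execute` function, and the sample relations are assumptions for illustration only. `execute` evaluates the operands of a node first and then replaces the node by its result, which mirrors the bottom-up execution described above.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Leaf:            # leaf node: an input relation
    rows: list

@dataclass
class Select:          # internal node: sigma
    pred: Callable
    child: object

@dataclass
class Project:         # internal node: pi
    attrs: list
    child: object

@dataclass
class Join:            # internal node: theta join
    pred: Callable
    left: object
    right: object

def execute(node):
    """Evaluate the operands first, then the operation at this node."""
    if isinstance(node, Leaf):
        return node.rows
    if isinstance(node, Select):
        return [t for t in execute(node.child) if node.pred(t)]
    if isinstance(node, Project):
        out = []
        for t in execute(node.child):
            p = {a: t[a] for a in node.attrs}
            if p not in out:          # set semantics: drop duplicate rows
                out.append(p)
        return out
    if isinstance(node, Join):
        left, right = execute(node.left), execute(node.right)
        return [{**x, **y} for x in left for y in right if node.pred({**x, **y})]
    raise TypeError(f"unknown node type: {type(node).__name__}")

# Illustrative relations and the tree for pi_{a,d}(sigma_{b=10}(R) join_{a=c} S).
R = [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 3, "b": 10}]
S = [{"c": 1, "d": "x"}, {"c": 3, "d": "y"}]
tree = Project(["a", "d"],
               Join(lambda t: t["a"] == t["c"],
                    Select(lambda t: t["b"] == 10, Leaf(R)),
                    Leaf(S)))
result = execute(tree)
```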
[Figure: example database schema with the relations EMPLOYEE, DEPARTMENT, and PROJECT]
Example: For every project located in 'Stafford', retrieve the project number, the controlling department number, and the department manager's last name, address, and birthdate.
Relational algebra:
$\pi_{\text{PNUMBER}, \text{DNUM}, \text{LNAME}, \text{ADDRESS}, \text{BDATE}}(((\sigma_{\text{PLOCATION}=\text{'STAFFORD'}}(\text{PROJECT})) \bowtie_{\text{DNUM}=\text{DNUMBER}} (\text{DEPARTMENT})) \bowtie_{\text{MGRSSN}=\text{SSN}} (\text{EMPLOYEE}))$
SQL query:
Q2: SELECT P.PNUMBER, P.DNUM, E.LNAME, E.ADDRESS, E.BDATE
FROM PROJECT AS P, DEPARTMENT AS D, EMPLOYEE AS E
WHERE P.DNUM = D.DNUMBER AND D.MGRSSN = E.SSN AND
P.PLOCATION = 'STAFFORD';
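As a rough illustration of why the choice of expression matters, the sketch below evaluates Q2 in two ways on hypothetical sample rows: following the relational algebra expression above (selection first, then the two joins), and following a literal reading of the SQL (Cartesian product of the FROM list, then the WHERE condition). The function names and all data values are invented for the example.

```python
# Hypothetical sample rows; only the attributes used by Q2 are included.
PROJECT = [
    {"PNUMBER": 10, "PLOCATION": "STAFFORD", "DNUM": 4},
    {"PNUMBER": 30, "PLOCATION": "STAFFORD", "DNUM": 4},
    {"PNUMBER": 1, "PLOCATION": "BELLAIRE", "DNUM": 5},
]
DEPARTMENT = [
    {"DNUMBER": 4, "MGRSSN": "111"},
    {"DNUMBER": 5, "MGRSSN": "222"},
]
EMPLOYEE = [
    {"SSN": "111", "LNAME": "Wallace", "ADDRESS": "12 Oak St", "BDATE": "1970-01-01"},
    {"SSN": "222", "LNAME": "Wong", "ADDRESS": "34 Elm St", "BDATE": "1975-05-05"},
]

OUTPUT = ["PNUMBER", "DNUM", "LNAME", "ADDRESS", "BDATE"]

def q2_algebra():
    """Selection on PROJECT first, then the two joins, then the projection."""
    stafford = [p for p in PROJECT if p["PLOCATION"] == "STAFFORD"]
    with_dept = [{**p, **d} for p in stafford for d in DEPARTMENT
                 if p["DNUM"] == d["DNUMBER"]]
    with_mgr = [{**t, **e} for t in with_dept for e in EMPLOYEE
                if t["MGRSSN"] == e["SSN"]]
    return [tuple(t[a] for a in OUTPUT) for t in with_mgr], len(stafford)

def q2_product_first():
    """Cartesian product of all three relations, then the whole WHERE condition."""
    product = [{**p, **d, **e} for p in PROJECT for d in DEPARTMENT for e in EMPLOYEE]
    kept = [t for t in product
            if t["DNUM"] == t["DNUMBER"] and t["MGRSSN"] == t["SSN"]
            and t["PLOCATION"] == "STAFFORD"]
    return [tuple(t[a] for a in OUTPUT) for t in kept], len(product)

result, rows_after_selection = q2_algebra()
naive_result, rows_in_product = q2_product_first()
```

Both strategies return the same tuples, but the first carries 2 rows into the joins while the second materializes a 12-row product before filtering.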
Equivalent Relational Expressions
Two relational algebra expressions are equivalent if they produce the same result (the same set of tuples) on the same input relations, although their tuples and attributes may be ordered differently.
An equivalence rule states that expressions of two forms are equivalent: an expression of the first form can be replaced by an expression of the second form, or vice versa.
Rules for equivalent RAEs
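1. Cascade of σ. A conjunctive selection condition can be broken up into a cascade (that is, a sequence) of individual σ operations:
$\sigma_{\theta_1 \wedge \theta_2 \wedge \cdots \wedge \theta_n}(R) \equiv \sigma_{\theta_1}(\sigma_{\theta_2}(\cdots(\sigma_{\theta_n}(R))\cdots))$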
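Rule 1 can be checked on a small relation. This is only a sanity check on one hypothetical input, not a proof; the `select` helper, the relation R, and the three conditions are assumptions made for the example.

```python
def select(rows, pred):
    """sigma_pred(rows): keep the rows that satisfy pred."""
    return [t for t in rows if pred(t)]

# Hypothetical relation R(a, b) with 16 rows.
R = [{"a": a, "b": b} for a in range(4) for b in range(4)]

theta1 = lambda t: t["a"] > 0
theta2 = lambda t: t["b"] < 3
theta3 = lambda t: t["a"] != t["b"]

# Left-hand side: one selection with the conjunctive condition.
conjunctive = select(R, lambda t: theta1(t) and theta2(t) and theta3(t))

# Right-hand side: a cascade of three single-condition selections.
cascade = select(select(select(R, theta3), theta2), theta1)
```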
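2. Commutativity of σ. The σ operation is commutative:
$\sigma_{\theta_1}(\sigma_{\theta_2}(R)) \equiv \sigma_{\theta_2}(\sigma_{\theta_1}(R))$
3. Cascade of π. In a cascade of π operations, all but the outermost one can be ignored:
$\pi_{L_1}(\pi_{L_2}(\cdots(\pi_{L_n}(R))\cdots)) \equiv \pi_{L_1}(R)$, where $L_1 \subset L_2 \subset \cdots \subset L_{n-1} \subset L_n$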
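A quick check of rules 2 and 3 on a hypothetical relation is sketched below. The helpers `select`, `project`, and `as_set` and the sample data are assumptions; `as_set` exists only to compare results without regard to tuple or attribute order.

```python
def select(rows, pred):
    return [t for t in rows if pred(t)]

def project(rows, attrs):
    """pi_attrs(rows) under set semantics: duplicate rows are removed."""
    out = []
    for t in rows:
        p = {a: t[a] for a in attrs}
        if p not in out:
            out.append(p)
    return out

def as_set(rows):
    """Order-insensitive view of a relation, used only to compare results."""
    return {frozenset(t.items()) for t in rows}

# Hypothetical relation R(a, b, c) with 6 rows.
R = [{"a": i % 3, "b": i % 2, "c": i} for i in range(6)]

theta1 = lambda t: t["a"] >= 1
theta2 = lambda t: t["b"] == 0

# Rule 2: the order of two selections does not matter.
lhs2 = select(select(R, theta2), theta1)
rhs2 = select(select(R, theta1), theta2)

# Rule 3 with L1 = {a}, L2 = {a, b}, L3 = {a, b, c}: only the outermost pi matters.
lhs3 = project(project(project(R, ["a", "b", "c"]), ["a", "b"]), ["a"])
rhs3 = project(R, ["a"])
```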
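4. Commuting σ with π. If the selection condition θ involves only the attributes A1, A2, …, An in the projection list:
$\pi_{A_1, A_2, \ldots, A_n}(\sigma_{\theta}(R)) \equiv \sigma_{\theta}(\pi_{A_1, A_2, \ldots, A_n}(R))$
5. Commutativity of ⋈ (and ×):
$R \bowtie_{\theta} S \equiv S \bowtie_{\theta} R$
$R \times S \equiv S \times R$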
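Rules 4 and 5 can be checked in the same spirit. In this sketch the helper functions and the relations R and S are hypothetical. Rows are dicts, so two rows that differ only in attribute order compare equal, which matches the definition of equivalence used in these slides.

```python
def select(rows, pred):
    return [t for t in rows if pred(t)]

def project(rows, attrs):
    out = []
    for t in rows:
        p = {a: t[a] for a in attrs}
        if p not in out:              # set semantics: drop duplicate rows
            out.append(p)
    return out

def theta_join(r, s, pred):
    return [{**x, **y} for x in r for y in s if pred({**x, **y})]

def as_set(rows):
    return {frozenset(t.items()) for t in rows}

# Hypothetical relations R(a, b, c) and S(d, e).
R = [{"a": 1, "b": 5, "c": "p"}, {"a": 2, "b": 7, "c": "q"}, {"a": 3, "b": 5, "c": "r"}]
S = [{"d": 1, "e": "x"}, {"d": 3, "e": "y"}]

# Rule 4: theta mentions only b, which is in the projection list [a, b].
theta = lambda t: t["b"] == 5
lhs4 = project(select(R, theta), ["a", "b"])
rhs4 = select(project(R, ["a", "b"]), theta)

# Rule 5: swapping the join operands changes only the attribute order.
join_cond = lambda t: t["a"] == t["d"]
lhs5 = theta_join(R, S, join_cond)
rhs5 = theta_join(S, R, join_cond)
```

If θ referred to an attribute outside the projection list (for example c here), the right-hand side of rule 4 could not be evaluated, which is why the rule carries that condition.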
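6. Commuting σ with ⋈ (or ×)
a. When θ involves only the attributes of R:
$\sigma_{\theta}(R \bowtie S) \equiv (\sigma_{\theta}(R)) \bowtie S$
b. When θ1 involves only the attributes of R and θ2 involves only the attributes of S:
$\sigma_{\theta_1 \wedge \theta_2}(R \bowtie S) \equiv (\sigma_{\theta_1}(R)) \bowtie (\sigma_{\theta_2}(S))$
c. When θ1 involves only the attributes of R, θ2 involves only the attributes of S, and θ involves attributes of both R and S:
$\sigma_{\theta_1 \wedge \theta_2 \wedge \theta}(R \bowtie S) \equiv \sigma_{\theta}((\sigma_{\theta_1}(R)) \bowtie (\sigma_{\theta_2}(S)))$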
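Rule 6 is the basis for pushing selections below joins. The sketch below checks cases a and b on hypothetical relations and counts how many row pairs a nested-loop join would have to compare before and after the push-down. The helper names, the join condition, and the data are assumptions for illustration.

```python
def select(rows, pred):
    return [t for t in rows if pred(t)]

def theta_join(r, s, pred):
    return [{**x, **y} for x in r for y in s if pred({**x, **y})]

# Hypothetical relations R(a, b) and S(c, d), 20 rows each.
R = [{"a": i, "b": i % 4} for i in range(20)]
S = [{"c": j, "d": j % 5} for j in range(20)]

join_cond = lambda t: t["a"] == t["c"]
theta1 = lambda t: t["b"] == 0    # involves only attributes of R
theta2 = lambda t: t["d"] == 0    # involves only attributes of S

# Rule 6a: a selection on R's attributes can be applied before the join.
lhs_a = select(theta_join(R, S, join_cond), theta1)
rhs_a = theta_join(select(R, theta1), S, join_cond)

# Rule 6b: each conjunct is pushed to the relation it refers to.
lhs_b = select(theta_join(R, S, join_cond), lambda t: theta1(t) and theta2(t))
rhs_b = theta_join(select(R, theta1), select(S, theta2), join_cond)

# Row pairs a nested-loop join compares in each plan for case b.
pairs_before = len(R) * len(S)
pairs_after = len(select(R, theta1)) * len(select(S, theta2))
```

Case c follows the same pattern, with the remaining condition θ applied on top of the join result.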
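7. Commuting π with ⋈ (or ×)
a. L = L1 ∪ L2, where L1 contains attributes of R and L2 contains attributes of S, and θ involves only attributes in L:
$\pi_{L}(R \bowtie_{\theta} S) \equiv (\pi_{L_1}(R)) \bowtie_{\theta} (\pi_{L_2}(S))$
b. L = L1 ∪ L2, and θ involves attributes that are not in L. Let L3 (attributes of R) and L4 (attributes of S) be the attributes of θ:
$\pi_{L}(R \bowtie_{\theta} S) \equiv \pi_{L}((\pi_{L_1 \cup L_3}(R)) \bowtie_{\theta} (\pi_{L_2 \cup L_4}(S)))$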
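Example rule 7
Ex: R(A, C, D), S(B, E, F)
$\pi_{A, B}(R \bowtie_{C > E} S) \equiv \pi_{A, B}((\pi_{A, C}(R)) \bowtie_{C > E} (\pi_{B, E}(S)))$
[Figure: the two query trees for this equivalence. One has $\pi_{A,B}$ above $\bowtie_{C>E}$ over R and S; the other has $\pi_{A,B}$ above $\bowtie_{C>E}$ over $\pi_{A,C}(R)$ and $\pi_{B,E}(S)$.]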
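The rule 7 example can be replayed on concrete rows. The schemas R(A, C, D) and S(B, E, F) come from the example above; the row values and the helper functions are hypothetical.

```python
def project(rows, attrs):
    out = []
    for t in rows:
        p = {a: t[a] for a in attrs}
        if p not in out:              # set semantics: drop duplicate rows
            out.append(p)
    return out

def theta_join(r, s, pred):
    return [{**x, **y} for x in r for y in s if pred({**x, **y})]

def as_set(rows):
    return {frozenset(t.items()) for t in rows}

# Hypothetical rows for R(A, C, D) and S(B, E, F).
R = [{"A": 1, "C": 5, "D": "x"}, {"A": 2, "C": 1, "D": "y"}, {"A": 3, "C": 9, "D": "z"}]
S = [{"B": 10, "E": 4, "F": "u"}, {"B": 20, "E": 8, "F": "v"}]

cond = lambda t: t["C"] > t["E"]

# Left-hand side: join the full relations, then project onto A and B.
lhs = project(theta_join(R, S, cond), ["A", "B"])

# Right-hand side: drop D and F early, but keep C and E because the join needs them.
rhs = project(theta_join(project(R, ["A", "C"]), project(S, ["B", "E"]), cond),
              ["A", "B"])
```

The early projections keep C and E because the join condition needs them; only D and F can be dropped before the join.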
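8. Commutativity of set operations. The set operations ∪ and ∩ are commutative:
$R \cup S \equiv S \cup R$
$R \cap S \equiv S \cap R$
9. Associativity of ⋈, ×, ∪, and ∩. If $\circ$ stands for any one of these four operations, used throughout the expression:
$(R \circ S) \circ T \equiv R \circ (S \circ T)$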
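Rules 8 and 9 map directly onto Python's built-in set type, which gives a compact way to check them on sample data. The relations below are hypothetical sets of one-column tuples, and `cross` is an illustrative helper for the Cartesian product. Set difference is included only to show that it is not commutative.

```python
from itertools import product

# Hypothetical union-compatible relations, each a set of 1-tuples.
R = {(1,), (2,), (3,)}
S = {(2,), (3,), (4,)}
T = {(3,), (5,)}

# Rule 8: union and intersection are commutative; difference is not.
union_commutes = (R | S) == (S | R)
intersection_commutes = (R & S) == (S & R)
difference_commutes = (R - S) == (S - R)

# Rule 9: union and intersection are associative.
union_assoc = ((R | S) | T) == (R | (S | T))
intersection_assoc = ((R & S) & T) == (R & (S & T))

def cross(r, s):
    """Cartesian product of relations whose rows are tuples: concatenate the rows."""
    return {x + y for x, y in product(r, s)}

# Rule 9 for the Cartesian product.
cross_assoc = cross(cross(R, S), T) == cross(R, cross(S, T))
```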