7 Query Localization
[Figure: operator tree of the example query over PROJ, ASG and EMP, with joins on PNO and ENO, selections on ENAME, PNAME and DUR, and a projection on top.]
Distributed DBMS
M. T. Özsu & P. Valduriez
Ch.7/10
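Restructuring: Transformation Rules
Commutativity of binary operations
  $R \times S \Leftrightarrow S \times R$
  $R \bowtie S \Leftrightarrow S \bowtie R$
  $R \cup S \Leftrightarrow S \cup R$
Associativity of binary operations
  $(R \times S) \times T \Leftrightarrow R \times (S \times T)$
  $(R \bowtie S) \bowtie T \Leftrightarrow R \bowtie (S \bowtie T)$
Idempotence of unary operations
  $\Pi_{A'}(\Pi_{A''}(R)) \Leftrightarrow \Pi_{A'}(R)$
  $\sigma_{p_1(A_1)}(\sigma_{p_2(A_2)}(R)) \Leftrightarrow \sigma_{p_1(A_1) \land p_2(A_2)}(R)$
  where $R[A]$ and $A' \subseteq A$, $A'' \subseteq A$ and $A' \subseteq A''$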
Commuting selection with projection
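Restructuring: Transformation Rules
Commuting selection with binary operations
  $\sigma_{p(A_i)}(R \times S) \Leftrightarrow (\sigma_{p(A_i)}(R)) \times S$
  $\sigma_{p(A_i)}(R \bowtie_{(A_j,B_k)} S) \Leftrightarrow (\sigma_{p(A_i)}(R)) \bowtie_{(A_j,B_k)} S$
  $\sigma_{p(A_i)}(R \cup T) \Leftrightarrow \sigma_{p(A_i)}(R) \cup \sigma_{p(A_i)}(T)$
  where $A_i$ belongs to $R$ and $T$
Commuting projection with binary operations
  $\Pi_C(R \times S) \Leftrightarrow \Pi_{A'}(R) \times \Pi_{B'}(S)$
  $\Pi_C(R \bowtie_{(A_j,B_k)} S) \Leftrightarrow \Pi_{A'}(R) \bowtie_{(A_j,B_k)} \Pi_{B'}(S)$
  $\Pi_C(R \cup S) \Leftrightarrow \Pi_C(R) \cup \Pi_C(S)$
  where $R[A]$ and $S[B]$; $C = A' \cup B'$ where $A' \subseteq A$, $B' \subseteq B$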
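The rules for commuting selection with union and with join can be checked on small data. The sketch below is only an illustration: relations are modelled as lists of dicts, and the helper names (`select`, `union`, `join`, `same`) and the sample tuples are assumptions of this example, not part of the slides.

```python
# Relations are modelled as lists of dicts (an assumption of this sketch).
def select(pred, rel):
    """Selection: keep the tuples that satisfy pred."""
    return [t for t in rel if pred(t)]

def union(r, t):
    """Set union: tuples of r followed by tuples of t not already in r."""
    return r + [x for x in t if x not in r]

def join(r, s, a, b):
    """Equi-join of r and s on r.a = s.b."""
    return [{**x, **y} for x in r for y in s if x[a] == y[b]]

def same(r, s):
    """Order-insensitive equality for lists of dicts."""
    key = lambda t: sorted(t.items())
    return sorted(r, key=key) == sorted(s, key=key)

# Hypothetical sample data.
R = [{"ENO": "E1", "DUR": 12}, {"ENO": "E2", "DUR": 24}]
T = [{"ENO": "E3", "DUR": 12}, {"ENO": "E2", "DUR": 24}]
S = [{"ENO": "E1", "ENAME": "J. Doe"}, {"ENO": "E2", "ENAME": "M. Smith"}]
p = lambda t: t["DUR"] == 12  # predicate on an attribute of R and T

# Selection commutes with union.
lhs_union = select(p, union(R, T))
rhs_union = union(select(p, R), select(p, T))

# Selection on an attribute of R can be pushed below the join.
lhs_join = select(p, join(R, S, "ENO", "ENO"))
rhs_join = join(select(p, R), S, "ENO", "ENO")
```

Pushing the selection below the join means the join only sees the tuples that survive the predicate, which is why these rules matter for restructuring.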
Example
Recall the previous example: find the names of employees other than J. Doe who worked on the CAD/CAM project for either one or two years.
SELECT ENAME
FROM PROJ, ASG, EMP
WHERE ASG.ENO=EMP.ENO
AND ASG.PNO=PROJ.PNO
AND ENAME <> "J. Doe"
AND PROJ.PNAME="CAD/CAM"
AND (DUR=12 OR DUR=24)
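Operator tree (Project: $\Pi$, Select: $\sigma$, Join: $\bowtie$):
  $\Pi_{\text{ENAME}}(\sigma_{\text{DUR}=12 \lor \text{DUR}=24}(\sigma_{\text{PNAME}=\text{"CAD/CAM"}}(\sigma_{\text{ENAME} \neq \text{"J. Doe"}}(\text{PROJ} \bowtie_{\text{PNO}} (\text{ASG} \bowtie_{\text{ENO}} \text{EMP})))))$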
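Equivalent Query
  $\Pi_{\text{ENAME}}(\sigma_{\text{PNAME}=\text{"CAD/CAM"} \land (\text{DUR}=12 \lor \text{DUR}=24) \land \text{ENAME} \neq \text{"J. Doe"}}(\bowtie_{\text{PNO},\text{ENO}}(\text{PROJ}, \text{ASG}, \text{EMP})))$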
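Restructuring
Rewritten operator tree (each indented line is a child of the operator above it):
  $\Pi_{\text{ENAME}}$
    $\bowtie_{\text{PNO}}$
      $\Pi_{\text{PNO}}(\sigma_{\text{PNAME}=\text{"CAD/CAM"}}(\text{PROJ}))$
      $\Pi_{\text{PNO},\text{ENAME}}$
        $\bowtie_{\text{ENO}}$
          $\Pi_{\text{PNO},\text{ENO}}(\sigma_{\text{DUR}=12 \lor \text{DUR}=24}(\text{ASG}))$
          $\Pi_{\text{ENO},\text{ENAME}}(\sigma_{\text{ENAME} \neq \text{"J. Doe"}}(\text{EMP}))$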
Step 2: Data Localization
Input: algebraic query on distributed relations
Determine which fragments are involved
Localization program
  substitute for each global relation its materialization program
  optimize
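Example
Assume EMP is fragmented into $\text{EMP}_1$, $\text{EMP}_2$, $\text{EMP}_3$ as follows:
  $\text{EMP}_1 = \sigma_{\text{ENO} \leq \text{"E3"}}(\text{EMP})$
  $\text{EMP}_2 = \sigma_{\text{"E3"} < \text{ENO} \leq \text{"E6"}}(\text{EMP})$
  $\text{EMP}_3 = \sigma_{\text{ENO} > \text{"E6"}}(\text{EMP})$
ASG is fragmented into $\text{ASG}_1$ and $\text{ASG}_2$ as follows:
  $\text{ASG}_1 = \sigma_{\text{ENO} \leq \text{"E3"}}(\text{ASG})$
  $\text{ASG}_2 = \sigma_{\text{ENO} > \text{"E3"}}(\text{ASG})$
Replace EMP by $(\text{EMP}_1 \cup \text{EMP}_2 \cup \text{EMP}_3)$ and ASG by $(\text{ASG}_1 \cup \text{ASG}_2)$ in any query:
  $\Pi_{\text{ENAME}}(\sigma_{\text{DUR}=12 \lor \text{DUR}=24}(\sigma_{\text{PNAME}=\text{"CAD/CAM"}}(\sigma_{\text{ENAME} \neq \text{"J. Doe"}}(\text{PROJ} \bowtie_{\text{PNO}} ((\text{ASG}_1 \cup \text{ASG}_2) \bowtie_{\text{ENO}} (\text{EMP}_1 \cup \text{EMP}_2 \cup \text{EMP}_3))))))$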
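The substitution step of this example can be sketched in a few lines. This is a minimal illustration, assuming operator trees are encoded as nested tuples `(operator, child, ...)` with relation names as strings; the encoding, the `MATERIALIZATION` table and the `localize` function are hypothetical names introduced here, and PROJ is assumed to be unfragmented.

```python
# Materialization programs: each fragmented global relation is the union
# of its fragments. PROJ is assumed unfragmented in this sketch.
MATERIALIZATION = {
    "EMP": ("union", "EMP1", "EMP2", "EMP3"),
    "ASG": ("union", "ASG1", "ASG2"),
}

def localize(node):
    """Replace every global relation in an operator tree by its materialization program."""
    if isinstance(node, str):          # leaf: a relation name
        return MATERIALIZATION.get(node, node)
    op, *children = node               # inner node: (operator, child, ...)
    return (op, *[localize(c) for c in children])

# Join part of the example query: PROJ join[PNO] (ASG join[ENO] EMP).
query = ("join[PNO]", "PROJ", ("join[ENO]", "ASG", "EMP"))
localized = localize(query)
```

The result is the generic localized query; the reduction rules on the following slides then prune it.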
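Provides Parallelism
  $(\text{EMP}_1 \bowtie_{\text{ENO}} \text{ASG}_1) \cup (\text{EMP}_2 \bowtie_{\text{ENO}} \text{ASG}_2) \cup (\text{EMP}_3 \bowtie_{\text{ENO}} \text{ASG}_1) \cup (\text{EMP}_3 \bowtie_{\text{ENO}} \text{ASG}_2)$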
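Eliminates Unnecessary Work
  $(\text{EMP}_1 \bowtie_{\text{ENO}} \text{ASG}_1) \cup (\text{EMP}_2 \bowtie_{\text{ENO}} \text{ASG}_2) \cup (\text{EMP}_3 \bowtie_{\text{ENO}} \text{ASG}_2)$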
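Reduction for PHF (Primary Horizontal Fragmentation)
Reduction with selection
  Relation $R$ and $F_R = \{R_1, R_2, \ldots, R_w\}$ where $R_j = \sigma_{p_j}(R)$:
  $\sigma_{p_i}(R_j) = \emptyset$ if $\forall x \in R: \neg(p_i(x) \land p_j(x))$
Example
SELECT *
FROM EMP
WHERE ENO="E5"
  $\sigma_{\text{ENO}=\text{"E5"}}(\text{EMP}_1 \cup \text{EMP}_2 \cup \text{EMP}_3) \Rightarrow \sigma_{\text{ENO}=\text{"E5"}}(\text{EMP}_2)$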
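A minimal sketch of reduction with selection, assuming the EMP fragmentation introduced earlier (ENO up to "E3", above "E3" up to "E6", above "E6"). Each fragment predicate is encoded as an interval on ENO; the names `EMP_FRAGMENTS`, `may_contain` and `reduce_selection` are hypothetical, and plain string comparison is assumed to order employee numbers correctly, which holds for single-digit identifiers such as "E5".

```python
# Fragment predicates on ENO as intervals (low, high], meaning low < ENO <= high.
# None stands for "unbounded".
EMP_FRAGMENTS = {
    "EMP1": (None, "E3"),   # ENO <= "E3"
    "EMP2": ("E3", "E6"),   # "E3" < ENO <= "E6"
    "EMP3": ("E6", None),   # ENO > "E6"
}

def may_contain(interval, eno):
    """True if a tuple with this ENO can satisfy the fragment predicate."""
    low, high = interval
    return (low is None or eno > low) and (high is None or eno <= high)

def reduce_selection(fragments, eno):
    """Keep only the fragments whose predicate does not contradict ENO = eno."""
    return [name for name, iv in fragments.items() if may_contain(iv, eno)]

relevant = reduce_selection(EMP_FRAGMENTS, "E5")
```

Only the surviving fragments need to be accessed; the selections on the others are known to be empty without touching any data.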
Reduction for PHF
Reduction with join
  Possible if fragmentation is done on the join attribute
  Distribute join over union:
    $(R_1 \cup R_2) \bowtie S \Leftrightarrow (R_1 \bowtie S) \cup (R_2 \bowtie S)$
  Given $R_i = \sigma_{p_i}(R)$ and $R_j = \sigma_{p_j}(R)$:
    $R_i \bowtie R_j = \emptyset$ if $\forall x \in R_i, \forall y \in R_j: \neg(p_i(x) \land p_j(y))$
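Reduction for PHF
Assume EMP is fragmented as before and
  $\text{ASG}_1 = \sigma_{\text{ENO} \leq \text{"E3"}}(\text{ASG})$
  $\text{ASG}_2 = \sigma_{\text{ENO} > \text{"E3"}}(\text{ASG})$
Consider the query
SELECT *
FROM EMP,ASG
WHERE EMP.ENO=ASG.ENO
Distribute join over unions
Apply the reduction rule
  $(\text{EMP}_1 \cup \text{EMP}_2 \cup \text{EMP}_3) \bowtie_{\text{ENO}} (\text{ASG}_1 \cup \text{ASG}_2)$
  $\Rightarrow (\text{EMP}_1 \bowtie_{\text{ENO}} \text{ASG}_1) \cup (\text{EMP}_2 \bowtie_{\text{ENO}} \text{ASG}_2) \cup (\text{EMP}_3 \bowtie_{\text{ENO}} \text{ASG}_2)$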
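The two steps of this example, distributing the join over the unions and then dropping the joins whose fragment predicates contradict each other, can be sketched as follows. The interval encoding and the names `overlaps` and `reduced_joins` are assumptions of this illustration; the EMP ranges are taken as ENO up to "E3", above "E3" up to "E6", and above "E6".

```python
from itertools import product

# Fragment predicates on the join attribute ENO as intervals (low, high]:
# low < ENO <= high, with None meaning unbounded.
EMP = {"EMP1": (None, "E3"), "EMP2": ("E3", "E6"), "EMP3": ("E6", None)}
ASG = {"ASG1": (None, "E3"), "ASG2": ("E3", None)}

def overlaps(a, b):
    """True if intervals a and b can share at least one ENO value."""
    low = max((x for x in (a[0], b[0]) if x is not None), default=None)
    high = min((x for x in (a[1], b[1]) if x is not None), default=None)
    return low is None or high is None or low < high

def reduced_joins(left, right):
    """Distribute the join over both unions, then drop provably empty joins."""
    return [
        (l, r)
        for (l, lv), (r, rv) in product(left.items(), right.items())
        if overlaps(lv, rv)
    ]

joins = reduced_joins(EMP, ASG)
```

Of the six fragment joins produced by full distribution, three survive, and they can be evaluated in parallel at different sites.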
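Reduction for VF (Vertical Fragmentation)
Find useless (though not necessarily empty) intermediate relations
  Relation $R$ defined over attributes $A = \{A_1, \ldots, A_n\}$, vertically fragmented as $R_i = \Pi_{A'}(R)$ where $A' \subseteq A$:
  $\Pi_{D,K}(R_i)$ is useless if the set of projection attributes $D$ is not in $A'$ ($K$ is the key)
Example: $\text{EMP}_1 = \Pi_{\text{ENO},\text{ENAME}}(\text{EMP})$; $\text{EMP}_2 = \Pi_{\text{ENO},\text{TITLE}}(\text{EMP})$
SELECT ENAME
FROM EMP
  $\Pi_{\text{ENAME}}(\text{EMP}_1 \bowtie_{\text{ENO}} \text{EMP}_2) \Rightarrow \Pi_{\text{ENAME}}(\text{EMP}_1)$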
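The vertical-fragmentation rule amounts to a set intersection test. The sketch below assumes every fragment carries the key ENO and treats a fragment as useless when it contributes none of the requested non-key attributes; `FRAGMENTS`, `KEY` and `useful_fragments` are hypothetical names for this illustration.

```python
# Attribute sets of the vertical fragments in the example; both carry the key ENO.
FRAGMENTS = {"EMP1": {"ENO", "ENAME"}, "EMP2": {"ENO", "TITLE"}}
KEY = {"ENO"}

def useful_fragments(fragments, projected, key=KEY):
    """Keep the fragments that contribute at least one requested non-key attribute."""
    wanted = set(projected) - key
    if not wanted:
        # Only key attributes requested: any single fragment is enough.
        return list(fragments)[:1]
    return [name for name, attrs in fragments.items() if attrs & wanted]

needed = useful_fragments(FRAGMENTS, ["ENAME"])
```

When a single fragment remains, the join on the key that would reconstruct EMP disappears from the query.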
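Reduction for DHF (Derived Horizontal Fragmentation)
Rule:
  Distribute joins over unions
  Apply the join reduction for horizontal fragmentation
Example
  $\text{ASG}_1 = \text{ASG} \ltimes_{\text{ENO}} \text{EMP}_1$
  $\text{ASG}_2 = \text{ASG} \ltimes_{\text{ENO}} \text{EMP}_2$
  $\text{EMP}_1 = \sigma_{\text{TITLE}=\text{"Programmer"}}(\text{EMP})$
  $\text{EMP}_2 = \sigma_{\text{TITLE} \neq \text{"Programmer"}}(\text{EMP})$
Query
SELECT *
FROM EMP, ASG
WHERE ASG.ENO = EMP.ENO
AND EMP.TITLE = "Mech. Eng."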
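Reduction for DHF
Generic query
  $(\text{ASG}_1 \cup \text{ASG}_2) \bowtie_{\text{ENO}} \sigma_{\text{TITLE}=\text{"Mech. Eng."}}(\text{EMP}_1 \cup \text{EMP}_2)$
Selections first
  $(\text{ASG}_1 \cup \text{ASG}_2) \bowtie_{\text{ENO}} \sigma_{\text{TITLE}=\text{"Mech. Eng."}}(\text{EMP}_2)$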
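Reduction for DHF
Joins over unions
  $(\text{ASG}_1 \bowtie_{\text{ENO}} \sigma_{\text{TITLE}=\text{"Mech. Eng."}}(\text{EMP}_2)) \cup (\text{ASG}_2 \bowtie_{\text{ENO}} \sigma_{\text{TITLE}=\text{"Mech. Eng."}}(\text{EMP}_2))$
Elimination of the empty intermediate relations (left sub-tree)
  $\text{ASG}_2 \bowtie_{\text{ENO}} \sigma_{\text{TITLE}=\text{"Mech. Eng."}}(\text{EMP}_2)$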
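The three steps of this derived-fragmentation example (selections first, joins over unions, elimination of empty joins) can be sketched as below. It assumes EMP1 holds the programmers and EMP2 everyone else, and that each ASG fragment is the semijoin of ASG with one EMP fragment, so it can only join with that fragment. `EMP_PREDICATES`, `DERIVED_FROM` and `reduce_dhf` are hypothetical names for this illustration.

```python
# Primary fragments of EMP, defined by predicates on TITLE (assumed as in the example).
EMP_PREDICATES = {
    "EMP1": lambda title: title == "Programmer",
    "EMP2": lambda title: title != "Programmer",
}

# Derived fragments: ASGi = ASG semijoin EMPi, so ASGi can only join with EMPi.
DERIVED_FROM = {"ASG1": "EMP1", "ASG2": "EMP2"}

def reduce_dhf(title):
    """Return the (ASG fragment, EMP fragment) joins left for TITLE = title."""
    # Selections first: keep the EMP fragments compatible with the selection.
    emp = [name for name, pred in EMP_PREDICATES.items() if pred(title)]
    # Joins over unions, then drop empty joins: ASGi joins only with its owner.
    return [(asg, owner) for asg, owner in DERIVED_FROM.items() if owner in emp]

plan = reduce_dhf("Mech. Eng.")
```

Evaluating each fragment predicate at the queried value is enough here because the selection is an equality on TITLE; a range predicate would need a satisfiability check instead.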
Reduction for Hybrid Fragmentation
Combine the rules already specified:
  Remove empty relations generated by contradicting selections on horizontal fragments;
  Remove useless relations generated by projections on vertical fragments;
  Distribute joins over unions in order to isolate and remove useless joins.
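Reduction for Hybrid Fragmentation
Example
Consider the following hybrid fragmentation:
  $\text{EMP}_1 = \sigma_{\text{ENO} \leq \text{"E4"}}(\Pi_{\text{ENO},\text{ENAME}}(\text{EMP}))$
  $\text{EMP}_2 = \sigma_{\text{ENO} > \text{"E4"}}(\Pi_{\text{ENO},\text{ENAME}}(\text{EMP}))$
  $\text{EMP}_3 = \Pi_{\text{ENO},\text{TITLE}}(\text{EMP})$
and the query
SELECT ENAME
FROM EMP
WHERE ENO="E5"
  $\Pi_{\text{ENAME}}(\sigma_{\text{ENO}=\text{"E5"}}((\text{EMP}_1 \cup \text{EMP}_2) \bowtie_{\text{ENO}} \text{EMP}_3)) \Rightarrow \Pi_{\text{ENAME}}(\sigma_{\text{ENO}=\text{"E5"}}(\text{EMP}_2))$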
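For hybrid fragmentation, both tests apply to each fragment: the horizontal range must be compatible with the selection, and the vertical attribute set must contribute a requested attribute. A minimal sketch under those assumptions, reading the fragments as ENO up to "E4" and above "E4" over (ENO, ENAME), plus a third fragment over (ENO, TITLE); `FRAGMENTS` and `reduce_hybrid` are hypothetical names.

```python
# Hybrid fragments of EMP: (attribute set, interval on ENO as (low, high]).
# None means unbounded; string comparison is assumed to order ENO values.
FRAGMENTS = {
    "EMP1": ({"ENO", "ENAME"}, (None, "E4")),   # ENO <= "E4"
    "EMP2": ({"ENO", "ENAME"}, ("E4", None)),   # ENO > "E4"
    "EMP3": ({"ENO", "TITLE"}, (None, None)),   # all tuples
}

def reduce_hybrid(projected, eno, key="ENO"):
    """Keep the fragments that are both in range for ENO = eno and useful for the projection."""
    wanted = set(projected) - {key}
    kept = []
    for name, (attrs, (low, high)) in FRAGMENTS.items():
        in_range = (low is None or eno > low) and (high is None or eno <= high)
        if in_range and attrs & wanted:
            kept.append(name)
    return kept

kept = reduce_hybrid(["ENAME"], "E5")
```

The horizontal test discards the fragment whose range excludes "E5", and the vertical test discards the fragment that holds only TITLE, leaving a single fragment to query.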