Distributed DBMS - Unit - 4 - Data Distribution Alternatives
1. Unit – 4
Data Distribution Alternatives
Fragmentation
2. Horizontal Fragmentation
• Basic Requirement of Horizontal Fragmentation
1. Find the set of simple predicates Pr
2. Find the set of minterm predicates M
3. Minterm selectivities sel(mi)
4. Query access frequencies acc(qi)
5. Minterm access frequencies acc(mi)
1/11/2017 2Prof. Dhaval R. Chandarana
3. Fragmentation Examples
• Example PROJECT table:
• PROJ:
PNO  PNAME             BUDGET  LOC
P1   Instrumentation   150000  Montreal
P2   Database Develop  135000  New York
P3   CAD/CAM           250000  New York
P4   Maintenance       310000  Paris
4. Predicates
• Predicates
• Appear in the WHERE clause of a query
• Important determiner of fragmentation
• Determine the composition of table fragments
• Let R(A1, A2, …, An) be a relation
• Ai is defined over a domain Di
• We say pi is a simple predicate if it is of the form
• Ai θ Value, where θ ∈ { =, <, >, <=, >=, <> } and Value is taken from Di
• Examples:
• PNAME = 'CAD/CAM'
• BUDGET > 200000
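As a rough illustration of the definition above, a simple predicate can be modelled as an (attribute, operator, value) triple. The class name `SimplePredicate` and the dictionary-based row are assumptions made for this sketch, not part of the slides.

```python
import operator
from dataclasses import dataclass
from typing import Any

# Comparison operators allowed in a simple predicate.
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt,
       "<=": operator.le, ">=": operator.ge, "<>": operator.ne}

@dataclass(frozen=True)
class SimplePredicate:      # hypothetical helper, for illustration only
    attr: str               # attribute name Ai
    op: str                 # one of the keys of OPS
    value: Any              # constant taken from the domain Di

    def holds(self, row: dict) -> bool:
        """Return True if the row satisfies 'attr op value'."""
        return OPS[self.op](row[self.attr], self.value)

# Row P3 of the PROJ table.
row = {"PNO": "P3", "PNAME": "CAD/CAM", "BUDGET": 250000, "LOC": "New York"}
p_name = SimplePredicate("PNAME", "=", "CAD/CAM")
p_budget = SimplePredicate("BUDGET", ">", 200000)
print(p_name.holds(row), p_budget.holds(row))
```

Both example predicates from the slide hold for row P3, so the sketch prints `True True`.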
5. Predicates
• Usually, multiple predicates are necessary to describe a selection of
rows in a relation
• Most Boolean combinations can be translated into conjunctive normal
form
• p1 ^ p2 ^ …^ pk
• We attempt to fragment tables according to selection (WHERE clause)
patterns
• A conjunction of simple predicates (each in its natural or negated form) is called a minterm
• Let a set of predicates on a relation be:
Pr = {p1, p2, …, pk }
6. Minterms
• Let the set of minterm predicates be
M = { m1, m2, …, mz }
where M = { mj | mj = ^(pn ∈ Pr) pn* } and pn* is either pn or !pn
• Some property equivalences:
• For equality: !(attr = val) = (attr <> val)
• For inequality: !(attr > val) = (attr <= val)
• It is not necessary to list both a predicate and its equivalent negated form
• In minterms, one of the two is sufficient
7. Minterm Examples
• p1: LOC = 'Montreal'
• p2: LOC = 'New York'
• p3: LOC = 'Paris'
• p4: BUDGET > 200000
• p5: BUDGET <= 200000
• m1: LOC = 'New York' ^ BUDGET > 200000
• m2: LOC = 'New York' ^ BUDGET <= 200000
• m3: LOC = 'Paris' ^ BUDGET > 200000
• m4: LOC = 'Paris' ^ BUDGET <= 200000
• m5: LOC = 'Montreal' ^ BUDGET > 200000
• m6: LOC = 'Montreal' ^ BUDGET <= 200000
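The six minterms above pair one LOC predicate with one BUDGET predicate, since a row has exactly one location and falls in exactly one budget range. A minimal sketch of that pairing, assuming the predicates are kept as plain strings, could look like this:

```python
from itertools import product

# Predicates grouped by the attribute they test (p2, p3, p1 and p4, p5).
loc_preds = ["LOC = 'New York'", "LOC = 'Paris'", "LOC = 'Montreal'"]
budget_preds = ["BUDGET > 200000", "BUDGET <= 200000"]

# One minterm per (location, budget range) combination.
minterms = [f"{loc} ^ {bud}" for loc, bud in product(loc_preds, budget_preds)]

for i, m in enumerate(minterms, start=1):
    print(f"m{i}: {m}")
```

The list order is chosen so that the printed labels match m1 to m6 on the slide.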
8. Minterm Properties
• Minterm selectivity
• Number of records that satisfy the minterm
• For the PROJ example: sel(m1) = 1; sel(m2) = 1; sel(m4) = 0
• Access frequency by applications and users
• Q = {q1, q2, …, qq} is set of queries
• acc(q1) is frequency of access of query 1
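The selectivity values quoted above can be checked by counting rows of PROJ. The sketch below follows the slide's definition of sel as a row count; the function name `sel` and the row layout are assumptions of this example, and m1, m2, m4 use the numbering of the Minterm Examples slide.

```python
PROJ = [
    {"PNO": "P1", "PNAME": "Instrumentation", "BUDGET": 150000, "LOC": "Montreal"},
    {"PNO": "P2", "PNAME": "Database Develop", "BUDGET": 135000, "LOC": "New York"},
    {"PNO": "P3", "PNAME": "CAD/CAM", "BUDGET": 250000, "LOC": "New York"},
    {"PNO": "P4", "PNAME": "Maintenance", "BUDGET": 310000, "LOC": "Paris"},
]

def sel(minterm, relation):
    """Count the rows of the relation that satisfy the minterm."""
    return sum(1 for row in relation if minterm(row))

def m1(r):  # LOC = 'New York' ^ BUDGET > 200000
    return r["LOC"] == "New York" and r["BUDGET"] > 200000

def m2(r):  # LOC = 'New York' ^ BUDGET <= 200000
    return r["LOC"] == "New York" and r["BUDGET"] <= 200000

def m4(r):  # LOC = 'Paris' ^ BUDGET <= 200000
    return r["LOC"] == "Paris" and r["BUDGET"] <= 200000

print(sel(m1, PROJ), sel(m2, PROJ), sel(m4, PROJ))
```

Only P3 satisfies m1, only P2 satisfies m2, and no row satisfies m4, which gives 1, 1 and 0.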
9. Primary Horizontal Fragmentation
• Using minterms and access frequency, one can generate a horizontal
fragmentation
• Suppose there are w fragments
• Then each relation fragment Ri is given by a formula Fi, where each
formula represents a minterm expression of predicates
Ri = σ_Fi(R), where 1 <= i <= w
• Examples:
• PROJ1 = σ_{BUDGET <= 200000}(PROJ)
• PROJ2 = σ_{BUDGET > 200000}(PROJ)
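The two example fragments can be reproduced with a plain selection over the PROJ rows. This is only a sketch: the `select` helper is an assumed name standing in for the relational selection operator.

```python
PROJ = [
    {"PNO": "P1", "PNAME": "Instrumentation", "BUDGET": 150000, "LOC": "Montreal"},
    {"PNO": "P2", "PNAME": "Database Develop", "BUDGET": 135000, "LOC": "New York"},
    {"PNO": "P3", "PNAME": "CAD/CAM", "BUDGET": 250000, "LOC": "New York"},
    {"PNO": "P4", "PNAME": "Maintenance", "BUDGET": 310000, "LOC": "Paris"},
]

def select(formula, relation):
    """Selection: keep the rows for which the formula Fi is true."""
    return [row for row in relation if formula(row)]

proj1 = select(lambda r: r["BUDGET"] <= 200000, PROJ)
proj2 = select(lambda r: r["BUDGET"] > 200000, PROJ)

print([r["PNO"] for r in proj1])
print([r["PNO"] for r in proj2])

# Every row lands in exactly one fragment (complete and disjoint).
assert len(proj1) + len(proj2) == len(PROJ)
```

PROJ1 holds P1 and P2, PROJ2 holds P3 and P4. Because the two formulas are negations of each other, no row is lost or duplicated.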
10. Algorithm for Determining Minterms
• Rule 1: each relation or fragment is partitioned into at least two parts that are
accessed differently by at least one application
• Definitions
• R - relation
• Pr - set of simple predicates
• Pr' - complete and minimal set of simple predicates derived from Pr
• F - set of minterm fragments
11. Algorithm for Determining Minterms
• Define set of inferences from the predicates
• Assume val1 and val2 are complementary and together make up the complete set of values of att:
• p1: att = val1
• p2: att = val2
• i1: (att = val1) => !(att = val2)
• i2: !(att = val1) => (att = val2)
• Set of possible minterms
• m1: (att = val1) ^ (att = val2)
• m2: (att = val1) ^ !(att = val2)
• m3: !(att = val1) ^ (att = val2)
• m4: !(att = val1) ^ !(att = val2)
• m1 and m4 cannot be minterms because they contradict the inferences (m1 contradicts i1, m4 contradicts i2)
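The elimination of m1 and m4 can be checked mechanically: a minterm is contradictory if no value in the domain of att satisfies it. The sketch below assumes, as the slide does, that att takes only the two values val1 and val2. The string constants are placeholders.

```python
from itertools import product

domain = ["val1", "val2"]           # assumption: att has exactly these two values
preds = [lambda v: v == "val1",     # p1: att = val1
         lambda v: v == "val2"]     # p2: att = val2

surviving = []
# signs[k] is True when pk appears in natural form, False when negated.
# The iteration order is m1 (T,T), m2 (T,F), m3 (F,T), m4 (F,F).
for signs in product([True, False], repeat=len(preds)):
    satisfiable = any(
        all(p(v) == s for p, s in zip(preds, signs)) for v in domain
    )
    if satisfiable:
        surviving.append(signs)

print(surviving)
```

Only the sign patterns of m2 and m3 survive. No value equals both val1 and val2 (m1), and no value in the domain differs from both (m4).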
12. Calculate Minterms for Table
PHORIZONTAL {
Pr' = COM_MIN(R, Pr)
determine set of minterms M
determine inference set I among Pr'
eliminate contradictory mi's according to I from M
eliminate subsumed minterms
the minterms left in M define the horizontal fragmentation
}
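One possible reading of PHORIZONTAL in Python is sketched below. It is not the slide author's implementation. It assumes that Pr is already complete and minimal (so the COM_MIN step is skipped), that minterms use each predicate in natural or negated form, and that contradictions are detected by testing against a small set of representative rows (`sample_rows`, a hypothetical input) rather than an explicit inference set.

```python
from itertools import product

def phorizontal(preds, sample_rows):
    """Return the non-contradictory minterms over preds.

    preds: dict mapping a predicate name to a function row -> bool
           (assumed complete and minimal, i.e. already the output of COM_MIN).
    sample_rows: rows covering every value combination that can occur
                 (an assumption of this sketch).
    Each minterm is a dict name -> True (natural form) / False (negated).
    """
    names = list(preds)
    minterms = []
    for signs in product([True, False], repeat=len(names)):
        # Keep the minterm only if some possible row satisfies it.
        if any(all(preds[n](row) == s for n, s in zip(names, signs))
               for row in sample_rows):
            minterms.append(dict(zip(names, signs)))
    return minterms

preds = {
    "p1": lambda r: r["LOC"] == "Montreal",
    "p2": lambda r: r["LOC"] == "New York",
    "p3": lambda r: r["LOC"] == "Paris",
    "p4": lambda r: r["BUDGET"] > 200000,
    "p5": lambda r: r["BUDGET"] <= 200000,
}
# One representative row per (location, budget range) combination.
sample_rows = [{"LOC": loc, "BUDGET": b}
               for loc in ("Montreal", "New York", "Paris")
               for b in (200000, 200001)]

fragments = phorizontal(preds, sample_rows)
print(len(fragments))
```

With the five PROJ predicates, 32 sign combinations are tried and six survive, each with exactly one LOC predicate and one BUDGET predicate in natural form.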
13. Example
Step 1: Identify relevant predicates
• p1: LOC = 'Montreal'
• p2: LOC = 'New York'
• p3: LOC = 'Paris'
• p4: BUDGET > 200000
• p5: BUDGET <= 200000
14. Define Full Minterm Set
• m1: LOC = ‘Montreal’
• m2: LOC = ‘New York’
• m3: LOC = ‘Paris’
• m4: BUDGET > 200000
• m5: BUDGET <= 200000
• m6: LOC = ‘Montreal’ ^ LOC = ‘New York’
• m7: LOC = ‘Montreal’ ^ LOC = ‘Paris’
• m8: LOC = ‘Montreal’ ^ BUDGET > 200000
• m9: LOC = ‘Montreal’ ^ BUDGET <= 200000
• m10: LOC = ‘New York’ ^ LOC = ‘Paris’
• m11: LOC = ‘New York’ ^ BUDGET > 200000
• m12: LOC = ‘New York’ ^ BUDGET <= 200000
15. Define Full Minterm Set
• m13: LOC = ‘Paris’ ^ BUDGET > 200000
• m14: LOC = ‘Paris’ ^ BUDGET <= 200000
• m15: BUDGET > 200000 ^ BUDGET <= 200000
• m16: LOC = ‘Montreal’ ^ LOC = ‘New York’ ^ LOC = ‘Paris’
• m17: LOC = ‘Montreal’ ^ LOC = ‘New York’ ^ BUDGET > 200000
• m18: LOC = ‘Montreal’ ^ LOC = ‘New York’ ^ BUDGET <= 200000
• m19: LOC = ‘Montreal’ ^ LOC = ‘Paris’ ^ BUDGET > 200000
• m20: LOC = ‘Montreal’ ^ LOC = ‘Paris’ ^ BUDGET <= 200000
• m21: LOC = ‘Montreal’ ^ BUDGET > 200000 ^ BUDGET <= 200000
• m22: LOC = ‘New York’ ^ LOC = ‘Paris’ ^ BUDGET > 200000
• m23: LOC = ‘New York’ ^ LOC = ‘Paris’ ^ BUDGET <= 200000
• m24: LOC = ‘New York’ ^ BUDGET > 200000 ^ BUDGET <= 200000
• m25: LOC = ‘Paris’ ^ BUDGET > 200000 ^ BUDGET <= 200000
16. Define Full Minterm Set
• m26: LOC = ‘Montreal’ ^ LOC = ‘New York’ ^ LOC = ‘Paris’ ^ BUDGET > 200000
• m27: LOC = ‘Montreal’ ^ LOC = ‘New York’ ^ LOC = ‘Paris’ ^ BUDGET <= 200000
• m28: LOC = ‘Montreal’ ^ LOC = ‘New York’ ^ BUDGET > 200000 ^ BUDGET <= 200000
• m29: LOC = ‘Montreal’ ^ LOC = ‘Paris’ ^ BUDGET > 200000 ^ BUDGET <= 200000
• m30: LOC = ‘New York’ ^ LOC = ‘Paris’ ^ BUDGET > 200000 ^ BUDGET <= 200000
• m31: LOC = ‘Montreal’ ^ LOC = ‘New York’ ^ LOC = ‘Paris’ ^ BUDGET > 200000 ^ BUDGET <= 200000
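The full set m1 to m31 is every non-empty subset of the five predicates (2^5 - 1 = 31). A short sketch that regenerates the list in the same order, assuming the predicates are stored as strings keyed by p1 to p5:

```python
from itertools import combinations

preds = {
    "p1": "LOC = 'Montreal'",
    "p2": "LOC = 'New York'",
    "p3": "LOC = 'Paris'",
    "p4": "BUDGET > 200000",
    "p5": "BUDGET <= 200000",
}
names = list(preds)

# Subsets of size 1, then 2, ..., then 5, numbered m1, m2, ... in that order.
minterms = {}
for size in range(1, len(names) + 1):
    for combo in combinations(names, size):
        minterms[f"m{len(minterms) + 1}"] = combo

for label, combo in minterms.items():
    print(f"{label}: " + " ^ ".join(preds[p] for p in combo))
```

`itertools.combinations` emits subsets in lexicographic order, which happens to match the numbering used on the slides (for example, m11 is p2 ^ p4).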
17. Define Inferences
• Inferences:
• p1 => ~p2
• p1 => ~p3
• p2 => ~p1
• p2 => ~p3
• p3 => ~p1
• p3 => ~p2
• p4 => ~p5
• p5 => ~p4
• Left with only:
• m1: LOC = ‘Montreal’
• m2: LOC = ‘New York’
• m3: LOC = ‘Paris’
• m4: BUDGET > 200000
• m5: BUDGET <= 200000
• m8: LOC = ‘Montreal’ ^ BUDGET > 200000
• m9: LOC = ‘Montreal’ ^ BUDGET <= 200000
• m11: LOC = ‘New York’ ^ BUDGET > 200000
• m12: LOC = ‘New York’ ^ BUDGET <= 200000
• m13: LOC = ‘Paris’ ^ BUDGET > 200000
• m14: LOC = ‘Paris’ ^ BUDGET <= 200000
• After subsumption, only m8, m9, m11, m12, m13, m14 remain
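The two elimination steps can be sketched as set operations over the 31 candidate minterms. Two assumptions are made here: the inferences are encoded as mutually exclusive predicate pairs, and a minterm counts as "subsumed" when its predicates are a proper subset of another surviving minterm's predicates.

```python
from itertools import combinations

names = ["p1", "p2", "p3", "p4", "p5"]

# Each inference pi => ~pj means pi and pj can never hold together.
exclusive = [{"p1", "p2"}, {"p1", "p3"}, {"p2", "p3"}, {"p4", "p5"}]

# Candidate minterms m1..m31, numbered as on the slides.
candidates = {}
for size in range(1, len(names) + 1):
    for combo in combinations(names, size):
        candidates[f"m{len(candidates) + 1}"] = frozenset(combo)

# Step 1: drop minterms that contain a mutually exclusive pair.
consistent = {k: m for k, m in candidates.items()
              if not any(pair <= m for pair in exclusive)}

# Step 2: drop minterms subsumed by a more specific surviving minterm.
final = {k: m for k, m in consistent.items()
         if not any(m < other for other in consistent.values())}

print(list(consistent))
print(list(final))
```

Step 1 leaves eleven minterms (m1 to m5, m8, m9, m11, m12, m13, m14). Step 2 removes the five single-predicate ones, leaving m8, m9, m11, m12, m13 and m14.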
18. Actual Partitions
• The four non-empty partitions are: m9, m11, m12, m13
• The two partitions m8 and m14 contain no rows of PROJ
PNO  PNAME             BUDGET  LOC
P1   Instrumentation   150000  Montreal
P2   Database Develop  135000  New York
P3   CAD/CAM           250000  New York
P4   Maintenance       310000  Paris
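To see which of the six minterms actually receive data, each PROJ row can be assigned to the minterm it satisfies. In this sketch the minterms are encoded as (location, budget above 200000) pairs, which is an assumption made for brevity.

```python
PROJ = [
    {"PNO": "P1", "PNAME": "Instrumentation", "BUDGET": 150000, "LOC": "Montreal"},
    {"PNO": "P2", "PNAME": "Database Develop", "BUDGET": 135000, "LOC": "New York"},
    {"PNO": "P3", "PNAME": "CAD/CAM", "BUDGET": 250000, "LOC": "New York"},
    {"PNO": "P4", "PNAME": "Maintenance", "BUDGET": 310000, "LOC": "Paris"},
]

# Surviving minterms as (LOC value, True if BUDGET > 200000 else False).
minterms = {
    "m8": ("Montreal", True),   "m9": ("Montreal", False),
    "m11": ("New York", True),  "m12": ("New York", False),
    "m13": ("Paris", True),     "m14": ("Paris", False),
}

def satisfies(row, loc, high_budget):
    """True if the row matches the location and the budget range."""
    return row["LOC"] == loc and (row["BUDGET"] > 200000) == high_budget

fragments = {name: [row["PNO"] for row in PROJ if satisfies(row, *spec)]
             for name, spec in minterms.items()}

non_empty = [name for name, rows in fragments.items() if rows]
empty = [name for name, rows in fragments.items() if not rows]
print(fragments)
print(non_empty, empty)
```

P1 falls in m9, P3 in m11, P2 in m12 and P4 in m13, while m8 and m14 stay empty for the current contents of PROJ.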