3. Query Processing
Introduction (Cont.)
SELECT *
FROM TABLE-1, TABLE-2
WHERE TABLE-1.ID = TABLE-2.ID
σ TABLE-1.ID = TABLE-2.ID (TABLE-1 × TABLE-2)
Query Processor
Query parser (syntax and semantic analyzer)
Relational algebra generator
The query processor receives a user query in Structured Query Language (SQL), checks it for syntax and semantic errors, and generates the equivalent relational algebra expression needed for data access.
4. Introduction (Cont.)
Query Optimization
[Figure: the query processor (Q.P) passes the query to the query optimizer, which estimates alternate query plans 1, 2, ..., n against the database and chooses the best plan among them.]
Query optimization is the process of generating various execution plans for a user query and finding the best plan, that is, the one that takes the least execution time and consumes the fewest resources.
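The "choose the best plan" step can be sketched in Python as taking the minimum of a cost estimate over the candidate plans. The `Plan` class, the plan names, and every cost figure below are hypothetical placeholders, not measurements from the tool; a real optimizer would derive its estimates from table statistics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A candidate execution plan with hypothetical cost estimates."""
    name: str
    estimated_time_ms: float   # assumed figure, for illustration only
    estimated_rows: int        # assumed size of the intermediate result


def choose_best_plan(plans):
    """Return the plan with the lowest estimated time, breaking ties on rows."""
    return min(plans, key=lambda p: (p.estimated_time_ms, p.estimated_rows))


candidates = [
    Plan("alternate query plan 1", 3.0, 900),
    Plan("alternate query plan 2", 1.2, 300),
    Plan("alternate query plan n", 1.2, 120),
]

best = choose_best_plan(candidates)
print(best.name)
```

Ordering by a (time, rows) tuple reflects the two criteria named above, execution time first and resource consumption second; other weightings are equally possible.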
5. Introduction (Cont.)
Query Optimization
Generating various execution plans and finding a better plan is done by query optimization techniques (algorithms) implemented in the query optimizer.
OPTIMIZATION TECHNIQUES
Simple execution plan
Eliminating the Cartesian product with joins
Performing selection before join (push selection)
Performing projection before join (push projection)
...
6. Introduction (Cont.)
Query Optimization
SIMPLE QUERY EXECUTION PLAN: The query processor generates an equivalent relational algebra expression and forwards it to the query optimizer. Usually, that expression involves a Cartesian product.
Example: Return names of students who earned grade ‘B’
Students
id name age
E1 Ahmed 15
E2 Farhan 14
Enrolled
id reg_nr grade
E2 AB29 B
E4 AB30 D
π name (σ Enrolled.grade = ‘B’ ∧ Students.id = Enrolled.id (Students × Enrolled))
Query tree (top to bottom): π name, then σ Students.id = Enrolled.id ∧ Enrolled.grade = ‘B’, then ×, with leaves Students and Enrolled.
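As a minimal sketch of how this simple plan evaluates, the three operators can be written as small Python functions over lists of dictionaries. The function names and the qualified column keys (such as `Students.id`) are choices made for this illustration; the rows are the ones from the slide.

```python
from itertools import product


def cartesian(left, right):
    """× : pair every row of `left` with every row of `right`."""
    return [{**l, **r} for l, r in product(left, right)]


def select(rows, predicate):
    """σ : keep only the rows that satisfy `predicate`."""
    return [row for row in rows if predicate(row)]


def project(rows, column):
    """π : keep a single column of every row."""
    return [row[column] for row in rows]


# Column names are qualified so that the two `id` columns
# do not collide after the Cartesian product.
students = [
    {"Students.id": "E1", "Students.name": "Ahmed", "Students.age": 15},
    {"Students.id": "E2", "Students.name": "Farhan", "Students.age": 14},
]
enrolled = [
    {"Enrolled.id": "E2", "Enrolled.reg_nr": "AB29", "Enrolled.grade": "B"},
    {"Enrolled.id": "E4", "Enrolled.reg_nr": "AB30", "Enrolled.grade": "D"},
]

pairs = cartesian(students, enrolled)   # 2 x 2 = 4 intermediate rows
matches = select(
    pairs,
    lambda r: r["Enrolled.grade"] == "B" and r["Students.id"] == r["Enrolled.id"],
)
names = project(matches, "Students.name")
print(len(pairs), names)
```

Every pair of rows is built before any condition is checked, which is the cost the later plans try to avoid.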
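7. Introduction (Cont.)
Query Optimization
ELIMINATION OF CARTESIAN PRODUCT WITH JOIN: A Cartesian product followed by a selection on the join condition may be replaced with a join.
Example: π name (σ Enrolled.grade = ‘B’ ∧ Students.id = Enrolled.id (Students × Enrolled))
Query tree before (top to bottom): π name, then σ Students.id = Enrolled.id ∧ Enrolled.grade = ‘B’, then ×, with leaves Students and Enrolled.
Query tree after (top to bottom): π name, then σ Enrolled.grade = ‘B’, then ⋈ Students.id = Enrolled.id, with leaves Students and Enrolled.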
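The rewrite is safe because a join on `id` returns exactly the rows that the Cartesian product followed by the `id` selection would return. The sketch below checks this on the slide's rows; the hash-based `equi_join` is one possible join implementation chosen for illustration, not necessarily the one a given DBMS uses.

```python
from collections import defaultdict

students = [("E1", "Ahmed", 15), ("E2", "Farhan", 14)]    # (id, name, age)
enrolled = [("E2", "AB29", "B"), ("E4", "AB30", "D")]     # (id, reg_nr, grade)


def product_then_select(left, right):
    """Students × Enrolled followed by σ Students.id = Enrolled.id."""
    pairs = [(s, e) for s in left for e in right]   # builds |left| * |right| pairs
    return [(s, e) for s, e in pairs if s[0] == e[0]]


def equi_join(left, right):
    """Students ⋈ Enrolled on id, using a hash index instead of all pairs."""
    index = defaultdict(list)
    for e in right:
        index[e[0]].append(e)
    return [(s, e) for s in left for e in index.get(s[0], [])]


joined = equi_join(students, enrolled)
names = [s[1] for s, e in joined if e[2] == "B"]    # σ grade = 'B', then π name
print(names)
```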
8. Introduction (Cont.)
Query Optimization
PERFORMING SELECTION BEFORE JOIN (PUSH SELECTION): Pushing selection operators down the query tree makes them execute as early as possible.
Example: π name (σ Enrolled.grade = ‘B’ ∧ Students.id = Enrolled.id (Students × Enrolled))
Query tree 1 (top to bottom): π name, then σ Students.id = Enrolled.id ∧ Enrolled.grade = ‘B’, then ×, with leaves Students and Enrolled.
Query tree 2 (top to bottom): π name, then σ Enrolled.grade = ‘B’, then ⋈ Students.id = Enrolled.id, with leaves Students and Enrolled.
Query tree 3 (top to bottom): π name, then ⋈ Students.id = Enrolled.id, with leaves Students and σ Enrolled.grade = ‘B’ (Enrolled).
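One way to confirm that the three query trees return the same answer is to write each one as SQL and run it. The sketch below does so with Python's built-in `sqlite3` module, assuming the Students and Enrolled rows from the earlier example slide. SQLite's own planner is free to reorder operations internally, so this checks that the results agree, not which plan the engine actually executes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (id TEXT, name TEXT, age INTEGER);
    CREATE TABLE Enrolled (id TEXT, reg_nr TEXT, grade TEXT);
    INSERT INTO Students VALUES ('E1', 'Ahmed', 15), ('E2', 'Farhan', 14);
    INSERT INTO Enrolled VALUES ('E2', 'AB29', 'B'), ('E4', 'AB30', 'D');
""")

plans = {
    # Tree 1: Cartesian product, then one combined selection.
    "cartesian product": """
        SELECT Students.name FROM Students CROSS JOIN Enrolled
        WHERE Students.id = Enrolled.id AND Enrolled.grade = 'B'""",
    # Tree 2: join on id, then selection on grade.
    "join": """
        SELECT Students.name FROM Students
        JOIN Enrolled ON Students.id = Enrolled.id
        WHERE Enrolled.grade = 'B'""",
    # Tree 3: selection on grade first, then the join.
    "push selection": """
        SELECT Students.name FROM Students
        JOIN (SELECT * FROM Enrolled WHERE grade = 'B') AS e
        ON Students.id = e.id""",
}

results = {label: [row[0] for row in conn.execute(sql)]
           for label, sql in plans.items()}
print(results)
conn.close()
```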
9. Results
Query: Return names of students who earned grade ‘B’
Students
id Name
11 Ali
22 Ahmed
33 Saima
Enrolled
id C.no Grade
11 CS20 A
22 CS21 B
33 CS22 B
SELECT Name
FROM Students, Enrolled
WHERE Students.id = Enrolled.id
AND Enrolled.Grade = 'B'
π Name (σ Students.id = Enrolled.id ∧ Enrolled.Grade = ‘B’ (Students × Enrolled))
10. Results (Cont.)
Query: Return names of students who earned grade ‘B’
The same SQL query over Students and Enrolled, with the Cartesian product replaced by a join:
π Name (σ Enrolled.Grade = ‘B’ (Students ⋈ Students.id = Enrolled.id (Enrolled)))
11. Results (Cont.)
Query: Return names of students who earned grade ‘B’
The same SQL query over Students and Enrolled, with the selection pushed below the join:
π Name (Students ⋈ Students.id = Enrolled.id (σ Enrolled.Grade = ‘B’ (Enrolled)))
13. Results (Cont.)
Comparing and contrasting the proposed optimization strategies based on execution time.
Plan 1 (Cartesian product): π Name (σ Students.id = Enrolled.id ∧ Enrolled.Grade = ‘B’ (Students × Enrolled)). Execution time: 0.2804756 ms
Plan 2 (join): π Name (σ Enrolled.Grade = ‘B’ (Students ⋈ Students.id = Enrolled.id (Enrolled))). Execution time: 0.0707199 ms
Plan 3 (push selection): π Name (Students ⋈ Students.id = Enrolled.id (σ Enrolled.Grade = ‘B’ (Enrolled))). Execution time: 0.060316 ms
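A comparison of this kind can be approximated with a small timing harness. The sketch below is not the tool's measurement code: the three plan functions, the synthetic tables from `make_tables`, and the table size of 300 rows are all assumptions made for illustration, and the printed timings will differ from machine to machine.

```python
import time


def make_tables(n):
    """Build synthetic Students and Enrolled tables with n rows each (assumed data)."""
    students = [(i, f"student{i}") for i in range(n)]
    enrolled = [(i, f"CS{i}", "B" if i % 2 else "A") for i in range(n)]
    return students, enrolled


def plan_cartesian(students, enrolled):
    pairs = [(s, e) for s in students for e in enrolled]
    return [s[1] for s, e in pairs if s[0] == e[0] and e[2] == "B"]


def plan_join(students, enrolled):
    by_id = {}
    for e in enrolled:
        by_id.setdefault(e[0], []).append(e)
    joined = [(s, e) for s in students for e in by_id.get(s[0], [])]
    return [s[1] for s, e in joined if e[2] == "B"]


def plan_push_selection(students, enrolled):
    by_id = {}
    for e in enrolled:
        if e[2] == "B":                      # selection happens before the join
            by_id.setdefault(e[0], []).append(e)
    return [s[1] for s in students for _ in by_id.get(s[0], [])]


def time_ms(plan, students, enrolled):
    """Run one plan and return (elapsed milliseconds, result)."""
    start = time.perf_counter()
    result = plan(students, enrolled)
    return (time.perf_counter() - start) * 1000.0, result


students, enrolled = make_tables(300)
timings = {}
for plan in (plan_cartesian, plan_join, plan_push_selection):
    elapsed, names = time_ms(plan, students, enrolled)
    timings[plan.__name__] = (elapsed, names)
    print(f"{plan.__name__}: {elapsed:.4f} ms, {len(names)} rows")
```

A single run is noisy; repeating each plan several times and taking the median would give a steadier comparison.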
14. Results (Cont.)
Execution plan #1
π Name (σ Students.id = Enrolled.id ∧ Enrolled.Grade = ‘B’ (Students × Enrolled))
Students
id Name
11 Ali
22 Ahmed
33 Saima
Enrolled
id C.no Grade
11 CS20 A
22 CS21 B
33 CS22 B
Cartesian product: Students × Enrolled
Students.id Name Enrolled.id C.no Grade
11 Ali 11 CS20 A
11 Ali 22 CS21 B
11 Ali 33 CS22 B
22 Ahmed 11 CS20 A
22 Ahmed 22 CS21 B
22 Ahmed 33 CS22 B
33 Saima 11 CS20 A
33 Saima 22 CS21 B
33 Saima 33 CS22 B
15. Results (Cont.)
Execution plan #1 (Cont.)
π Name (σ Students.id = Enrolled.id ∧ Enrolled.Grade = ‘B’ (Students × Enrolled))
Selection: σ Students.id = Enrolled.id ∧ Enrolled.Grade = ‘B’, applied to the nine rows of Students × Enrolled
Students.id Name Enrolled.id C.no Grade
22 Ahmed 22 CS21 B
33 Saima 33 CS22 B
Projection: π Name
Name
Ahmed
Saima
This plan is unlikely to be the best way to reach the answer. It uses a cross product, which creates an intermediate table whose number of rows is |Students| × |Enrolled| (3 × 3 = 9 here).
16. Results (Cont.)
Execution plan #2
π Name (σ Enrolled.Grade = ‘B’ (Students ⋈ Students.id = Enrolled.id (Enrolled)))
Join: Students ⋈ Students.id = Enrolled.id (Enrolled)
Students.id Name Enrolled.id C.no Grade
11 Ali 11 CS20 A
22 Ahmed 22 CS21 B
33 Saima 33 CS22 B
17. Results (Cont.)
Execution plan #2 (Cont.)
π Name (σ Enrolled.Grade = ‘B’ (Students ⋈ Students.id = Enrolled.id (Enrolled)))
Selection: σ Enrolled.Grade = ‘B’, applied to the three rows of the join
Students.id Name Enrolled.id C.no Grade
22 Ahmed 22 CS21 B
33 Saima 33 CS22 B
Projection: π Name
Name
Ahmed
Saima
This plan takes less execution time than the first plan because the cross product is replaced by a join, which reduces the number of intermediate rows from nine to three.
18. Results (Cont.)
Execution plan #3
π Name (Students ⋈ Students.id = Enrolled.id (σ Enrolled.Grade = ‘B’ (Enrolled)))
Selection: σ Enrolled.Grade = ‘B’ (Enrolled)
id C.no Grade
22 CS21 B
33 CS22 B
Join: Students ⋈ Students.id = Enrolled.id (σ Enrolled.Grade = ‘B’ (Enrolled))
Students.id Name Enrolled.id C.no Grade
22 Ahmed 22 CS21 B
33 Saima 33 CS22 B
Projection: π Name
Name
Ahmed
Saima
The earlier selections are processed, the fewer tuples need to be manipulated higher up in the tree.
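The intermediate result sizes of the three execution plans can be checked with a short Python sketch over the slides' Students and Enrolled rows. The list comprehensions only model what each operator outputs; they say nothing about how a real engine would implement the join.

```python
students = [(11, "Ali"), (22, "Ahmed"), (33, "Saima")]                 # (id, Name)
enrolled = [(11, "CS20", "A"), (22, "CS21", "B"), (33, "CS22", "B")]   # (id, C.no, Grade)

# Plan 1: Cartesian product, then the combined selection.
plan1_product = [(s, e) for s in students for e in enrolled]
plan1_selected = [(s, e) for s, e in plan1_product if s[0] == e[0] and e[2] == "B"]

# Plan 2: join on id, then selection on grade.
plan2_joined = [(s, e) for s in students for e in enrolled if s[0] == e[0]]
plan2_selected = [(s, e) for s, e in plan2_joined if e[2] == "B"]

# Plan 3: selection on grade, then join on id.
plan3_selected = [e for e in enrolled if e[2] == "B"]
plan3_joined = [(s, e) for s in students for e in plan3_selected if s[0] == e[0]]

# Rows produced by the first and second operator of each plan.
sizes = {
    "plan 1": [len(plan1_product), len(plan1_selected)],
    "plan 2": [len(plan2_joined), len(plan2_selected)],
    "plan 3": [len(plan3_selected), len(plan3_joined)],
}
names = [s[1] for s, _ in plan3_joined]
print(sizes, names)
```

All three plans end with the same two rows; they differ only in how many rows the first operator hands to the second (nine, three, and two respectively).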
19. Conclusion
A tool was developed that translates user queries (entered in SQL) into the mathematical representation (relational algebra) necessary for data access.
The developed tool has been shown to be effective at:
reporting syntax and semantic errors,
generating query execution plans: (i) simple, (ii) elimination of the Cartesian product with a join, and (iii) push selection,
connecting to databases of interest and producing query results.
The query optimizer of the tool reports the execution times of all three execution plans considered.
Based on the obtained results, it is concluded that push selection is better than the join, and the join is better than the Cartesian product, in terms of execution time.
The tool may be used (in both academic and research institutes) for:
a better understanding of how queries are actually processed and optimized,
the development of modern DBMS packages that are more efficient in terms of execution time.
20. Future Works
Measuring the consumption of resources
Analyzing the tradeoff between execution time and consumption of resources
Implementing other optimization techniques
29. Introduction (Cont.)
Query Processing (SQL)
Structured Query Language (SQL) is a declarative query language, developed for user convenience, that tells the database management system (DBMS) what the user wants.
The structure of an SQL query is based on three clauses:
SELECT column1, column2, ..., columnn
FROM Table1, Table2, ..., Tablen
WHERE condition
30. Introduction (Cont.)
Query Processing (SQL)
Example: Return names of employees who are programmers
SELECT EName
FROM Employees
WHERE Employees.Title = 'Programmer'
Employees
EID EName Title
E1 Ahmed Programmer
E2 Farhan Elect. Engineer
E3 Kashif Programmer
E4 Neelam Mech. Engineer
E5 Ehsan Syst. Analyst
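The example query can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch: the in-memory database, the column types, and the added `ORDER BY EID` (used only to make the output order predictable) are assumptions, while the rows are those of the Employees table on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (EID TEXT PRIMARY KEY, EName TEXT, Title TEXT)")
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [
        ("E1", "Ahmed", "Programmer"),
        ("E2", "Farhan", "Elect. Engineer"),
        ("E3", "Kashif", "Programmer"),
        ("E4", "Neelam", "Mech. Engineer"),
        ("E5", "Ehsan", "Syst. Analyst"),
    ],
)

# The query states what is wanted; the DBMS decides how to compute it.
rows = conn.execute(
    "SELECT EName FROM Employees WHERE Employees.Title = ? ORDER BY EID",
    ("Programmer",),
).fetchall()
names = [name for (name,) in rows]
print(names)
conn.close()
```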
31. Introduction (Cont.)
Query Processing (Relational Algebra)
As opposed to SQL, relational algebra is a procedural query language that tells the DBMS not only what the user wants but also how to compute the answer.
Relational algebra consists of operators (σ, ∪, π, ×, ⋈, ∩, ...) that operate on relations (tables).
Relations (tables) correspond to sets of tuples/records.
Input of an operator: one or two relations (for example, Table1 and Table2)
Output of an operator: a result relation (Result-Table)
The output of one operator can serve as input to another operator.
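Because every operator takes relations and returns a relation, operators can be nested freely. The sketch below models a relation as a Python `frozenset` of named tuples, using the Employees rows from the example slide; the helper names `select` and `project` are chosen for this illustration.

```python
from collections import namedtuple

Employee = namedtuple("Employee", ["EID", "EName", "Title"])

employees = frozenset({
    Employee("E1", "Ahmed", "Programmer"),
    Employee("E2", "Farhan", "Elect. Engineer"),
    Employee("E3", "Kashif", "Programmer"),
    Employee("E4", "Neelam", "Mech. Engineer"),
    Employee("E5", "Ehsan", "Syst. Analyst"),
})


def select(relation, predicate):
    """σ : one relation in, one relation out."""
    return frozenset(t for t in relation if predicate(t))


def project(relation, *columns):
    """π : keep the named columns; duplicates vanish because the result is a set."""
    return frozenset(tuple(getattr(t, c) for c in columns) for t in relation)


programmers = select(employees, lambda t: t.Title == "Programmer")
analysts = select(employees, lambda t: t.Title == "Syst. Analyst")

# ∪ and ∩ take two relations; their output feeds π like any other relation.
either = project(programmers | analysts, "EName")
both = programmers & analysts
print(sorted(either), len(both))
```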
32. Introduction (Cont.)
Query Processing (Relational Algebra)
Example: Return names of employees who are programmers
π EName (σ Employees.Title = ‘Programmer’ (Employees))
Employees
EID EName Title
E1 Ahmed Programmer
E2 Farhan Elect. Engineer
E3 Kashif Programmer
E4 Neelam Mech. Engineer
E5 Ehsan Syst. Analyst
33. Introduction (Cont.)
Query Processing (SQL to Relational Algebra)
To translate an SQL query into relational algebra, the query processor translates:
the SELECT clause to a projection (π),
the FROM clause to the table name(s) or their Cartesian product, and
the WHERE clause to a selection (σ).
Example 1: Return names of employees who are programmers
SELECT EName
FROM Employees
WHERE Employees.Title = 'Programmer'
π EName (σ Employees.Title = ‘Programmer’ (Employees))
34. Introduction (Cont.)
Query Processing (SQL to Relational Algebra)
The same translation rules apply when the FROM clause lists more than one table: the tables become a Cartesian product.
Example 2: Return names of employees who are working on project P1
SELECT EName
FROM Employees, Assignment
WHERE Employees.EID = Assignment.ENo AND Assignment.PNo = 'P1'
π EName (σ Employees.EID = Assignment.ENo ∧ Assignment.PNo = ‘P1’ (Employees × Assignment))
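The three translation rules can be sketched as a small Python function that assembles the relational algebra expression as text. `to_relational_algebra` is a hypothetical helper written for this illustration, not the tool's implementation, and it assumes the clauses have already been parsed into lists of strings.

```python
def to_relational_algebra(select_columns, from_tables, where_conditions):
    """Build a relational algebra string from the three SQL clauses.

    Hypothetical helper: a real query parser would first have to split
    the SQL text into these lists and validate them.
    """
    # FROM -> a single table name, or the Cartesian product of several.
    expression = "(" + " × ".join(from_tables) + ")"
    # WHERE -> selection; conditions joined by AND become a conjunction.
    if where_conditions:
        predicate = " ∧ ".join(where_conditions)
        expression = f"(σ {predicate} {expression})"
    # SELECT -> projection on the listed columns.
    return f"π {', '.join(select_columns)} {expression}"


example_1 = to_relational_algebra(
    ["EName"], ["Employees"], ["Employees.Title = 'Programmer'"]
)
example_2 = to_relational_algebra(
    ["EName"],
    ["Employees", "Assignment"],
    ["Employees.EID = Assignment.ENo", "Assignment.PNo = 'P1'"],
)
print(example_1)
print(example_2)
```

The result is the simple, unoptimized expression; replacing the Cartesian product with a join or pushing selections down would be separate rewriting steps.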
35. Results (Cont.)
Plan 1: π Name (σ Students.id = Enrolled.id ∧ Enrolled.Grade = ‘B’ (Students × Enrolled))
Plan 2: π Name (σ Enrolled.Grade = ‘B’ (Students ⋈ Students.id = Enrolled.id (Enrolled)))
Plan 3: π Name (Students ⋈ Students.id = Enrolled.id (σ Enrolled.Grade = ‘B’ (Enrolled)))