Anton Dignös - Towards a Temporal PostgreSQL
1. Towards a Temporal PostgreSQL
Incorporating Primitives for Interval Processing into PostgreSQL
Anton Dignös (1), Michael Böhlen (1), Johann Gamper (2)
(1) Department of Computer Science, University of Zürich
(2) Faculty of Computer Science, Free University of Bozen-Bolzano
SFScon13
sfscon 2013, slide 1/20, Anton Dignös
5. Temporal Example
We have: Projects managed by departments
proj
Name  Dept  Budg  Start  End
P1    M     10k   Jan    Dec
P2    PH    7k    Feb    Aug
P3    CS    5k    Jun    Dec
Question: What are the top-2 time periods with the most concurrent projects?
Count  Start  End
3      Jun    Aug
2      Feb    Jun
Counting procedure: 1 @ Jan, 2 @ Feb, 2 @ Mar, ...
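The counting procedure can be sketched in Python. This is only an illustration, not the implementation presented in the talk. It assumes month granularity and half-open periods [Start, End), and the names MONTHS, IDX, and concurrent_periods are made up for the example.

```python
from itertools import groupby

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
IDX = {m: i for i, m in enumerate(MONTHS)}

# proj relation from the slide: (Name, Dept, Budg, Start, End)
proj = [
    ("P1", "M", "10k", "Jan", "Dec"),
    ("P2", "PH", "7k", "Feb", "Aug"),
    ("P3", "CS", "5k", "Jun", "Dec"),
]

def concurrent_periods(rows):
    """Determine the active projects at every month, then merge runs of
    consecutive months with the same active set into periods."""
    per_month = []
    for t in range(len(MONTHS)):
        # Assumption: End is exclusive, so a project is active while Start <= t < End.
        active = frozenset(name for name, _, _, s, e in rows
                           if IDX[s] <= t < IDX[e])
        per_month.append((t, active))

    periods = []
    for active, run in groupby(per_month, key=lambda x: x[1]):
        ts = [t for t, _ in run]
        if active:
            periods.append((len(active), MONTHS[ts[0]], MONTHS[ts[-1] + 1]))
    return periods

periods = concurrent_periods(proj)
# sorted() is stable, so among equal counts the earlier period comes first.
top2 = sorted(periods, key=lambda p: -p[0])[:2]
for count, start, end in top2:
    print(count, start, end)
```

Under these assumptions the sketch prints the two rows of the result table. Aug to Dec also has two concurrent projects; the stable sort simply keeps the earlier period.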
6. Some Facts about our Work
4 years of intensive research work
1 year of my master's (Free University of Bozen-Bolzano)
3 years of my Ph.D. (University of Zürich)
Published in top-3 DB conferences with acceptance rate below 20%
Published and presented at SIGMOD’12 in Scottsdale, Arizona, USA
Demonstrated at ICDE’13 in Brisbane, Queensland, Australia
Widely adopted in the database community
Initially we developed an SQL language extension
The SQL extension was selected and proposed as an amendment to the ANSI/ISO standardization committee
The SQL amendment was adapted and partially implemented by Teradata
7. Table of Contents
Why Time?
The Temporal Database Field
Our Solution
Summary and Vision
8. Why Time? /1
Ubiquitous: All information is qualified with a time interval
medical records
loans
transport information
...
Gain: Additional Information
Prediction
Analysis
Strategy planning
Accountability
9. Why Time? /2
Projects with their department manager
Mgr  Dept  Name  Budg  Start  End
Ann  M     P1    10k   Jan    Jun
Sam  PH    P2    7k    Feb    Aug
Ann  CS    P3    5k    Jun    Dec
Joe  M     P1    10k   Jun    Dec
Additional Information:
Ann supervised P1 before Joe
Ann supervised two projects in total
Joe did not supervise the entire P1
There was a project P2 in the past supervised by Sam
...
10. The Temporal Database Field /1
Active research field since the 1980s
Many language proposals (TQuel, IXSQL, . . . )
Consensus language TSQL2 (1992)
SQL/Temporal official amendment of SQL3
[Figure: overview of temporal language proposals: TQuel, IXSQL, TempSQL, HSQL, SQL/TP, TSQL2, SQL/Temporal, ChronoLog, ChronoSQL, Teradata, ATSQL, statement modifiers, ...]
Lack of implementations and working solutions
12. The Temporal Database Field /2
Support for temporal data varies a lot depending on database vendor
(Order from most (1.) to least (5.) support)
1. Teradata
2. Oracle DB
3. IBM DB2
4. PostgreSQL
5. Microsoft SQL Server
[Figure: database systems arranged by kind of temporal support. Labels: Time Infrastructure, Datatype and Functions, Time Travel, Time Processing, Temporal DB. Systems shown: MS SQL Server, PostgreSQL, Oracle DB, IBM DB2, SAP Hana, Teradata; PostgreSQL appears twice]
Our goal is to advance PostgreSQL into a leading position
13. Microsoft SQL Server
Very limited support for time
Date datatypes and some functions
No support for intervals
David Lomet (MS Research) - Immortal DB [1] (2002)
Transaction time support for SQL Server
Prototype
Workaround proposed by Itzik Ben-Gan et al. [2] (2009)
(1) Transform intervals into points; (2) perform operations on points; (3) transform points into intervals
Workaround is inefficient and does not consider intervals
[1] http://research.microsoft.com/en-us/projects/immortaldb/
[2] Itzik Ben-Gan et al., Inside Microsoft SQL Server 2008: T-SQL Programming, Chap. 12 Temporal Support in the Relational Model, MS Press, 2009
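The three steps of the workaround can be sketched in Python for a temporal difference. This is not the T-SQL code from the cited book. It is a minimal illustration that assumes integer time points and half-open intervals [start, end), and the function names unpack, pack, and difference_via_points are hypothetical.

```python
def unpack(intervals):
    """Step 1: turn half-open intervals [start, end) into a set of time points."""
    return {t for start, end in intervals for t in range(start, end)}

def pack(points):
    """Step 3: merge consecutive time points back into maximal intervals."""
    intervals = []
    for t in sorted(points):
        if intervals and intervals[-1][1] == t:
            intervals[-1][1] = t + 1      # extend the current interval
        else:
            intervals.append([t, t + 1])  # start a new interval
    return [tuple(i) for i in intervals]

def difference_via_points(r, s):
    """Step 2 is a plain set difference on the unpacked points."""
    return pack(unpack(r) - unpack(s))

# Months as integers 1..12: the period [1, 12) minus the period [2, 8)
r = [(1, 12)]
s = [(2, 8)]
print(difference_via_points(r, s))
print(len(unpack(r)))  # number of points materialized for a single interval
```

The sketch hints at both weaknesses named on the slide: every interval is expanded into one element per time point, so the cost grows with interval length and granularity, and packing merges adjacent points regardless of which original interval they came from.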
14. PostgreSQL
Jeff Davis - Temporal Postgres [3] (2007)
Interval datatype and UDF functions on intervals
Indexing via GiST index
PostgreSQL release 9.2 [4] (2012)
Range Types
Indexing via GiST or SP-GiST index
Constraints on Ranges, i.e., temporal key constraints
No support for time travel, no support for temporal queries
[3] http://temporal.projects.pgfoundry.org/
[4] http://www.postgresql.org/docs/9.2/static/rangetypes.html
15. IBM DB2
Temporal extension added as of IBM DB2 10 for z/OS [5] (2010)
Support for time travel:
SYSTEM_TIME AS OF
SYSTEM_TIME FROM...TO...
SYSTEM_TIME BETWEEN...AND...
Technology: Current and history tables
Support for time travel, no support for temporal queries
[5] https://www.ibm.com/developerworks/data/library/techarticle/dm-1204db2temporaldata/
16. Oracle DB
Temporal extension added via workspace manager as of Oracle DB 9i [6] (2001)
Support for time travel:
WHERE AS OF
SetValidTime
Technology: Flashback
Support for time travel, no support for temporal queries
[6] http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28396.pdf
17. Teradata
Temporal support added as of Teradata 13.10 [7] (2010)
Currently the DB with the most support for time
Time travel similar to IBM DB2 and Oracle DB
Implements ANSI Temporal SQL (1992-1999)
Technology: Translation of queries at SQL level (Al-Kateb et al., EDBT '13)
Support for time travel, partial support for temporal queries
[7] http://www.info.teradata.com/do_redirect.cfm?itemid=102320064
18. Our Solution - Splitting of Intervals
Same project data drawn on a timeline
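[Figure: timeline from Jan to Dec (axis t) showing the proj tuples (P1, M, 10k), (P2, PH, 7k), and (P3, CS, 5k) as intervals, annotated with the number of concurrent projects per segment: 1, 2, 3, 2]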
Intervals in input and output are not the same
Requires splitting of intervals
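A rough Python sketch of what splitting means for the proj data: every interval is cut at the start and end points of the other intervals, after which fragments can be compared with plain equality. The function name split_at_boundaries and the half-open reading [Start, End) are assumptions made for this illustration.

```python
from collections import Counter

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
IDX = {m: i for i, m in enumerate(MONTHS)}

# proj relation from the slides: (Name, Dept, Budg, Start, End)
proj = [
    ("P1", "M", "10k", "Jan", "Dec"),
    ("P2", "PH", "7k", "Feb", "Aug"),
    ("P3", "CS", "5k", "Jun", "Dec"),
]

def split_at_boundaries(rows):
    """Cut each tuple's interval at every start/end point (of any tuple)
    that lies strictly inside it."""
    points = sorted({IDX[r[3]] for r in rows} | {IDX[r[4]] for r in rows})
    fragments = []
    for name, dept, budg, s, e in rows:
        lo, hi = IDX[s], IDX[e]
        cuts = [lo] + [p for p in points if lo < p < hi] + [hi]
        for a, b in zip(cuts, cuts[1:]):
            fragments.append((name, dept, budg, MONTHS[a], MONTHS[b]))
    return fragments

fragments = split_at_boundaries(proj)
# After splitting, counting concurrent projects is a group-by on equal fragments.
counts = Counter((s, e) for *_, s, e in fragments)
for (s, e), n in sorted(counts.items(), key=lambda kv: IDX[kv[0][0]]):
    print(n, s, e)
```

The fragment counts come out as 1, 2, 3, 2 along the timeline, the same sequence as in the figure.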
19. Key Insight and Solution
Key Insight
Databases are not good at dealing with interval queries
Idea
Provide temporal primitives to split intervals
After splitting use traditional database operators with equality on
interval fragments
Solution
Two temporal primitives are required
Normalization N
Alignment φ
Reduction rules at algebraic level reduce temporal operations to
temporal primitives and traditional database operations
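To make the alignment primitive more concrete, here is a hedged Python sketch. The slide does not define φ, so the behaviour below is an assumption: for each tuple of r, alignment emits the intersections of its interval with the intervals of matching s tuples, plus the parts of its interval that no matching s tuple covers. The names align and temporal_join, the mgr relation, and the integer months are made up for the example, and the absorb step α from the reduction rules is omitted.

```python
def align(r, s, theta):
    """Assumed alignment: adjust each r-tuple's interval with respect to the
    s-tuples that satisfy theta. Tuples are (values, start, end), half-open."""
    out = set()
    for r_vals, r_start, r_end in r:
        overlaps = sorted(
            (max(r_start, s_start), min(r_end, s_end))
            for s_vals, s_start, s_end in s
            if theta(r_vals, s_vals) and max(r_start, s_start) < min(r_end, s_end)
        )
        out.update((r_vals, a, b) for a, b in overlaps)
        # Parts of [r_start, r_end) not covered by any matching s-tuple
        cursor = r_start
        for a, b in overlaps:
            if cursor < a:
                out.add((r_vals, cursor, a))
            cursor = max(cursor, b)
        if cursor < r_end:
            out.add((r_vals, cursor, r_end))
    return out

def temporal_join(r, s, theta):
    """Align both sides, then use an ordinary join with equality on the
    interval fragments (absorb step omitted in this sketch)."""
    left = align(r, s, theta)
    right = align(s, r, lambda s_vals, r_vals: theta(r_vals, s_vals))
    return sorted(
        (r_vals + s_vals, a, b)
        for r_vals, a, b in left
        for s_vals, c, d in right
        if theta(r_vals, s_vals) and (a, b) == (c, d)
    )

# Hypothetical inputs with months as integers (Jan = 1): managers per department and projects
mgr = [(("Ann", "M"), 1, 6), (("Joe", "M"), 6, 12)]
proj = [(("P1", "M"), 1, 12)]
same_dept = lambda m, p: m[1] == p[1]

for row in temporal_join(mgr, proj, same_dept):
    print(row)
```

With these inputs the join yields one row for Ann with P1 over [1, 6) and one for Joe with P1 over [6, 12), which corresponds to the two P1 rows of the manager table shown earlier.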
20. Reduction Rules
Blueprint for database programmers
Operator       Temporal Op.    Reduction (Primitive + Traditional Op.)
Selection      σ^T_θ(r)        = σ_θ(r)
Projection     π^T_B(r)        = π_{B,T}(N_{r.B=s.B}(r, r/s))
Aggregation    _B ϑ^T_F(r)     = _{B,T} ϑ_F(N_{r.B=s.B}(r, r/s))
Difference     r −^T s         = N_{r.A=s.A}(r, s) − N_{r.A=s.A}(s, r)
Union          r ∪^T s         = N_{r.A=s.A}(r, s) ∪ N_{r.A=s.A}(s, r)
Intersection   r ∩^T s         = N_{r.A=s.A}(r, s) ∩ N_{r.A=s.A}(s, r)
Cart. Prod.    r ×^T s         = α(φ(r, s) ⋈_{r.T=s.T} φ(s, r))
Inner Join     r ⋈^T_θ s       = α(φ_θ(r, s) ⋈_{θ∧r.T=s.T} φ_θ(s, r))
Left O. Join   r ⟕^T_θ s       = α(φ_θ(r, s) ⟕_{θ∧r.T=s.T} φ_θ(s, r))
Right O. Join  r ⟖^T_θ s       = α(φ_θ(r, s) ⟖_{θ∧r.T=s.T} φ_θ(s, r))
Full O. Join   r ⟗^T_θ s       = α(φ_θ(r, s) ⟗_{θ∧r.T=s.T} φ_θ(s, r))
Anti Join      r ▷^T_θ s       = φ_θ(r, s) ▷_{θ∧r.T=s.T} φ_θ(s, r)
21. PostgreSQL Implementation /1
PostgreSQL prototype with implemented primitives available
http://www.ifi.uzh.ch/dbtg/research/align.html
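[Figure: DBMS architecture [8]. An SQL query passes through Parser (60 kloc), Analyzer/Rewriter (20 kloc), Optimizer (50 kloc), and Executor (40 kloc), which sit on top of Files and Access Methods, Buffer Manager, and Disk Manager over the Data and Index Files, next to the Lock Manager and Recovery Manager. Additional numeric labels on the figure: 150, 450, 150, 400]
[8] Image: Raghu Ramakrishnan and Johannes Gehrke. Database Management Systems. McGraw-Hill 2003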
25. PostgreSQL Implementation /2
Implementation Approach:
Temporal primitives are implemented into query flow
Temporal primitives are nodes in query/plan/executor trees
Primitives themselves reuse traditional database operations
Only one new Executor function
Advantages:
Temporal primitives are optimized within the plan tree
Cost estimation
(Join) order
Selection push-down
Propagate orderings
Traditional database operations are optimized out of the box
Local potential for performance improvements (work in progress . . . )
30. SQL Example
Query: What is the number of concurrent projects per department?
SELECT Dept, COUNT(*)
FROM proj
GROUP BY Dept
Operator: Aggregation
Reduction: _B ϑ^T_F(r) = _{B,T} ϑ_F(N_{r.B=s.B}(r, r/s))
SELECT Dept, COUNT(*), Start, End
FROM (proj NORMALIZE proj USING (Dept)) pnorm
GROUP BY Dept, Start, End
Reduction rules are systematic and mechanical!
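The reduced query can be imitated outside the database with a short Python sketch. It assumes that `r NORMALIZE s USING (B)` splits each tuple of r at the start and end points of those tuples of s that agree on B, with half-open periods [Start, End). The function normalize is a made-up name, and project P4 is a hypothetical row added only so that one department has overlapping projects.

```python
from collections import Counter

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
IDX = {m: i for i, m in enumerate(MONTHS)}

proj = [
    {"Name": "P1", "Dept": "M", "Start": "Jan", "End": "Dec"},
    {"Name": "P2", "Dept": "PH", "Start": "Feb", "End": "Aug"},
    {"Name": "P3", "Dept": "CS", "Start": "Jun", "End": "Dec"},
    {"Name": "P4", "Dept": "M", "Start": "Mar", "End": "Sep"},  # hypothetical row
]

def normalize(r, s, using):
    """Assumed semantics of 'r NORMALIZE s USING (using)': split each r-tuple
    at the start/end points of s-tuples with equal `using` attributes."""
    out = []
    for rt in r:
        lo, hi = IDX[rt["Start"]], IDX[rt["End"]]
        points = {IDX[st[k]] for st in s
                  if all(st[a] == rt[a] for a in using)
                  for k in ("Start", "End")}
        cuts = [lo] + sorted(p for p in points if lo < p < hi) + [hi]
        for a, b in zip(cuts, cuts[1:]):
            out.append({**rt, "Start": MONTHS[a], "End": MONTHS[b]})
    return out

# FROM (proj NORMALIZE proj USING (Dept)) pnorm
pnorm = normalize(proj, proj, using=("Dept",))
# SELECT Dept, COUNT(*), Start, End ... GROUP BY Dept, Start, End
result = Counter((t["Dept"], t["Start"], t["End"]) for t in pnorm)
for (dept, start, end), n in sorted(result.items(),
                                    key=lambda kv: (kv[0][0], IDX[kv[0][1]])):
    print(dept, n, start, end)
```

After normalization the grouping needs nothing but equality on Dept, Start, and End, which is the point of the reduction: the temporal part is handled by the primitive and the rest is ordinary aggregation.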
33. Summary and Vision
Currently
PostgreSQL prototype with implemented primitives available
http://www.ifi.uzh.ch/dbtg/research/align.html
Supports all temporal queries
Evaluation shows good performance
Working on additional index structures
Vision
Integrate temporal primitives for temporal queries into the PostgreSQL
release
Thank you for your attention!