Oracle Database 12c introduced several new features for application developers including multi-tenant architecture, in-memory database option, and improved flashback data archive capabilities. Flashback data archive now supports complete schema evolution, user context tracking for auditing changes, and import/export of historical data. JSON support was also added in 12c, allowing storage and querying of JSON documents within the database.
5. License
• Flashback Data Archive in all database editions
6. Flashback
• Introduced in 9i
• Based on UNDO
• Initially only for recovery
• As of 11g – Total Recall option with
Flashback Data Archive
– Controlled history keeping per table
select ename
, job
, sal
from emp as of timestamp
(systimestamp - INTERVAL '1' DAY) yester_emp
7. Flashback
• Look back into history
– Query trends (version history)
– Difference reporting
– Restore Test Data Sets [to pre-test situation]
– Audit trails (Replace journaling tables)
• Require trick for transaction history: WHO?
• Also: when is the start of history? What went on before? What to do with
existing archives?
• By the way: Flashback Data Archive requires EE & Advanced
Compression database option
8. Total Recall
• Flashback Data Archive Improvements:
– Complete schema evolution support: all table definition, partitioning,
and space management DDLs are supported on FDA-enabled
tables.
– The metadata information for tracking transactions including the
user context is now tracked. The addition of user-context tracking
makes it easier to determine which user made which changes to a
table.
• This could mean that journaling tables are now officially deprecated
• Also given that the current contents of journaling tables can even be migrated to
Flashback Data Archive
9. Total Recall (2)
• Import and export of history
– Support for import and export using Data Pump for FDA-enabled
tables. Data Pump can now be used to export and import an FDA-enabled
base table along with its schema-evolution metadata and
historical row versions.
• User generated history
– Support for importing user-generated history has been added.
Customers who have been maintaining history using other
mechanisms, such as triggers, can now import that history into Total
Recall.
• Database Hardening
– Register "Application" (a group of tables) and enable/disable
flashback data archive for the application (also available: lock an
application – make all tables read only)
10. Generate History – Actions by SYS
create table oow.emp as select * from scott.emp;
grant execute on dbms_flashback_archive to oow;
grant execute on dbms_flashback to oow;
CREATE FLASHBACK ARCHIVE DEFAULT one_year TABLESPACE users
QUOTA 100M RETENTION 1 YEAR;
grant flashback archive on one_year to oow;
exec dbms_flashback_archive.create_temp_history_table('OOW', 'EMP');
-- Run this statement once per database instance.
-- It extends the mappings into the past so that old
-- history can be imported. Goes back to 01-JAN-88.
EXEC DBMS_FLASHBACK_ARCHIVE.extend_mappings();
---
-- after some history has been created:
EXEC DBMS_FLASHBACK_ARCHIVE.IMPORT_HISTORY('oow','EMP');
11. Generate History – Actions by Application
• Insert records describing each stage of history that has existed
– Including start and end time of historic state
insert into temp_history
(RID , STARTSCN , ENDSCN , XID, OPERATION
,EMPNO, ename, job, hiredate, sal, deptno )
values (NULL, timestamp_to_scn(to_date('01-04-2001', 'DD-MM-YYYY')),
timestamp_to_scn(to_date('01-07-2003', 'DD-MM-YYYY')), NULL, 'I'
,1567,'SELLERS','CLERK',to_date('01-04-2001','DD-MM-YYYY'),2200, 10);
insert into temp_history
(RID , STARTSCN , ENDSCN , XID, OPERATION
,EMPNO, ename, job, hiredate, sal, deptno)
values (NULL, timestamp_to_scn(to_date('01-07-2003', 'DD-MM-YYYY')),
timestamp_to_scn(to_date('01-10-2006', 'DD-MM-YYYY')), NULL, 'U'
,1567,'SELLERS','CLERK',to_date('01-04-2001','DD-MM-YYYY'),2200, 20);
…
12. Query the generated history
select ename
, job
from emp as of timestamp (sysdate - INTERVAL '10' YEAR)
minus
select ename
, job
from emp;
select ename
, job
from emp as of timestamp (systimestamp - INTERVAL '3' YEAR)
minus
select ename
, job
from emp
13. Ensure transaction context is recorded (and set)
exec dbms_flashback_archive.set_context_level(level=> 'ALL');
exec dbms_session.set_identifier('The Creepy User from Finance ');
update oow.emp
set sal = sal * 1.4
where ename = 'ALLEN'
/
commit;
exec dbms_session.set_identifier('Scary Janitor from the Annex');
update oow.emp
set sal = sal * 0.7
where ename = 'MILLER'
/
commit;
14. Audit the generated history
SELECT versions_xid
, versions_starttime
, empno, ename, sal new_sal
, s.client_identifier
FROM oow.emp VERSIONS BETWEEN TIMESTAMP minvalue
AND maxvalue
, sys.sys_fba_context_aud s
where versions_xid = s.xid
15. Alternative: retrieve context with dbms_flashback_archive.get_sys_context
SELECT versions_xid
, versions_starttime
, empno, ename, sal new_sal
, dbms_flashback_archive.get_sys_context
(versions_xid,'USERENV','CLIENT_IDENTIFIER') who
FROM emp VERSIONS BETWEEN TIMESTAMP minvalue
AND maxvalue
16. JavaScript Object Notation
Lightweight data-interchange format
Support in Oracle Database 12c – 12.1.0.2
22. JSON
• Lightweight, structured, fairly tightly coupled (low-bandwidth) interactions
between application [component]s
• Very popular in rich client web applications and mobile apps – usually as
the format used in REST services
• Oracle Database 12c is JSON aware
– Store documents in a column (VARCHAR2, CLOB, NCLOB, BLOB, RAW, BFILE)
and have them checked for JSON validity
alter table X
add CONSTRAINT ensure_json_chk CHECK (column_y IS JSON);
– Create Indexes on JSON contents
– Access JSON content directly from SQL queries
– Note: similar to but only a subset of XML support in Oracle Database
(as JSON in terms of functionality is a somewhat pale subset of XML)
23. JSON in the Oracle Database
• Why JSON in relational database?
– Consistency (integrity, transaction ACIDity) for storing JSON documents
– Quickly interpret data received (and stored?) in JSON format
– Leverage JSON as a convenient way to handle structured data as a string
– Easy data transfer over some protocols, including HTTP – especially when database
engages directly in HTTP interactions
25. Check for a valid JSON document and/or path result
• IS JSON can be used in the WHERE clause to filter on rows that contain a
valid JSON document
– It can be used in CHECK Constraints for that same reason
select *
from customers
where doc IS JSON
• JSON_EXISTS is typically used in the WHERE clause to filter on rows
whose JSON content satisfies some JSON path condition
select *
from customers
where json_exists(doc, '$.addresses.privateAddress')
26. JSON_VALUE to retrieve scalar value from JSON document
• JSON_VALUE is a SQL Function used to retrieve a scalar value from a
JSON document
– JSON_VALUE can be used like any other SQL Function in queries and DML – in
various clauses like SELECT, WHERE, GROUP BY, etc.
select json_value
( '{"matches":
[ {"matchLineUp": "NED-ESP"
, "score": "5-2","matchDate":"13-06-2014"}
, {"matchLineUp": "SWI-FRA"
, "score": "2-5","matchDate":"20-06-2014"}
, {"matchLineUp": "GER-BRA"
, "score": "1-7","matchDate":"08-07-2014"}
]
}'
, '$.matches[1].score') as "2nd_match_score"
from dual
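The array index in the path is zero-based, which is why index 1 addresses the second match. As a minimal sketch of that lookup (plain Python, not Oracle code; the helper name second_match_score is hypothetical), the same path can be resolved by hand with the standard json module:

```python
import json

# Same document as in the JSON_VALUE example above.
DOC = """{"matches":
  [ {"matchLineUp": "NED-ESP", "score": "5-2", "matchDate": "13-06-2014"}
  , {"matchLineUp": "SWI-FRA", "score": "2-5", "matchDate": "20-06-2014"}
  , {"matchLineUp": "GER-BRA", "score": "1-7", "matchDate": "08-07-2014"}
  ]
}"""


def second_match_score(doc: str) -> str:
    """Hand-written walk of the path $.matches[1].score (index 1 = second element)."""
    return json.loads(doc)["matches"][1]["score"]


print(second_match_score(DOC))
```

The sketch only mirrors the path navigation; JSON_VALUE's error handling and return-type clauses are not modelled.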
27. JSON_QUERY to retrieve JSON snippets from a JSON document
• JSON_QUERY is a SQL Function used to retrieve a JSON snippet from a
JSON document - the result is a string that contains valid JSON
– Use WITH WRAPPER to wrap multiple results in a JSON array so that a valid document is returned
select json_query
( '{"matches":
[ {"matchLineUp": "NED-ESP"
, "score": "5-2","matchDate":"13-06-2014"}
, {"matchLineUp": "SWI-FRA"
, "score": "2-5","matchDate":"20-06-2014"}
, {"matchLineUp": "GER-BRA"
, "score": "1-7","matchDate":"08-07-2014"}
]
}'
, '$.matches[*].score' WITH WRAPPER) as scores
from dual
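The path $.matches[*].score selects several values at once, so the result is only valid JSON once those values are wrapped in an array. A small Python sketch of that idea (not Oracle code; the function name scores_with_wrapper and the trimmed document are assumptions made for illustration):

```python
import json

# Trimmed copy of the document used above: only the score fields are kept.
DOC = '{"matches": [{"score": "5-2"}, {"score": "2-5"}, {"score": "1-7"}]}'


def scores_with_wrapper(doc: str) -> str:
    """Collect every $.matches[*].score and wrap the values in one JSON array."""
    scores = [match["score"] for match in json.loads(doc)["matches"]]
    # Compact separators keep the output free of extra whitespace.
    return json.dumps(scores, separators=(",", ":"))


print(scores_with_wrapper(DOC))
```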
28. JSON_TABLE to expose records from a JSON document relationally
• JSON_TABLE is used in the FROM clause of SQL statements to project
data from a JSON document as a virtual relational view
– JSON_TABLE is very similar to XMLTABLE
with match_results as
( select '...' json_doc from dual)
select lineUp, score
, to_date(matchDate, 'DD-MM-YYYY') matchDate
from match_results
, json_table( json_doc, '$.matches[*]'
COLUMNS
( lineUp varchar2(20) PATH '$.matchLineUp'
, score varchar2(20) PATH '$.score'
, matchDate varchar2(20) PATH '$.matchDate'
)
)
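To illustrate what "projecting a document as rows" means, the following Python sketch (an illustration only, not how the database implements JSON_TABLE) turns each element of $.matches[*] into one row with the three columns named in the COLUMNS clause. The function name match_rows is hypothetical.

```python
import json
from datetime import date, datetime

DOC = """{"matches":
  [ {"matchLineUp": "NED-ESP", "score": "5-2", "matchDate": "13-06-2014"}
  , {"matchLineUp": "SWI-FRA", "score": "2-5", "matchDate": "20-06-2014"}
  , {"matchLineUp": "GER-BRA", "score": "1-7", "matchDate": "08-07-2014"}
  ]
}"""


def match_rows(doc: str):
    """Return one (lineUp, score, matchDate) tuple per element of $.matches[*]."""
    rows = []
    for match in json.loads(doc)["matches"]:
        # Same conversion as to_date(matchDate, 'DD-MM-YYYY') in the query.
        match_date = datetime.strptime(match["matchDate"], "%d-%m-%Y").date()
        rows.append((match["matchLineUp"], match["score"], match_date))
    return rows


for row in match_rows(DOC):
    print(row)
```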
29. 12.1.0.2 – JSON: What it does not do
• Support for variable strings as JSON_PATH in JSON-functions
• Deal with JSON in PL/SQL
– JSON_VALUE and JSON_QUERY cannot be used directly from PL/SQL
• Construct JSON documents
– Not supported: conversion from ADT (to XMLType) to JSON type and vice versa
– Not supported: a SQL/JSON syntax similar to SQL/XML for querying together a
JSON document
– Not supported: facilities that inject JSON into RESTful Services implemented through
the PL/SQL Embedded Gateway
[Figure: ADT (types, objects, nested tables) and XMLType]
30. The PL/JSON library
• This library has been around for several years – and is still pretty much
relevant
• Especially useful
– For composing JSON documents
– And for converting back and forth from and to JSON to XMLType and ADT/UDT
32. In-line PL/SQL Functions and
procedures
WITH
procedure increment( operand in out number
, incsize in number)
is
begin
operand:= operand + incsize;
end;
FUNCTION inc(value number) RETURN number IS
l_value number(10):= value;
BEGIN
increment(l_value, 100);
RETURN l_value;
end;
SELECT inc(sal)
from emp
• Procedures are also allowed in-line
• In-Line Functions and Procedures can invoke each other
33. In-line PL/SQL Functions and
procedures
• In-Line Functions and Procedures can invoke each other
– And themselves (recursively)
WITH
FUNCTION inc(value number)
RETURN number IS
BEGIN
if value < 6000
then
return inc(value+100);
else
return value + 100;
END if;
end;
SELECT inc(sal)
from emp
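The recursion keeps adding 100 until the value reaches at least 6000 and then adds a final 100. A direct Python transcription of the inline function (a sketch for tracing the logic, not PL/SQL) makes that easy to check by hand:

```python
def inc(value: int) -> int:
    """Mirror of the inline function above: recurse in steps of 100 below 6000."""
    if value < 6000:
        return inc(value + 100)
    return value + 100


# Salaries that are multiples of 100 all converge on the same result.
print(inc(800), inc(5000), inc(2975))
```

Every salary below 6000 ends up between 6100 and 6199, so the example demonstrates recursion rather than a sensible pay rise.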
34. Dynamic (PL/)SQL is allowed
inside inline functions
• EXECUTE IMMEDIATE can be used inside an inline PL/SQL function to
dynamically construct and execute SQL and PL/SQL
WITH
FUNCTION EMP_ENRICHER(operand varchar2) RETURN varchar2 IS
sql_stmt varchar2(500);
job varchar2(500);
BEGIN
sql_stmt := 'SELECT job FROM emp WHERE ename = :param';
EXECUTE IMMEDIATE sql_stmt INTO job USING operand;
RETURN ' has job '||job;
END;
SELECT ename || EMP_ENRICHER(ename)
from emp
Note: do not try this at home!
It is a horrible query!
(looks too much like POST_QUERY for comfort)
In-Line PL/SQL is not an excuse for lousy SQL
35. Combining in-line Views and
PL/SQL Functions & Procedures
WITH
procedure camel(p_string in out varchar2)
is
begin
p_string:= initcap(p_string);
end;
function obfuscate(p_string in varchar2)
return varchar2
is
l_string varchar2(100);
begin
l_string:= translate(upper(p_string), 'AEUIO','OIEUA');
camel(l_string);
return l_string;
end;
my_view as (
select obfuscate('Goedemorgen')
from dual
)
select *
from my_view
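As a quick check of what the combination of translate, upper and initcap produces, here is a Python sketch of the same transformation. It assumes a single plain word as input; str.title() is used as a stand-in for initcap, and the two can differ on input containing apostrophes or digits.

```python
# Same character mapping as translate(..., 'AEUIO', 'OIEUA').
VOWEL_SWAP = str.maketrans("AEUIO", "OIEUA")


def obfuscate(text: str) -> str:
    """Upper-case the text, swap the vowels, then capitalise like initcap."""
    return text.upper().translate(VOWEL_SWAP).title()


print(obfuscate("Goedemorgen"))
```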
36. PL/SQL Functions That Run
Faster in SQL
• As of Oracle Database Release 12c, two kinds of PL/SQL functions
might run faster in SQL:
– PL/SQL functions that are defined in the WITH clauses of SQL SELECT
statements, described in Oracle Database SQL Language Reference
– PL/SQL functions that are defined with the "UDF Pragma"
• Pragma UDF means: compile in the ‘SQL way’ so as to eliminate the SQL to
PL/SQL context switch
FUNCTION inc(string VARCHAR2)
RETURN VARCHAR2 IS
PRAGMA UDF;
value number(10):= to_number(string);
BEGIN
if value < 6000
then
return inc(value+100);
else
return to_char(value + 100);
end if;
end;
37. 38
• Maximum length for VARCHAR2 is
now 32KB (up from 4KB)
• Invisible Columns
• One unified audit trail
• PL/SQL DBMS_UTILITY.EXPAND_SQL_TEXT can be used to uncover the real SQL
executed for a given query
– Expressed only in the underlying base tables, including VPD policies
• New PL/SQL Package UTL_CALL_STACK provides API for inspecting the
PL/SQL Callstack
– Programmatic interpretation, not pretty like DBMS_UTILITY.FORMAT_CALL_STACK
• Export View as Table with Data Pump – fine grained projection of columns and
rows that will be part of the Dump and subsequent Import
• Creation of multiple indexes on same set of columns is allowed
– Although only one can be live at any one time
• Cross PDB queries
38. DEFAULT
• Default applied (also) when NULL was explicitly specified
alter table emp
modify (sal number(10,2)
DEFAULT ON NULL 1000
)
• Default based on Sequence
alter table emp
modify (empno number(5) NOT NULL
DEFAULT ON NULL EMPNO_SEQ.NEXTVAL
)
• Identity Column that is automatically assigned generated sequence
number value
• Meta Data Only Defaults
– Data applies to potentially many records but hardly takes up any space – only some
meta-data are required to declaratively describe the data
39. The Top-3 Earning Employees
• What can you say about the result of this query with respect to the
question: "Who are our top three earning employees?"
A. Correct Answer
B. Sometimes correct
C. Correct if there are never duplicate
salaries
D. Not Correct
41. TOP-N Queries in 12c
• Last part of a query to be evaluated – to fetch only selected rows from
the result set:
select *
from emp
order
by sal desc
FETCH FIRST 3 ROWS ONLY;
– To select the next set of rows:
select *
from emp
order
by sal desc
OFFSET 3 FETCH NEXT 4 ROWS ONLY;
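The row-limiting clause is applied after the ORDER BY, which matters when salaries tie at the cut-off. The Python sketch below models that evaluation order on a hypothetical five-row sample (the data and the function name fetch are assumptions). It also models the WITH TIES variant of the clause, which is not shown on the slide. Python's sort is stable, whereas the database gives no guarantee about which of two tied rows is returned.

```python
from itertools import takewhile

# Hypothetical sample: JONES and BLAKE tie for third place.
EMP = [("KING", 5000), ("SCOTT", 3000), ("JONES", 2975),
       ("BLAKE", 2975), ("CLARK", 2450)]


def fetch(rows, offset=0, count=None, with_ties=False):
    """Sort by salary descending, then apply OFFSET / FETCH to the sorted rows."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    if count is None:
        return ordered[offset:]
    end = offset + count
    page = ordered[offset:end]
    if with_ties and page:
        last_sal = page[-1][1]
        # Keep following rows that share the salary of the last row fetched.
        page += list(takewhile(lambda row: row[1] == last_sal, ordered[end:]))
    return page


print(fetch(EMP, count=3))                  # FETCH FIRST 3 ROWS ONLY
print(fetch(EMP, count=3, with_ties=True))  # FETCH FIRST 3 ROWS WITH TIES
print(fetch(EMP, offset=3, count=4))        # OFFSET 3 FETCH NEXT 4 ROWS ONLY
```

With a tie at the boundary, ROWS ONLY silently drops one of the tied employees, which is the point of the quiz on the previous slide.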
42. Apply for joining
• APPLY is used to join with a Collection
SELECT *
FROM DEPT d
CROSS APPLY
employees_in_department(deptno) staff
• The function employees_in_department
returns a collection (TABLE OF
VARCHAR2 in this case)
• The function takes the DEPTNO value
from the DEPT records as input
• Only when the returned collection is not
empty will the DEPT record be produced
by this join
• Use OUTER APPLY to get a result row
even for an empty collection
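The difference between CROSS APPLY and OUTER APPLY can be sketched in a few lines of Python. The department and staff data are hypothetical, and employees_in_department is a stand-in for the collection-returning function mentioned above.

```python
# Hypothetical data: department 40 has no staff.
DEPT = [(10, "ACCOUNTING"), (20, "RESEARCH"), (40, "OPERATIONS")]
STAFF = {10: ["CLARK", "KING"], 20: ["SMITH"]}


def employees_in_department(deptno):
    """Stand-in for the collection-returning function of the slide."""
    return STAFF.get(deptno, [])


def cross_apply(depts, fn):
    """One row per (dept, element); depts with an empty collection vanish."""
    return [(dept, name) for dept in depts for name in fn(dept[0])]


def outer_apply(depts, fn):
    """Like cross_apply, but an empty collection still yields one row with None."""
    rows = []
    for dept in depts:
        names = fn(dept[0]) or [None]
        rows.extend((dept, name) for name in names)
    return rows


print(cross_apply(DEPT, employees_in_department))
print(outer_apply(DEPT, employees_in_department))
```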
44. Data Redaction
• At runtime, you can optionally have the query results modified to
reset/scramble/randomize sensitive data
– Through ‘data redaction’ policies associated with tables and views and applied at
query time
[Diagram: SQL passes through the SQL engine; redaction policies are applied to the results]
• Because the data is masked in real-time, Data Redaction is well suited to
environments in which data is constantly changing.
• You can create the Data Redaction policies in one central location and
easily manage them from there.
45. My first Data redaction policy
• Access to DBMS_REDACT package
grant execute on dbms_redact to scott;
• Create Data Redaction Policy for SAL column in EMP table – hide salaries
from view
BEGIN
DBMS_REDACT.ADD_POLICY(
object_schema => 'scott',
object_name => 'emp',
column_name => 'sal',
policy_name => 'hide_salary',
function_type => DBMS_REDACT.FULL,
expression => '1=1' );
END;
• Find that querying EMP has changed forever…
– Note: the expression can be used to dynamically decide whether or not to apply the policy
46. Querying EMP with DATA
REDACTION in place
• Note: drop Redaction Policy
DBMS_REDACT.DROP_POLICY
( object_schema => 'scott'
, object_name => 'emp'
, policy_name => 'hide_salary'
);
48. SQL Statement Preprocessor
• A mechanism to allow the text of a SQL statement, submitted from a client
program (e.g. via ODBC or JDBC), to be translated by user-supplied code
before it is submitted to the Oracle Database SQL compiler.
– Additionally, this feature can satisfy any other use case where it is expedient to
intervene between the SQL statement that the client submits and what is actually
executed
– Some associations: VPD policies, Data Redaction, MV query rewrite, Advanced
Query Rewrite, Temporal Validity (time slice)
[Diagram: Application → SQL preprocessor → SQL engine]
49. SQL Translation – set up
AS SYS:
grant create sql translation profile to oow;
grant alter session to oow;
AS Application Owner:
-- Create a Translation Profile
exec dbms_sql_translator.create_profile('HR_PROFILE');
BEGIN
DBMS_SQL_TRANSLATOR.REGISTER_SQL_TRANSLATION(
profile_name => 'HR_PROFILE',
sql_text => 'select ename, job, hiredate
from emp',
translated_text => 'select initcap(ename) as ename
, job, hiredate
from emp where job<>''MANAGER'' '
);
END;
select * FROM USER_SQL_TRANSLATION_PROFILES;
select * from USER_SQL_TRANSLATIONS;
50. SQL Translation – in action
-- to enable the profile (usually in logon trigger)
alter session set sql_translation_profile = HR_PROFILE;
-- to pretend we are a foreign application, subject to
-- SQL Translation
alter session set events = '10601 trace name context forever, level 32';
-- execute query that is to be translated
select ename, job, hiredate
from emp;
-- results are produced as if the translated text had been submitted
51. SQL Translation
• Support for bind parameters
• Support for rewriting PL/SQL calls
• Translation of Error Messages (ORA-xxxx to SYB-yyyy or YOURAPP-zzz)
• Out of the box translators for Sybase, SQL Server and some DB2
• Optionally: register a custom translator package
– PROCEDURE TRANSLATE_SQL( sql_text IN CLOB, translated_text OUT CLOB);
52. SQL Statement Preprocessor
• The translation code is named and is installed in the database using a
PL/SQL API. It can be implemented programmatically, or by look-up, or by
a suitable mixture of these.
• The mechanism also allows Oracle error codes and American National
Standards Institute (ANSI) SQLSTATES to be translated by user-supplied
code.
• The motivating use case is to allow extant client-side application code,
written for a different vendor's database (and therefore for a SQL dialect
other than Oracle's), to run unchanged against an Oracle Database by
emulating the syntax and semantics of the other SQL dialect thereby
greatly reducing the cost of migration.
• Additionally, this feature can satisfy any other use case where it is
expedient to intervene between the SQL statement that the client
submits and what is actually executed.
• See: Oracle Database SQL Translation Installation, Configuration, and
User's Guide for details
53. Looking into the future…
[Table OUR_PRODUCTS: columns NAME, PRICE]
select name, price
from our_products
54. Looking further into the future…
[Table OUR_PRODUCTS: columns NAME, PRICE]
begin
DBMS_FLASHBACK_ARCHIVE.ENABLE_AT_VALID_TIME (
level => 'ASOF'
, query_time => TO_TIMESTAMP('01-10-2018', 'DD-MM-YYYY')
);
end;
select name, price
from our_products
55. Current situation …
[Table OUR_PRODUCTS: columns NAME, PRICE]
begin
DBMS_FLASHBACK_ARCHIVE.ENABLE_AT_VALID_TIME (
level => 'CURRENT'
);
end;
select name, price
from our_products
56. All data in the table (the default setting)
[Table OUR_PRODUCTS: columns NAME, PRICE]
begin
DBMS_FLASHBACK_ARCHIVE.ENABLE_AT_VALID_TIME (
level => 'ALL'
);
end;
select name, price
from our_products
57. All data in the table (the default setting)
[Table OUR_PRODUCTS: columns NAME, PRICE, START_DATE, END_DATE]
begin
DBMS_FLASHBACK_ARCHIVE.ENABLE_AT_VALID_TIME (
level => 'ALL'
);
end;
select name, price, start_date, end_date
from our_products
order
by start_date
58. Make the database aware of the time
based business validity of records
• Add timestamp columns indicating start and end of valid time for a record
• Specify a PERIOD for the table
create table our_products
( name varchar2(100)
, price number(7,2)
, start_date timestamp
, end_date timestamp
, PERIOD FOR offer_time (start_date, end_date)
);
• Note:
– A table can have multiple sets of columns, describing multiple types of validity
– Beyond 12.1.0.2 many Temporal Validity enhancements are expected:
• Unique Constraints, Foreign Key references, (auto) Join conditions, gap and overlap checks, aggregation
59. Valid time aware
flashback queries
• Select all product prices on offer at a certain moment in time
SELECT *
FROM OUR_PRODUCTS AS OF PERIOD FOR offer_time
TO_TIMESTAMP('01-10-2014','DD-MM-YYYY')
• Perform all queries for records that are valid at a certain point in time –
past or future
EXECUTE DBMS_FLASHBACK_ARCHIVE.enable_at_valid_time
( 'ASOF'
, TO_TIMESTAMP('01-05-2016','DD-MM-YYYY')
);
• Return all records currently (session time) valid
EXECUTE DBMS_FLASHBACK_ARCHIVE.enable_at_valid_time('CURRENT');
• Return all records (default)
EXECUTE DBMS_FLASHBACK_ARCHIVE.enable_at_valid_time('ALL');
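A valid-time filter boils down to a range check on the two period columns. The Python sketch below shows that check on hypothetical rows. It assumes that the start of the period is inclusive, the end is exclusive, and a NULL (None) bound means "unbounded"; the names Product and valid_at are illustrative only.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class Product(NamedTuple):
    name: str
    price: float
    start_date: Optional[datetime]  # None: valid since the beginning of time
    end_date: Optional[datetime]    # None: valid until further notice


# Hypothetical contents of OUR_PRODUCTS.
OUR_PRODUCTS = [
    Product("Widget", 10.0, datetime(2014, 1, 1), datetime(2015, 1, 1)),
    Product("Widget", 12.0, datetime(2015, 1, 1), None),
    Product("Gadget", 5.0, None, datetime(2014, 6, 1)),
]


def valid_at(rows, moment):
    """Rows whose period contains moment (start inclusive, end exclusive)."""
    return [row for row in rows
            if (row.start_date is None or row.start_date <= moment)
            and (row.end_date is None or moment < row.end_date)]


print(valid_at(OUR_PRODUCTS, datetime(2014, 10, 1)))  # a moment in the past
print(valid_at(OUR_PRODUCTS, datetime(2016, 5, 1)))   # a moment in the future
```

In this model the 'ASOF' level corresponds to calling valid_at with the given timestamp, 'CURRENT' to calling it with the session time, and 'ALL' to applying no filter at all.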
61. White List
• A white list of allowed invokers can be defined for a PL/SQL unit
– supports the robust implementation of a module, consisting of a main unit and
helper units, by allowing the helper units to be inaccessible from anywhere
except the unit they are intended to help.
62. accessible by clause
package Helper authid Definer accessible by (Good_Guy)
is
procedure p;
end Helper;
package body Good_Guy is
procedure p is
begin
Helper.p();
...
end p;
end Good_Guy;
package body Bad_Guy is
procedure p is
begin
Helper.p();
...
end p;
end Bad_Guy;
-- Bad_Guy is not on Helper's white list, so compiling it fails with:
PLS-00904: insufficient privilege to access object HELPER
63. (More) Security related features
• Attach database roles to program units: functions, procedures,
packages, and types.
– The role then becomes enabled during execution of the program unit (but not during
compilation of the program unit).
• A new built-in namespace, SYS_SESSION_ROLES, allows you to
determine if a specified role is enabled for the querying user
• View is either
– BEQUEATH DEFINER (the default), which behaves like a Definer’s Rights unit
(functions in the view are executed using the view owner’s rights)
– or BEQUEATH CURRENT_USER, which behaves somewhat like an invoker’s rights
unit (functions in the view are executed using the current user’s rights)
• An Invoker's Rights Function Can Be Result-Cached
• READ privilege to allow SELECT but no SELECT FOR UPDATE
• Invoker’s rights procedure calls only can run with the privileges of the
invoker if the procedure’s owner has the INHERIT PRIVILEGES privilege
on the invoker (do not stealthily use invoker’s privileges as owner)
– or if the procedure’s owner has the INHERIT ANY PRIVILEGES privilege
64. Inherit or not in Invoker rights
program units
• When a user runs an invoker's rights procedure (or program unit), it runs
with the privileges of the invoking user.
• As the procedure runs, the procedure’s owner temporarily has access to
the invoking user's privileges.
• [If the procedure owner has fewer privileges than an invoking user,] the
procedure owner could use the invoking user’s privileges to perform
operations
[Diagram: an invoker calls an invoker's rights procedure owned by the procedure owner; tables shown: Special_Table, Tap_Table]
66. Raw Data Refinement based on pattern matching
[Figure: raw data rows (such as 14,0 and 16,1) are fed into Processing, which yields:]
• Information
• Conclusion
• Alert
• Recommendation
• Action
67. Raw, fine grained Tennis Results: rally points
Challenge:
• Derive the match winner and final
score (per set and for the entire
match) from this raw data
Match Id, Player [who scored]
14,0
16,1
14,1
16,1
16,0
13,1
14,0
68. Using MATCH_RECOGNIZE to process data looking for patterns
• MATCH_RECOGNIZE will analyze, filter, aggregate and reformat the data
– based on patterns matched across consecutive rows in the source
row set
with rallypoints as
( select column_value player , rownum seq
from table(number_tbl(1,0,0,1,0,0,1,1,1,0,1,1,1,0,0,0,0))
)
SELECT winner, gameNo
FROM rallypoints
MATCH_RECOGNIZE
(
...
) MR
[Diagram: source row set → MATCH_RECOGNIZE → result set fed to SELECT]
69. Using MATCH_RECOGNIZE to process data looking for patterns
with rallypoints as
( select column_value player , rownum seq
from table(number_tbl(1,0,0,1,0,0,1,1,1,0,1,1,1,0,0,0,0))
)
SELECT winner, gameNo
FROM rallypoints
MATCH_RECOGNIZE
(
ORDER BY seq
MEASURES C.seq AS seq, C.player as winner
, MATCH_NUMBER() AS gameNo
ONE ROW PER MATCH
PATTERN (A+? C)
DEFINE
C as (case C.player
when 1 then sum(A.player) else (sum(abs((A.player-1))))
end >= 3
and
case C.player
when 1 then sum(A.player*2-1) else (sum(1-A.player*2))
end >= 1
)
) MR
- The first player to have
won at least 4 points
- and to have won at least two
points more than his
opponent
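The game rule encoded in the DEFINE clause can be restated procedurally. The Python sketch below is an illustration of the rule, not a translation of the MATCH_RECOGNIZE semantics. It walks the same list of rally points and closes a game as soon as the scoring player has at least four points and leads by at least two; the function name game_winners is hypothetical.

```python
RALLY_POINTS = [1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0]


def game_winners(rally_points):
    """Return (game_no, winner, seq_of_deciding_point) for each completed game."""
    games = []
    score = [0, 0]  # points of player 0 and player 1 in the current game
    for seq, player in enumerate(rally_points, start=1):
        score[player] += 1
        lead = score[player] - score[1 - player]
        if score[player] >= 4 and lead >= 2:
            games.append((len(games) + 1, player, seq))
            score = [0, 0]  # the next rally point starts a new game
    return games


for game in game_winners(RALLY_POINTS):
    print(game)
```

For the seventeen rally points of the slide this yields three games, decided at points 6, 11 and 17.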
71. Using MATCH_RECOGNIZE to process data looking for patterns
with rallypoints as
( select column_value player , rownum seq
from table(number_tbl(1,0,0,1,0,0,1,1,1,0,1,1,1,0,0,0,0, …))
)
, gamepoints as
(SELECT winner, gameNo
FROM rallypoints
MATCH_RECOGNIZE
() MR
)
, setpoints as
(SELECT winner, gameNo
FROM gamepoints
MATCH_RECOGNIZE
() MR
)
, matchpoints as
(SELECT winner, gameNo
FROM setpoints
MATCH_RECOGNIZE
() MR
)
select *
from matchpoints
72. Who is afraid of Red, Yellow
and blue
• Table Events
– Column Seq number(5)
– Column Payload varchar2(200)
73. Solution using Lead
• With LEAD it is easy to compare a row with its successor(s)
– As long as the pattern is fixed, LEAD will suffice
with look_ahead_events as
( SELECT e.*
, lead(payload) over (order by seq) next_color
, lead(payload,2) over (order by seq) second_next_color
FROM events e
)
select seq
from look_ahead_events
where payload ='red'
and next_color ='yellow'
and second_next_color='blue'
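The look-ahead idea behind LEAD can be sketched in Python by pairing each row with its next two successors. The event data below is hypothetical and the function name red_yellow_blue_starts is an assumption for illustration.

```python
# Hypothetical contents of the EVENTS table as (seq, payload) pairs.
EVENTS = [(1, "red"), (2, "yellow"), (3, "blue"), (4, "red"),
          (5, "blue"), (6, "red"), (7, "yellow"), (8, "blue")]


def red_yellow_blue_starts(events):
    """Return the seq of every row that starts a red, yellow, blue run."""
    ordered = sorted(events)  # order by seq
    starts = []
    # zip pairs each row with its first and second successor, like LEAD(…,1) and LEAD(…,2).
    for (seq, color), (_, nxt), (_, nxt2) in zip(ordered, ordered[1:], ordered[2:]):
        if (color, nxt, nxt2) == ("red", "yellow", "blue"):
            starts.append(seq)
    return starts


print(red_yellow_blue_starts(EVENTS))
```

As with LEAD, this only works because the pattern has a fixed length of three rows.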
74. Find the pattern red, yellow and
blue
• Using the new 12c Match Recognize operator for finding patterns in
relational data
SELECT *
FROM events
MATCH_RECOGNIZE
(
ORDER BY seq
MEASURES RED.seq AS redseq
, MATCH_NUMBER() AS match_num
ALL ROWS PER MATCH
PATTERN (RED YELLOW BLUE)
DEFINE
RED AS RED.payload ='red',
YELLOW AS YELLOW.payload ='yellow',
BLUE AS BLUE.payload ='blue'
) MR
ORDER
BY MR.redseq
, MR.seq;
75. Match_recognize for finding
patterns in relational data
• The expression MATCH_RECOGNIZE provides native SQL support to
find patterns in sequences of rows
[Diagram: Table Source & Where → Match_Recognize (Process and Filter) → Select & Order By]
• Match_recognize returns Measures for selected (pattern matched) rows
– Similar to MODEL clause
• Match Conditions are expressed in columns from the Table Source,
aggregate functions and pattern functions FIRST, PREV, NEXT, LAST
• Patterns are regular expressions using match conditions to express a
special sequence of rows satisfying the conditions
76. Did we ever consecutively hire
three employees in the same job?
• Find a string of three subsequent hires where each hire has the same job
• Order by hiredate, pattern is two records that each have the same job as
their predecessor
SELECT *
FROM EMP
MATCH_RECOGNIZE
(
ORDER BY hiredate
MEASURES SAME_JOB.hiredate AS hireday
, MATCH_NUMBER() AS match_num
ALL ROWS PER MATCH
PATTERN (SAME_JOB{3})
DEFINE
SAME_JOB AS SAME_JOB.job = FIRST(SAME_JOB.job)
) MR
77. Pattern clause is a regular
expression
• Supported operators for the pattern clause include:
– * for 0 or more iterations
– + for 1 or more iterations
– ? for 0 or 1 iterations
– { n } for exactly n iterations (n > 0)
– { n, } for n or more iterations (n >= 0)
– { n, m } for between n and m (inclusive) iterations (0 <= n <= m, 0 < m)
– { , m } for between 0 and m (inclusive) iterations (m > 0)
– reluctant qualifiers - *?, +?, ??, {n}?, {n,}?, { n, m }?, {,m}?
– | for alternation (OR)
– grouping using () parentheses
– exclusion using {- and -}
– empty pattern using ()
– ^ and $ for start and end of a partition
78. Find the longest sequence of
related observations
• Records are ordered
• Each record is qualified: assigned
to a certain category
• Examples:
– Voting records
– Ball possession in football
– Days with or without rain
– Passing vehicles (make and model
or category)
– DNA records
• The challenge: find the longest string
of consecutive observations in
the same category
79. Find the longest sequence of
related observations
SELECT section_category
, section_start
, rows_in_section
FROM observations
MATCH_RECOGNIZE
(
ORDER BY seq
MEASURES SAME_CATEGORY.category as section_category
, FIRST(SAME_CATEGORY.seq) as section_start
, COUNT(*) as rows_in_section
ONE ROW PER MATCH
PATTERN (SAME_CATEGORY* DIFFERENT_CATEGORY) -- as many times as possible
DEFINE
SAME_CATEGORY AS SAME_CATEGORY.category = FIRST(SAME_CATEGORY.category)
, DIFFERENT_CATEGORY AS DIFFERENT_CATEGORY.category !=
NEXT(DIFFERENT_CATEGORY.category)
) MR
order
by rows_in_section desc
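For comparison, the same question (which run of consecutive observations in one category is the longest?) can be answered procedurally. The Python sketch below uses itertools.groupby on a hypothetical series of category codes; it illustrates the problem and does not reproduce the MATCH_RECOGNIZE matching rules. The function name longest_section is an assumption.

```python
from itertools import groupby


def longest_section(observations):
    """Return (category, start_seq, length) of the longest run; seq is 1-based."""
    best = None
    seq = 1
    for category, run in groupby(observations):
        length = len(list(run))
        if best is None or length > best[2]:
            best = (category, seq, length)
        seq += length
    return best


# Hypothetical observations, already ordered by seq.
print(longest_section("AABBBBACCC"))
```

When two runs have the same length this sketch keeps the earlier one.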
80. Suppose we allow a single
interruption of a sequence
• One record with a different category
will not end the sequence – it might after
all be a fluke or an incident
• Rewrite the pattern match
to also accept one entry
with a different category
ONE ROW PER MATCH
AFTER MATCH SKIP TO NEXT ROW
-- a next row in the current match may be start of a next string
PATTERN (SAME_CATEGORY* DIFFERENT_CATEGORY{0,1} SAME_CATEGORY* )
DEFINE
SAME_CATEGORY AS SAME_CATEGORY.category = FIRST(SAME_CATEGORY.category)
, DIFFERENT_CATEGORY AS DIFFERENT_CATEGORY.category !=
SAME_CATEGORY.category
81. Find sequence (with one accepted
interruption) from all records
SELECT substr(section_category,1,1) cat
, section_start
, seq
, rows_in_section
FROM observations
MATCH_RECOGNIZE
( ORDER BY seq
MEASURES SAME_CATEGORY.category as section_category
, FIRST(SAME_CATEGORY.seq) as section_start
, seq as seq
, COUNT(*) as rows_in_section
ONE ROW PER MATCH
AFTER MATCH SKIP TO NEXT ROW -- a next row in the current match may be
-- start of a next string
PATTERN (SAME_CATEGORY* DIFFERENT_CATEGORY{0,1} SAME_CATEGORY* )
DEFINE
SAME_CATEGORY AS SAME_CATEGORY.category = FIRST(SAME_CATEGORY.category)
, DIFFERENT_CATEGORY AS DIFFERENT_CATEGORY.category !=
SAME_CATEGORY.category
) MR
order
by rows_in_section desc
82. Suspicious transactions
• Find occurrences of three or more money transfers
(> 10k) within 24 hours – not necessarily consecutive
[Data model: Account (Account_Number, Holder_Name); Transfer (Timestamp, Amount, To_Account)]
83. Suspicious transactions
select *
from transfers t
MATCH_RECOGNIZE
(
PARTITION BY from_account
ORDER BY transfer_time
MEASURES MATCH_NUMBER() AS match_num
, sum(amount) as total_amount
, classifier() as match_role
ALL ROWS PER MATCH
PATTERN (SUSPICIOUS_TRANSFER NORMAL_TRANSFER* SUSPICIOUS_TRANSFER
NORMAL_TRANSFER* SUSPICIOUS_TRANSFER)
DEFINE SUSPICIOUS_TRANSFER as SUSPICIOUS_TRANSFER.amount > 10000
and SUSPICIOUS_TRANSFER.transfer_time <
FIRST(SUSPICIOUS_TRANSFER.transfer_time + INTERVAL '24' HOUR)
, NORMAL_TRANSFER as NORMAL_TRANSFER.amount <= 10000
) MR
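The rule itself (three or more large transfers from one account within 24 hours, with any number of small transfers in between) can also be expressed as a sliding window. The Python sketch below does that for a single account. The data and the function name suspicious_windows are hypothetical, and the sketch is an independent restatement of the rule rather than a translation of the query.

```python
from datetime import datetime, timedelta


def suspicious_windows(transfers, threshold=10_000,
                       window=timedelta(hours=24), min_count=3):
    """Return the start time of every run of min_count large transfers within window."""
    # Small transfers are ignored: the large ones need not be consecutive.
    big = sorted(when for when, amount in transfers if amount > threshold)
    starts = []
    for i in range(len(big) - min_count + 1):
        if big[i + min_count - 1] - big[i] < window:
            starts.append(big[i])
    return starts


# Hypothetical transfers of one account as (transfer_time, amount) pairs.
T0 = datetime(2014, 9, 1, 9, 0)
TRANSFERS = [
    (T0, 15_000),
    (T0 + timedelta(hours=2), 500),
    (T0 + timedelta(hours=5), 12_000),
    (T0 + timedelta(hours=20), 11_000),
    (T0 + timedelta(hours=50), 20_000),
]

print(suspicious_windows(TRANSFERS))
```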
84. No Corner cutting
[Data model: Registration Point (CheckPoint_Label, Distance_from_Start); Registration (Runner_Id, Timestamp)]
85. No Corner cutting
• Patterns to look for:
– Checkpoints passed in the wrong order or checkpoints missed altogether
– Suspicious accelerations – stretch with > 20% higher average speed than prior or
later stretches
• Additional analysis
– Fastest stretch by anyone
– Section that is the fastest section for most runners
• Because of downhill or favorable wind
– Top 3 runners over any selected section
86. Find suspicious speeds…
• When a runner increases speed by more than 20% - something irregular
may be going on…
match_recognize (
partition by runner_id
order by id
ALL ROWS PER MATCH
PATTERN (SUSPICIOUS_SPEED+ )
DEFINE SUSPICIOUS_SPEED as
( (distance_from_start - PREV(distance_from_start)) /
( ( extract(hour from registration_time)
+ extract(minute from registration_time)/60
+ extract(second from registration_time)/3600 )
- ( extract(hour from PREV(registration_time))
+ extract(minute from PREV(registration_time))/60
+ extract(second from PREV(registration_time))/3600 )
)) > 1.2 *
( (PREV(distance_from_start) - PREV(distance_from_start,2)) /
( ( extract(hour from PREV(registration_time))
+ extract(minute from PREV(registration_time))/60 … )
- ( extract(hour from PREV(registration_time,2))
+ extract(minute from PREV(registration_time,2))/60 + … )
))
) MR
87. Find suspicious speeds…
(the easier LAG based solution)
with runner_data as
( select r.*
, leg_distance/( extract ( hour from leg_time)
+ (extract(minute from leg_time)/60)
+ (extract(second from leg_time)/3600)
) leg_speed
from ( select r.*, rp.*
, rp.distance_from_start - lag(rp.distance_from_start,1,0) over
(partition by runner_id order by rp.id) leg_distance
, r.registration_time -
lag(r.registration_time,1,INTERVAL '0' MINUTE)
over (partition by runner_id order by rp.id) leg_time
from registrations r join registration_points rp
on (rp.id = checkpoint_id)
) r
)
, runner_data_now_and_previous as
( select runner_data.*
, lag(leg_speed) over ( partition by runner_id
order by checkpoint_id) previous_speed
from runner_data
)
select *
from runner_data_now_and_previous
where leg_speed > 1.2 * previous_speed
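The arithmetic behind the LAG-based query is simple: compute the speed of each leg from consecutive checkpoints and flag any leg that is more than 20% faster than the one before it. A Python sketch for a single runner follows. The checkpoint data is hypothetical, times are given as hours since the start for brevity, and the function name suspicious_legs is an assumption.

```python
def suspicious_legs(checkpoints, factor=1.2):
    """checkpoints: ordered (distance_from_start_km, hours_since_start) pairs.

    Returns the 0-based index of every leg whose speed exceeds factor times
    the speed of the previous leg.
    """
    speeds = []
    prev_distance, prev_time = 0.0, 0.0  # the start line
    for distance, time in checkpoints:
        speeds.append((distance - prev_distance) / (time - prev_time))
        prev_distance, prev_time = distance, time
    return [i for i in range(1, len(speeds))
            if speeds[i] > factor * speeds[i - 1]]


# Hypothetical runner: the third leg is covered at twice the earlier pace.
CHECKPOINTS = [(5, 0.5), (10, 1.0), (15, 1.25), (20, 1.75)]
print(suspicious_legs(CHECKPOINTS))
```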
88. Summary
Security
Scripts can be downloaded from https://github.com/lucasjellema/OracleDatabase12c-development-demonstration