2. Mauro Pagano
• Consultant, working with both DBA and Devs
• Oracle Enkitec Accenture
• DBPerf and SQL Tuning
• Training, Workshops, OUG
• Free Tools (SQLd360, TUNAs360, Pathfinder)
3. Some background
• CBO makes mistakes
– Complex code with challenging job
– Based on
• Statistical formulas, not accurate in 100% of cases
• Partial knowledge of the data (stats)
• Mistakes can translate into poor plans
• Poor plans usually lead to poor performance
4. How to avoid mistakes?
• Improve quality of the model (Oracle)
– Hard, lots of corner cases for CBO to handle
• Improve quality of stats (kind of you)
– Hard, need knowledge of data and queries
• Reactively, learn from them to avoid them next time
– Allows adapting to the specific situation
– Requires mistake to be made first though
5. Cardinality Feedback
• Oracle’s attempt to make the CBO learn from its own mistakes
– Introduced in 11.2 (11.1 as one-off)
– Enhanced in 12.1 to deal with joins (but disabled in 12.2)
• Might take a bit to learn
– Kind of iterative approach, up to 5 “refined” attempts
• Lessons learned as OPT_ESTIMATE hints (like STA), very specific
– Not persisted to dictionary (same mistakes over time)
• Few initial bugs gave it a bad reputation
– SQL runs fine the first time, run again and runs slow!
– Supposed to be transparent, which hurts when it causes trouble
– Some people got burned and decided to turn it off system-wide
6. Table TAB1
drop table tab1;
create table tab1 (n1 number, n2 number, n3 number);
-- the data is strongly correlated
insert into tab1
select mod(rownum, 100),
mod(rownum, 100),
mod(rownum, 100)
from dual
connect by rownum <= 100000;
commit;
exec dbms_stats.gather_table_stats(user,'TAB1');
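A quick sanity check on why this data set is a trap for the CBO. The sketch below assumes that, without a column group or histograms, the optimizer multiplies the three single-column selectivities (1/100 each, given the mod(rownum, 100) load above). The second query is plain arithmetic for illustration, not what the CBO literally runs.

```sql
-- Actual rows: every 100th row matches all three predicates
select count(*) as actual_rows
from tab1
where n1 = 1 and n2 = 1 and n3 = 1;

-- Estimate under the independence assumption (illustrative arithmetic only):
-- num_rows * sel(n1) * sel(n2) * sel(n3), with each selectivity = 1/100
select 100000 * power(1/100, 3) as independent_estimate
from dual;
```

The gap between the two numbers is the misestimate the rest of the deck revolves around.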
8. SQL Plan Directives
• New feature introduced in 12.1
– Several changes introduced in 12.2
• Designed to be transparent
– SPDs are CBO generated
– SPDs are leveraged by CBO once in place
• Enabled by default in 12.1
• Disabled by default in 12.2
9. What’s the goal?
• Help CBO make better estimations
• Not SQL-specific
– Can be re-used by similar SQLs
• No specific adjustments like CFB
• Keep track of “risky paths”
• Determine when to leverage other features
– Dynamic Statistics
– Instruct DBMS_STATS to create Column Groups
10. What does SPD track?
• Cardinality misestimates (usually underestimates)
• “Risky paths” are
– Data correlation on multi-column filter predicates
– Uncommon join correlation
– Data correlation on columns in GROUP BY
• Potentially other paths can be tracked
– For example, cardinality estimation at query block level
• Many TYPEs could be supported
– As of today, only existing TYPE is DYNAMIC_SAMPLING (*)
11. How does SPD track them?
• Entry in the data dictionary
– opt_directive$, opt_finding_obj$
– Externalized as DBA views
• Records table / columns involved
– Different SQLs on same objects can use same SPD
• Info read at hard parse and consumed accordingly
– For example, triggering dynamic sampling
• Used as warning “watch out when estimating X”
12. SPD creation example
select count(*) from tab1 where n1 = 1 and n2 = 1 and n3 = 1;
(from 10053)
SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for TAB1[TAB1]
Table: TAB1 Alias: TAB1
Card: Original: 100000.000000 Rounded: 1 Computed: 0.100000 Non Adjusted: 0.10
…
SPD: Generating finding id: type = 1, reason = 1, objcnt = 1, obItr = 0, objid = 135446, objtyp = 1, vecsize = 4, colvec = [1, 2, 3, ], fid = 29…48
SPD: Inserted felem, fid=29…48, ftype = 1, freason = 1, dtype = 0, dstate = 0, dflag = 0,…
Mistake! We know it’s 1K rows
13. SPD creation example
select directive_id, type, reason, notes
from dba_sql_plan_directives
where directive_id in (select directive_id
from dba_sql_plan_dir_objects
where owner = sys_context('userenv', 'session_user')
and object_name = 'TAB1');
DIRECTIVE_ID TYPE REASON
-------------------------- ---------------- ------------------------------------
1149064327055029475 DYNAMIC_SAMPLING SINGLE TABLE CARDINALITY MISESTIMATE
NOTES
-----------------------------------------------------
<spd_note>
<internal_state>NEW</internal_state>
<redundant>NO</redundant>
<spd_text>{EC(MPAGANO.TAB1)[N1, N2, N3]}</spd_text>
</spd_note>
14. SPD creation example
select *
from dba_sql_plan_dir_objects
where owner = sys_context('userenv', 'session_user')
and object_name = 'TAB1';
SUBOBJECT_NAME   OBJECT
---------------- ------
N1               COLUMN
N2               COLUMN
N3               COLUMN
                 TABLE    <- NOTES is at the TABLE level
NOTES
--------------------------------------------------------------------------
<obj_note>
<equality_predicates_only>YES</equality_predicates_only>
<simple_column_predicates_only>YES</simple_column_predicates_only>
<index_access_by_join_predicates>NO</index_access_by_join_predicates>
<filter_on_joining_object>NO</filter_on_joining_object>
</obj_note>
15. SPD creation (another) example
DIRECTIVE_ID TYPE REASON
--------------------- ---------------- ------------------------------------
7426975260533728013 DYNAMIC_SAMPLING JOIN CARDINALITY MISESTIMATE
NOTES
---------------------------------------------------------------------------------
<spd_note>
<internal_state>NEW</internal_state>
<redundant>NO</redundant>
<spd_text>{(SYS.OPT_DIRECTIVE$) - (SYS.OPT_FINDING_OBJ$)}</spd_text>
</spd_note>
DIRECTIVE_ID OBJECT_NAME OBJECT NOTES
--------------------- ---------------- ------ ------------------------------------
7426975260533728013 OPT_DIRECTIVE$ TABLE <obj_note> EVERYTHING NO </obj_note>
7426975260533728013 OPT_FINDING_OBJ$ TABLE <obj_note> EVERYTHING NO </obj_note>
16. SPD creation example
• SPD_TEXT reports the “risky path”
– Specific format within { }, built on the fly
– One or more tables
– Columns included for single table
– Format includes flag on predicate (where applicable)
• E = Equality predicates
• C = Simple column predicates
• J = Index access by join predicates
• F = Filter on joining object
17. SPD creation example
• INTERNAL_STATE shows where in the lifecycle SPD is
– NEW – SPD just created (flushed to disk), not used by SQLs yet
– MISSING_STATS – CBO used to help estimations, likely used ADS
– HAS_STATS – Stats (likely Column Group) helped CBO, SPD “off”
– PERMANENT – Stats not enough, SPD will stay “on” (ADS)
• First two states are transitory
• Last two states are definitive
– HAS_STATS -> PERMANENT if stats don’t help “enough”
– SPD goes through HAS_STATS before PERMANENT
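To see where the directives on a system sit in this lifecycle, something like the following could be used. It is a sketch that assumes NOTES is an XMLTYPE column (as the XML shown on the earlier slides suggests) and that the session can read the DBA views.

```sql
-- Count directives per lifecycle state, reading INTERNAL_STATE from the NOTES XML
select xt.internal_state, count(*) as directives
from dba_sql_plan_directives s,
     xmltable('/spd_note' passing s.notes
              columns internal_state varchar2(30) path 'internal_state') xt
group by xt.internal_state
order by xt.internal_state;
```

A large share of PERMANENT directives would hint that ADS is being triggered regularly at parse time.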
18. How does CBO use SPD?
• During hard parse applicable SPDs are looked at
– Search done at multiple levels
– Objects looked up by obj#
• Many SPDs are not “current” anymore
– Obsoleted / Superseded
• The valid ones are used
– Action triggered at the specific level
– As of today only CBO action is Dynamic Sampling (ADS)
19. SPD usage example – single table
select count(*) from tab1 where n1 = 1 and n2 = 1 and n3 = 1;
Query Block SEL$1 (#0)
Applicable DS directives:
dirid = 16004857199080262683, state = 1, flags = 1, loc = 1 {EC(135484)[1, 2, 3]}
…
SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for TAB1[TAB1]
SPD: Directive valid: dirid = 16004857199080262683, state = 1, flags = 1, loc = 1 {EC(135484)[1, 2, 3]}
…
Table: TAB1 Alias: TAB1
Card: Original: 100000.000000 >> Single Tab Card adjusted from 0.100000 to 1000.000000 due to adaptive dynamic sampling
Rounded: 1000 Computed: 1000.000000 Non Adjusted: 0.100000
20. SPD usage example – single table
select count(*) from tab1 where n1 = 1 and n2 = 1 and n3 = 1;
(from 10046 during hard parse)
PARSING IN CURSOR #140175694530408 …
SELECT /* DS_SVC */ /*+ … result_cache(snapshot=3600) */ SUM(C1)
FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "TAB1")*/ 1 AS C1
FROM "TAB1" "TAB1"
WHERE ("TAB1"."N1"=1)
AND ("TAB1"."N2"=1)
AND ("TAB1"."N3"=1)
) InnerQuery
STAT #140175694530408 id=1 cnt=1 op='RESULT CACHE 145cmpkvg18at78z0dz8ptfwfw
STAT #140175694530408 id=2 cnt=1 op='SORT AGGREGATE
STAT #140175694530408 id=3 cnt=1000 op='TABLE ACCESS FULL TAB1
21. Use of SPD – single table conclusions
• Allows CBO to get better estimation on the fly
• Cool but not the best approach in the long run
– ADS, even if fast, still takes some time (overhead)
– Longer parses cause most of the trouble in real life
• Same result achieved with Column Group (often)
• Real goal is
– Put a band-aid until Column Group in place
– Instruct DBMS_STATS to collect Column Groups
– Verify if Column Group helped, if not keep SPD active
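If you already know the columns are correlated, you do not have to wait for the SPD to instruct DBMS_STATS. A minimal sketch of creating the column group on TAB1 by hand, using the standard DBMS_STATS extended statistics calls:

```sql
-- Create the column group by hand (returns the system-generated extension name)
select dbms_stats.create_extended_stats(user, 'TAB1', '(N1,N2,N3)') as extension_name
from dual;

-- Regather so the new virtual column gets statistics
exec dbms_stats.gather_table_stats(user, 'TAB1');

-- Confirm the extension exists
select extension_name, extension
from user_stat_extensions
where table_name = 'TAB1';
```

With the column group in place the CBO can estimate the combined predicate from stored stats, with no sampling at parse time.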
23. SPD usage example – join
(from 10053)
SPD: Directive valid: dirid = 7426975260533728013,…,loc = 2 {(612)[]; (608)[]}
…
Join Card: 13.000000 = outer (9.000000) * inner (13.000000) * sel (0.111111)
>> Join Card adjusted from 13.000000 to 16.000000 due to adaptive dynamic sampling, prelen=2
Adjusted Join Cards: adjRatio=1.230769 cardHjSmj=16.000000 …
Join Card - Rounded: 16 Computed: 16.000000
(from 10046)
PARSING IN CURSOR #139704599064112
SELECT /* DS_SVC */ /*+ … result_cache(snapshot=3600) */ SUM(C1)
FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "F#0") */ 1 AS C1
FROM "SYS"."OPT_FINDING$" "F#0",
"SYS"."OPT_DIRECTIVE$" "D#1"
WHERE ("D#1"."F_ID"="F#0"."F_ID")
) innerQuery
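The numbers in the 10053 excerpt above can be reproduced with simple arithmetic. This is a sketch that assumes adjRatio is just the sampled join cardinality divided by the unadjusted one; the trace values are consistent with that reading, but it is an inference from the trace, not documented behavior.

```sql
-- Unadjusted join cardinality: outer * inner * join selectivity
select round(9 * 13 * 0.111111) as join_card from dual;

-- Adjustment ratio: cardinality from ADS divided by the unadjusted one
select round(16 / 13, 6) as adj_ratio from dual;
```

The second query gives the 1.230769 reported as adjRatio in the trace.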
24. Use of SPD – join conclusions
• Allows CBO to get better estimations for joins
– Provide adjustment ratio
– No stored stats provide this info (really cool)
– OPT_ESTIMATE is the closest thing, but still different
• Suffers from the same side effects as the single-table case
– Magnified by longer ADS on joins
– SPD from joins can be SUPERSEDED (HAS_STATS) too (*)
25. What happens to SPD after usage?
(from 10053 of the second parse)
SPD: Generating finding id: type = 1, reason = 1, objcnt = 1, obItr = 0, objid = 135484, objtyp = 1, vecsize = 4, colvec = [1, 2, 3, ], fid = 108…890
SPD: Inserted felem, fid=108…890,ftype = 0,freason = 0,dtype = 1,dstate = 2,…
SPD: qosdRecDSDirChange dirid = 16004857199080262683, retCode = UPDATED
(from dictionary)
NOTES
----------------------------------------------------
<spd_note>
<internal_state>MISSING_STATS</internal_state>
<redundant>NO</redundant>
<spd_text>{EC(MPAGANO.TAB1)[N1, N2, N3]}</spd_text>
</spd_note>
CBO is aware there is a “weak path” and it had to use ADS in order to limit the damage
26. SPD instructs DBMS_STATS
(from 10046 during parse)
merge into sys.col_group_usage$ d
using (select :1 obj#, :2 cols from dual) s
on (d.obj# = s.obj# and d.cols = s.cols)
when matched then update set d.timestamp = :3, d.flags = d.flags + :4 - bitand(d.flags, :4)
when not matched then insert (obj#, cols, timestamp, flags) values (:1,:2,:3,:4)
:1 -> 135484 :2 -> “1,2,3” :3 -> “1/12/2017 8:33:7” :4 -> 17
(from 10046 during DBMS_STATS run)
SELECT ... FROM SYS.COL_GROUP_USAGE$ CU
…
UPDATE SYS.COL_GROUP_USAGE$ C
SET C.FLAGS = FLAGS + 8 - BITAND(FLAGS, 8)
WHERE OBJ# = :B2 AND COLS = :B1
COLUMN_NAME DATA_DEFAULT NUM_DISTINCT HISTOGRAM
------------------------------ ------------------------------------ ------------ ---------
SYS_STSOYQUEIAZ7FI9DV53VLN$$$0 SYS_OP_COMBINED_HASH("N1","N2","N3") 100 FREQUENCY
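The flags expression in both recursive statements is a way of setting a bit without double counting it. A small illustration with the values from the traces (17 as the stored flags, 8 as the bit set by DBMS_STATS); what each bit means is not documented, so only the arithmetic is shown here.

```sql
-- flags + :b - bitand(flags, :b) behaves like a bitwise OR with :b
select 17 + 8 - bitand(17, 8) as first_update,    -- bit 8 not set yet: 17 -> 25
       25 + 8 - bitand(25, 8) as second_update    -- bit 8 already set: stays 25
from dual;
```

Running the update repeatedly is therefore harmless; the flag value stops changing once the bit is on.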
27. Then what happens to SPD?
SINGLE TABLE ACCESS PATH
Single Table Cardinality Estimation for TAB1[TAB1]
SPD: Directive valid: dirid=160…83,state=2,flags=1,loc=1{EC(…)[1, 2, 3]}
...
ColGroup (#1, VC) SYS_STSOYQUEIAZ7FI9DV53VLN$$$0
Col#: 1 2 3 …
Table: TAB1 Alias: TAB1
Card: Original: 100000.000000 Rounded: 1000 Computed: 1000.000000
…
SPD: Inserted felem, fid=10…890,…,dtype = 1, dstate = 3, dflag = 1,…
…
NOTES
------------------------------------------------------
<spd_note>
<internal_state>HAS_STATS</internal_state>
<redundant>NO</redundant>
<spd_text>{EC(MPAGANO.TAB1)[N1, N2, N3]}</spd_text>
</spd_note>
No mistake
If the CG doesn’t help then SPD is marked as PERMANENT and behaves like MISSING_STATS (ADS used)
28. SPD DS results
• DS results are stored
– Multiple SQL IDs (different parse) would trigger same DS SQL
• In 12.1 Result Cache is used
– Results have lifetime of 3600 secs
– Result Cache itself had some troubles in the past
• Mostly latch contention under heavy load
• In 12.2 SPD repository itself used to store results
– Makes them more permanent
– Search in the repo based on ADS recursive SQL ID
29. Is my SQL using SPD?
• Many SPDs can reference just a few objects
– Even simple SQL can use many SPDs
• CBO looks into them and picks the valid ones
– A subset of the valid ones is actually used
• OTHER_XML reports #valid/#used
• DBMS_XPLAN to show if SQL is using SPDs
– DISPLAY (explain plan) reports which ones
– DISPLAY_CURSOR (execution) reports just #used
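A minimal way to check both DBMS_XPLAN paths for the TAB1 query. The +METRICS format modifier is an assumption here: it is the one I would expect to list the directive ids in the explain plan output, so verify it on your release. The Note section is where the count of directives used shows up.

```sql
-- Explain plan: the Note section counts the directives used; the +METRICS
-- modifier (an assumption, verify on your release) should list their ids
explain plan for
select count(*) from tab1 where n1 = 1 and n2 = 1 and n3 = 1;

select * from table(dbms_xplan.display(format => 'TYPICAL +METRICS'));

-- Executed cursor: run the SQL, then look at the Note section of its plan
select count(*) from tab1 where n1 = 1 and n2 = 1 and n3 = 1;

select * from table(dbms_xplan.display_cursor(format => 'TYPICAL'));
```

In SQL*Plus, keep serveroutput off for the DISPLAY_CURSOR call so it picks up the previous statement.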
30. DBMS_SPD
• Oracle seeded package to manage SPDs
• Move SPDs via pack/unpack into a staging table, like baselines
– Base 12.1.0.2 has bugs that impact the transfer of SPDs
• Allows manually flushing SPDs to disk
– Instead of waiting 15 minutes
• Can set basic preferences
– For example, retention period before auto drop
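A short sketch of the DBMS_SPD calls behind these bullets. SPD_STGTAB is a hypothetical staging table name chosen for illustration, and the 53 weeks is what I believe the default retention to be; check both against your own environment.

```sql
-- Flush directives from the SGA to the dictionary now
exec dbms_spd.flush_sql_plan_directive;

-- Read and change the retention (weeks before an unused SPD is auto-dropped)
select dbms_spd.get_prefs('SPD_RETENTION_WEEKS') as retention_weeks from dual;
exec dbms_spd.set_prefs('SPD_RETENTION_WEEKS', '53');

-- Pack all directives into a staging table (SPD_STGTAB is a hypothetical name)
exec dbms_spd.create_stgtab_directive(table_name => 'SPD_STGTAB');

declare
  packed number;
begin
  packed := dbms_spd.pack_stgtab_directive(table_name => 'SPD_STGTAB');
  dbms_output.put_line('directives packed: ' || packed);
end;
/
```

On the target system the counterpart would be DBMS_SPD.UNPACK_STGTAB_DIRECTIVE against the transported staging table.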
31. Why the bad reputation?
• Some issues caused large negative impact
– Hard to spot when the feature is “transparent”
– Main symptoms mutex / latch contention
• SPDs generated based on workload
– Performance tends to get better “with time”
• Makes performance in Prod bad on day 1
• Even though it was great in Test on day 200
– Upgrades need extra steps (move SPDs)
• Limited number of Column Groups per table
– Less useful CGs could be created instead of useful ones
32. Most common problems
• Most issues are ultimately caused by ADS
– ADS running for too long
– ADS executed even for small app SQLs (no need)
• Longer parse caused by sampling
– Parse is a serialized operation, the parsing session holds the mutex in X (exclusive) mode
– Other sessions with same SQL are stuck
– Easy for snowball effect to trigger
• Result Cache maybe not the best decision?
33. Solutions?
• I’m personally a big fan of SPD
– But I’m not a Production DBA
• Many products require turning it off in 12.1
– Bad habit was to turn off all Adaptive Features via a single parameter
– An Oracle patch split the parameter in two (AdaptPlans vs AdaptStats)
• So just SPD and ADS can be disabled, Adaptive Plans stay ON
• Oracle-recommended solution in 12.1: SPD OFF with patch + parameter
• Disabled by default in 12.2
– When enabled it seems a bit more conservative
– Too early to judge how it will behave in 12.2 prod
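For reference, a sketch of what the split looks like at the parameter level. The names below are the 12.2 ones; I am assuming the 12.1 patch exposes the same two parameters, so check what the first query returns before changing anything.

```sql
-- Check which adaptive parameters exist on this release
select name, value, isdefault
from v$parameter
where name like 'optimizer_adaptive%';

-- 12.2 (or 12.1 with the split-parameter patch): keep Adaptive Plans on,
-- turn adaptive statistics (SPD usage and ADS) off
alter system set optimizer_adaptive_plans = true;
alter system set optimizer_adaptive_statistics = false;
```

Setting optimizer_adaptive_statistics back to true is how you would turn the feature on again for a system that benefits from it.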
34. Summary
• Goal is to improve CBO estimations over time
– Tracking correlation in many areas
– Putting stats / ADS in place to catch that
• Might take time and introduce instability
– Makes comparing two systems harder
• Generated issues in 12.1
– Oracle recommends disabling it by default
– Re-enabling it on specific systems might be beneficial
Not an easy task for Oracle to improve the code: the more complex you make the model, the easier it is to break things and still not cover all ground
Quality of the stats is easier to improve, but that’s on you, not so much on Oracle
Reacting is the easiest option, but the mistake has to be made first before you can realize it was a mistake
Mention a little about tab1 having strong correlation, DDL will come later
Presentation focused on the feature itself, differences between 12.1 and 12.2 presented as we go through the details
If disabled in 12.2, why are we even talking about it?
Reason is SPD can help a lot, so knowing what they are can help to realize when to turn it back on
SPD is kind of built on top of CFB experience, slightly different goal but ironing out things that could have been done “maybe differently” in CFB
Main difference is CFB is SQL-ID (cursor-based), while SPD is “object” based, track relationship across entities
The TYPE is kind of the action to take (which feature to use) based on the finding
In 12.2 we have DYNAMIC_SAMPLING_RESULT too
It’s like taking notes during a parse and “similar” parse can re-use those notes
SPD here is used to track a potential mistake made when filtering on (n1,n2,n3) together.
We don’t know it yet, we just “keep track” of it
qosdGenFindId / qosdDumpSgaFind
SPD: Generating finding id: type = 1, reason = 1, objcnt = 1, obItr = 0, objid = 135443, objtyp = 1, vecsize = 4, colvec = [1, 2, 3, ], fid = 3950674910938887825
SPD: Inserted felem, fid=3950674910938887825, ftype = 1, freason = 1, dtype = 0, dstate = 0, dflag = 0, ver = NO, keep = NO
(gdb) bt
#0 0x000000000218d550 in qosdGenFindId ()
#1 0x000000000218d172 in qosdRecFindInCFB ()
#2 0x00000000017453e3 in kkocfbNodeAllocated ()
#3 0x0000000001fc60c3 in qknstAllocate ()
#4 0x0000000001fb5689 in qkatab ()
#5 0x0000000001f85459 in qkajoi ()
#6 0x0000000001f97b0f in qkaqkn ()
#7 0x0000000001f91049 in qkadrv ()
…
No matter how many times the SQL is executed, the SPD is not used until it is flushed to disk; even a new hard parse (for example, because of a different CBO environment) wouldn’t use it
From a 10053 for two consecutive executions with different CBO env
SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
SPD: Generating finding id: type = 1, reason = 1, objcnt = 1, obItr = 0, objid = 135450, objtyp = 1, vecsize = 4, colvec = [1, 2, 3, ], fid = 17828064754844457670
SPD: Inserted felem, fid=17828064754844457670, ftype = 1, freason = 1, dtype = 0, dstate = 0, dflag = 0, ver = NO, keep = NO
SPD: Return code in qosdDSDirSetup: NOCTX, estType = TABLE
SPD: Generating finding id: type = 1, reason = 1, objcnt = 1, obItr = 0, objid = 135450, objtyp = 1, vecsize = 4, colvec = [1, 2, 3, ], fid = 17828064754844457670
SPD: Modified felem, fid=17828064754844457670, ftype = 1, freason = 1, dtype = 0, dstate = 0, dflag = 0, ver = YES, keep = YES
SPD for join generated by “SELECT DIRECTIVE_ID FROM DBA_SQL_PLAN_DIRECTIVES” triggered recursively by DBMS_SPD.DROP_SQL_PLAN_DIRECTIVE
The SPD references only tables, no columns
It “exposes” the relationship between the two objects
select directive_id, xt.text
from dba_sql_plan_directives s,
xmltable ('/spd_note' passing s.notes columns text varchar2(100) PATH 'spd_text') xt
where xt.text like '{F%';
F means there was a filter on a column for one of the objects involved in this join SPD
Not the join key itself, another column (basically filtering out part of one of the two tables)
Focus on status “alone” first, then (next slides) on transition
Bug 20311655 (that will probably be closed as dup of another bug, which is closed as not a bug)
Multiple level since SPD supports multiple levels (single table, join, QB, etc)
CBO identifies there is a SPD in place for obj 135484 col ids 1,2,3
Used at the right level (SINGLE TABLE ACCESS PATH)
Mention that parse is a serialized operation, so everything that increases parse time affects system scalability
select a.directive_id, a.state, a.reason, a.created, b.object_name, a.notes
from dba_sql_plan_directives a, dba_sql_plan_dir_objects b
where a.directive_id = b.directive_id and a.created >= sysdate-1/24 order by a.created;
Here the SPD is used during Join order[] evaluation, different section of the code
Oracle here comes up with adjRatio instead of a fixed number, because you can have different kinds of joins (including semi-joins, etc.)
(*) even though likely just temporary (unless an identical one is created), bug 20311655 is an example (this bug is a duplicate of something else, closed as not a bug)
The HAS_STATS here comes from CFB, which makes the SPD look “superseded” until the next parse comes and the CFB info isn’t around, then the SPD becomes PERMANENT