Confidential – Oracle Restricted
Balancing Act to improve RDF Query Performance
in Oracle Database
Eugene I. Chong
Agenda
• RDF Query Processing Issues
• RDF Order-By and Filter Processing
• RDF In-Memory Processing
• RDF In-Memory Virtual Columns
• Conclusion
Oracle RDF
• RDF_LINK$ table (triples)
– normalized
– subject, predicate, object IDs
• RDF_VALUE$ table (ID to value mapping)
– value, type, etc.
• Issues
– frequent joins with RDF_VALUE$ table to present results, process filters and order-by queries
– complete de-normalization incurs large storage requirements
– self-joins: large intermediate join results
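The join cost described above can be illustrated with a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for Oracle. The table names rdf_link and rdf_value and their column names are simplified, hypothetical versions of RDF_LINK$ and RDF_VALUE$, not the actual Oracle schema. It shows that presenting a single triple already needs three joins against the value table.

```python
import sqlite3

# Hypothetical, simplified stand-ins for RDF_LINK$ and RDF_VALUE$.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rdf_value (value_id INTEGER PRIMARY KEY, value_name TEXT, value_type TEXT);
    CREATE TABLE rdf_link (s_id INTEGER, p_id INTEGER, o_id INTEGER);
""")
conn.executemany("INSERT INTO rdf_value VALUES (?, ?, ?)", [
    (1, "http://example.org/alice", "URI"),
    (2, "http://example.org/knows", "URI"),
    (3, "http://example.org/bob", "URI"),
])
# The triple itself is stored as three IDs only.
conn.execute("INSERT INTO rdf_link VALUES (1, 2, 3)")

# Presenting the triple requires one join with rdf_value per position.
rows = conn.execute("""
    SELECT vs.value_name, vp.value_name, vo.value_name
    FROM rdf_link l
    JOIN rdf_value vs ON vs.value_id = l.s_id
    JOIN rdf_value vp ON vp.value_id = l.p_id
    JOIN rdf_value vo ON vo.value_id = l.o_id
""").fetchall()
print(rows)
```

Storing the strings directly in the link table would remove these joins, at the storage cost noted in the issues list.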
Oracle RDF Filters and Order-By Processing
• SPARQL order-by semantics
– order: no values, blank nodes, IRIs, literals
– case statement: value type, numeric value, date value, string value
– ORDER BY CASE WHEN (V4.VALUE_TYPE IS NULL) THEN 0
    WHEN (V4.VALUE_TYPE IN ('BLN','BN')) THEN 1
    WHEN (V4.VALUE_TYPE IN ('URI','UR')) THEN 2
    WHEN (V4.VALUE_TYPE IN ('PL', 'PLL', 'CPLL', 'PL@', 'PLL@', 'CPLL@', 'TL', 'TLL', 'CTLL', 'LIT'))
      THEN (CASE WHEN (V4.LANGUAGE_TYPE IS NOT NULL) THEN 5
    ……..
Oracle RDF Filters and Order-By Processing
– literal type - numeric: TO_NUMBER( )
– literal type - date/time: TO_TIMESTAMP_TZ ( ), DECODE( )
– generated SQL for order-by relies on these function calls
– case statements are executed for every row at runtime
– filters have the same problem
• Solution
– materialize value type and values in RDF_VALUE$ table
– stored as ORDER_TYPE, ORDER_NUM, ORDER_DATE
– filled in at load time
– generate SQL: ORDER BY order_type, order_num, order_date, value_name
– filter clause: WHERE order_num < to_number(89)
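The materialization idea above can be sketched as follows, assuming Python's built-in sqlite3 module in place of Oracle. The helper order_columns, the table name rdf_value, and the numeric codes it assigns are hypothetical and are not the codes Oracle uses. ORDER_DATE is left out for brevity. The type and numeric value are computed once per value at load time, so the query only sorts and filters on plain columns.

```python
import sqlite3

def order_columns(value_type, value_name):
    """Compute (order_type, order_num) once, at load time.

    The codes are illustrative assumptions: 0 = no value, 1 = blank node,
    2 = IRI, 3 = numeric literal, 4 = any other literal.
    """
    if value_type is None:
        return (0, None)
    if value_type == "BN":
        return (1, None)
    if value_type == "URI":
        return (2, None)
    try:
        return (3, float(value_name))
    except ValueError:
        return (4, None)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rdf_value (value_name TEXT, value_type TEXT, "
    "order_type INTEGER, order_num REAL)"
)
raw = [("120", "LIT"), ("http://example.org/a", "URI"), ("89", "LIT"),
       ("_:b1", "BN"), ("7.5", "LIT"), ("abc", "LIT")]
for name, vtype in raw:
    # The order columns are filled in while loading, not at query time.
    conn.execute("INSERT INTO rdf_value VALUES (?, ?, ?, ?)",
                 (name, vtype, *order_columns(vtype, name)))

# No CASE expression or conversion function runs per row here.
ordered = [r[0] for r in conn.execute(
    "SELECT value_name FROM rdf_value ORDER BY order_type, order_num, value_name")]
filtered = [r[0] for r in conn.execute(
    "SELECT value_name FROM rdf_value WHERE order_num < 89 ORDER BY order_num")]
print(ordered)
print(filtered)
```

The trade-off is extra columns in the value table and extra work at load time in exchange for simpler per-row work at query time.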
Oracle RDF Order-By and Filter Performance using BSBM Benchmark Queries (in secs)
[Figure: bar chart of query times for Q1 to Q12 and BI1 to BI8, y-axis 0 to 90 secs; series: Without Order Columns, With Order Columns]
Oracle RDF In-Memory Processing
• Utilize Oracle IMC (In-Memory Column Store)
– load frequently accessed columns in memory
• RDF_LINK$ table: subject, predicate, object IDs
• RDF_VALUE$: id, value
– fast full scan of the table: good for hash join
• Experiment
– 32GB memory, 2TB disk space
– LUBM benchmark queries (8,763,829 rows including entailment)
– varying the size of the memory: 6G(100%), 4G(56%), 2G(27%), 1G(12%)
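Why a fast full scan helps a hash join can be seen in a minimal sketch in plain Python. The function hash_join and the sample data are hypothetical; this illustrates the general build-and-probe technique, not Oracle's implementation. Each input is read exactly once from start to finish, so the join cost is dominated by scan speed.

```python
def hash_join(links, values):
    """Join (s, p, o) ID triples to their value strings."""
    # Build phase: one full scan of the value table into a hash table.
    lookup = {value_id: value for value_id, value in values}
    # Probe phase: one full scan of the link table, with a constant-time
    # hash lookup for each ID.
    return [(lookup[s], lookup[p], lookup[o]) for s, p, o in links]

values = [(1, "alice"), (2, "knows"), (3, "bob"), (4, "carol")]
links = [(1, 2, 3), (3, 2, 4)]
joined = hash_join(links, values)
print(joined)
```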
Oracle RDF In-Memory Query Times (in sec) for LUBM Benchmark Queries
• 100%: 4x – 6x gain
• 56%
[Figure: two bar charts of query times for Q1 to Q14, y-axis 0 to 40 sec; panels: No IM vs. IM (100%), No IM vs. IM (56%)]
Oracle RDF In-Memory Query Times (in sec) for LUBM Benchmark Queries (continued)
• 27%
• 12%
[Figure: two bar charts of query times for Q1 to Q14, y-axis 0 to 40 sec; panels: No IM vs. IM (27%), No IM vs. IM (12%)]
Oracle RDF In-Memory Full Scan Performance (in sec)
• Fetching 3 IDs from RDF_LINK$ table
• 100%: 190x gain
[Figure: bar chart of full scan times, y-axis 0 to 2.5 sec, for IM(100%), IM(56%), IM(27%), IM(12%); series: NoIM, IM]
Oracle RDF In-Memory Virtual Columns
• In-memory complete de-normalization without incurring disk storage requirements
– define virtual columns in RDF_LINK$ table for values, types, etc.: VALUE_NAME_S, VALUE_NAME_P, VALUE_NAME_O, etc.
– useful for fully populated data in memory: virtual model
[Figure: virtual column in-memory performance (in min), fetching 3 IDs & 3 VCs; bar chart, y-axis 0 to 25 min, for IM (100%), IM (56%), IM (27%), IM (12%); series: No IM, IM-No VC, IM-VC]
Oracle RDF In-Memory Virtual Columns
– remove joins with RDF_VALUE$ table
– queries are processed on RDF_LINK$ table only
– compression, smart scans (in-memory storage index), dictionary code for values, SIMD vector processing
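The effect of in-memory virtual columns can be approximated with a loose analogy in plain Python. This is not Oracle's mechanism, and the names populate_in_memory and objects_of are hypothetical. The stored rows keep only IDs. The value names are computed once while populating the in-memory copy, after which queries scan that single structure without touching the value table.

```python
def populate_in_memory(link_rows, value_rows):
    """Build the denormalized in-memory copy; the stored rows are unchanged."""
    lookup = dict(value_rows)
    return [
        {"s_id": s, "p_id": p, "o_id": o,
         # Analogue of VALUE_NAME_S / VALUE_NAME_P / VALUE_NAME_O: derived
         # from the IDs at population time, never written back to storage.
         "value_name_s": lookup[s],
         "value_name_p": lookup[p],
         "value_name_o": lookup[o]}
        for s, p, o in link_rows
    ]

def objects_of(im_rows, subject, predicate):
    """Answer a query from the in-memory rows only, with no join."""
    return [r["value_name_o"] for r in im_rows
            if r["value_name_s"] == subject and r["value_name_p"] == predicate]

value_rows = [(1, "alice"), (2, "knows"), (3, "bob"), (4, "carol")]
link_rows = [(1, 2, 3), (1, 2, 4), (3, 2, 4)]  # what is stored: IDs only
im_rows = populate_in_memory(link_rows, value_rows)
friends = objects_of(im_rows, "alice", "knows")
print(friends)
```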
Oracle RDF In-Memory Virtual Column Performance using LUBM Benchmark Queries (in secs)
• Up to 8x gain
• As the number of joins increases, a bigger gain is achievable
[Figure: bar chart of query times, y-axis 0 to 100 secs, for Q9 (3 joins), Q2 (2 joins), Q6 (2 joins), Q13 (1 join), Q14 (1 join); series: No IMVC, IMVC]
Oracle RDF In-Memory Virtual Columns
• Can be applied to data mart/data warehouse star or snowflake schemas
– remove joins with dimension tables
• Can be applied to any application where the joined tables have a one-to-one mapping on their join keys
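The one-to-one condition can be checked before denormalizing. Here is a minimal sketch in plain Python, with a hypothetical helper is_one_to_one that is not part of any Oracle tooling. It tests whether every join key used by the referencing table matches exactly one row in the joined table.

```python
from collections import Counter

def is_one_to_one(fact_keys, dim_keys):
    """True if every key in fact_keys matches exactly one entry in dim_keys."""
    counts = Counter(dim_keys)
    # A missing key counts as 0 matches; a duplicated key counts as 2 or more.
    return all(counts[key] == 1 for key in fact_keys)

print(is_one_to_one([1, 2, 2], [1, 2, 3]))  # each key has one matching row
print(is_one_to_one([1, 2], [1, 2, 2]))     # key 2 matches two rows
print(is_one_to_one([4], [1, 2]))           # key 4 has no matching row
```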
Conclusion
• Significant performance improvement
– use order columns in place of complex logic in the query for RDF filter and order-by processing
– improve hash joins by in-memory processing of frequently accessed columns
– remove costly joins using in-memory virtual columns by complete de-normalization for fully populated data
Your Questions

8th TUC Meeting - Eugene I. Chong (Oracle USA). Balancing Act to improve RDF Query Performance in Oracle Database.
