Copyright © 2020, Oracle and/or its affiliates
Loading, Indexing and
Searching for Recommendations
with Text and JSON
Roger Ford
Principal Product Manager
Text & JSON
Starts at 11am ET
About your presenter:
Roger Ford
Principal Product Manager
Roger Ford is a principal product manager at Oracle Corporation and has been with the company since version 5, in 1987. Roger manages a portfolio of products including Oracle Text, JSON in the database and the Database Scheduler. Currently he is spending most of his time planning the release of a new Autonomous JSON product.
When not at work, Roger is usually found working on, or racing, his Caterham sports car. He helps manage the Caterham Graduates Racing Club, one of the biggest car racing clubs in the world.
The case for Full Text Indexing
• "Keyword" searching is a familiar concept from internet search engines and web stores
• Yet many companies have massive amounts of valuable information in document formats which are currently not searchable
• Office documents: MS Word, PDF, PowerPoint
• Email archives
• "How to" documents and training materials
• The database can index these documents and provide full-text content searching
• No need for a separate search engine
• Combine relational and content searching via SQL
• Full text search is available in all versions and editions of the database, and is not a chargeable option
The case for JSON development
• Traditional SQL development requires pre-defined schema
• Schema changes can be a major task
• Limited number of columns available (1000 in 19c)
• JSON development increasingly popular
• Schema-less, agile development style
• Schema evolution is far less costly
• Non-relational data model: hierarchical document/object model
• Data access via APIs for many languages
• Key/Value document collection API using QBE
• Textual values in JSON can be full-text indexed
Full Text Indexes
• For text in VARCHAR2 and CLOB columns, e.g. a CLOB column called mycol:
create index myind on mytab(mycol) indextype is ctxsys.context;
• Query it with
select * from mytab where contains (mycol, 'hello world' ) > 0;
• For a JSON column jtext, with textual attribute 'description':
create search index myind on mytab (jtext) for json;
• Query it with
select * from mytab where json_textcontains(jtext, '$.description', 'hello world');
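To make the idea of a full-text index concrete, here is a minimal Python sketch of a positional inverted index answering a phrase query such as 'hello world'. It illustrates the general technique only and is not how Oracle Text is implemented; the function names and sample documents are hypothetical, and the sketch assumes the query is treated as a phrase (the words must appear next to each other, in order).

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to {doc_id: [positions]} (a positional inverted index)."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, token in enumerate(tokenize(text)):
            index[token][doc_id].append(pos)
    return index


def phrase_search(index, phrase):
    """Return sorted ids of documents that contain the phrase's tokens consecutively."""
    tokens = tokenize(phrase)
    if not tokens:
        return []
    hits = []
    # Candidate documents are those that contain the first token.
    for doc_id, starts in index.get(tokens[0], {}).items():
        for start in starts:
            # Every later token must sit at the expected offset in the same document.
            if all(
                start + offset in index.get(tok, {}).get(doc_id, [])
                for offset, tok in enumerate(tokens[1:], start=1)
            ):
                hits.append(doc_id)
                break
    return sorted(hits)


# Hypothetical sample documents, keyed by row id.
docs = {
    1: "Hello world, this is a test",
    2: "The world says hello",
    3: "Well, hello world again",
}
index = build_index(docs)
print(phrase_search(index, "hello world"))
```

The query never scans the document text: it only looks up the posting lists for "hello" and "world", which is why an indexed search stays fast as the collection grows.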
Machine learning with a full-text index
• Oracle Text integrates tightly with Oracle Machine Learning (formerly Advanced Analytics)
• Clustering
• Group documents based on shared attributes (words, phrases, themes, etc)
• Classification
• Train the system to recognize particular topic areas
• Sentiment Analysis
• Identify documents which are positive, negative or neutral in tone
• A topic can optionally be specified for sentiment analysis
• e.g. are these restaurant reviews positive about the service (as compared to, say, the food quality)?
Oracle Database with SQL/JSON Support: bridging the gap between SQL and NoSQL databases
• JSON documents are storable, queryable, searchable, updatable and generatable in Oracle Database
• as VARCHAR2, CLOB or BLOB datatype since 12c
• as JSON datatype with an efficient binary format in the next major release
• Combine schema flexibility of JSON with strengths of the relational model in one database system
• Support SQL/JSON Standard – Extension of SQL to query JSON
• Support Partial Update of JSON
• ACID Transaction Model applied to JSON – No Data Loss
Oracle as a Document Store
[Diagram: document collections of text and JSON documents, e.g. { "doctype" : "JSON", "count" : 100 }, stored and managed using Oracle Database]
Oracle Database - JSON document store
All the power of SQL. All the flexibility of schemaless development.
[Diagram: JSON applications developed using SODA APIs; JSON documents stored and managed using Oracle Database 20c; SQL based reporting and analytical operations on JSON documents]
JSON Query – Oracle Simplified Syntax
Oracle supports a non-standard simplified syntax
Field access:
select j.PO_DOCUMENT
from J_PURCHASEORDER j
where j.PO_DOCUMENT.PONumber = 1600
/
Collection unnesting:
select j.*
from CUSTOMER c,
JSON_TABLE (c.jcol.orders.lineitems[*]
COLUMNS (lineid, quantity, prodid, upc, comments)) j
/
JSON generation:
select JSON {cid, firstname, lastname, street, country, zip}
from CUSTOMERS
JSON Query – SQL/JSON
SQL/JSON standard
• Joint standard with IBM
• SQL extended with new operators for JSON, e.g. JSON_VALUE
• All operators use the JSON Path language for intra-document navigation
select JSON_VALUE(PO_DOCUMENT, '$.LineItems[0].Part.UnitPrice'
returning NUMBER(5,3))
from J_PURCHASEORDER p
where JSON_VALUE(PO_DOCUMENT, '$.PONumber'
returning NUMBER(10)) = 1600
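As a rough illustration of what a path such as $.LineItems[0].Part.UnitPrice does, the Python sketch below walks a parsed document one step at a time. It supports only .name and [index] steps, returns None when a step is missing or the target is not a scalar, and is not a full JSON Path implementation. The function name json_value and the sample purchase order are hypothetical stand-ins chosen for this example.

```python
import json
import re

# One path step: either .name or [index].
_STEP = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def json_value(document, path):
    """Return the scalar at `path` in a JSON text, or None if absent or non-scalar."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    node = json.loads(document)
    pos = 1
    while pos < len(path):
        match = _STEP.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported path syntax at offset {pos}")
        name, index = match.groups()
        if name is not None:
            # Object step: descend into the named field.
            if not isinstance(node, dict) or name not in node:
                return None
            node = node[name]
        else:
            # Array step: descend into the element at the given position.
            i = int(index)
            if not isinstance(node, list) or i >= len(node):
                return None
            node = node[i]
        pos = match.end()
    # Only scalars are returned; objects and arrays yield None.
    if isinstance(node, (dict, list)):
        return None
    return node


# Hypothetical purchase order document.
po = json.dumps({
    "PONumber": 1600,
    "LineItems": [
        {"ItemNumber": 1, "Part": {"Description": "Sample part", "UnitPrice": 19.95}}
    ],
})
print(json_value(po, "$.LineItems[0].Part.UnitPrice"))
print(json_value(po, "$.PONumber"))
```

The same two paths appear in the query above: one selects the value to return, the other selects the value to filter on.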
JSON Generation - Embedding arrays in documents
select JSON_OBJECT(
'departmentId' is d.DEPARTMENT_ID,
'name' is d.DEPARTMENT_NAME,
'employees' is (
select JSON_ARRAYAGG(
JSON_OBJECT(
'employeeId' is EMPLOYEE_ID,
'firstName' is FIRST_NAME,
'lastName' is LAST_NAME,
'emailAddress' is EMAIL
)
)
from HR.EMPLOYEES e
where e.DEPARTMENT_ID = d.DEPARTMENT_ID
)
) DEPT_WITH_EMPLOYEES
from HR.DEPARTMENTS d
where DEPARTMENT_NAME = 'Executive'
/
DEPT_WITH_EMPLOYEES
--------------------------------------------------------------------------------
{
"departmentId": 90,
"name": "Executive",
"employees": [
{
"employeeId": 100,
"firstName": "Steven",
"lastName": "King",
"emailAddress": "SKING"
}, {
"employeeId": 101,
"firstName": "Neena",
"lastName": "Kochhar",
"emailAddress": "NKOCHHAR"
}, {
"employeeId": 102,
"firstName": "Lex",
"lastName": "De Haan",
"emailAddress": "LDEHAAN"
}
]
}
select JSON_OBJECT(*)
from EMP;
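The nesting performed by JSON_OBJECT with a correlated JSON_ARRAYAGG subquery can be pictured in Python as one dictionary per department row, with a list of employee dictionaries embedded in it. This is only a sketch of the shape of the result: the rows below are hypothetical stand-ins for HR.DEPARTMENTS and HR.EMPLOYEES, and the function name is invented for this example.

```python
import json

# Hypothetical rows standing in for HR.DEPARTMENTS and HR.EMPLOYEES.
departments = [(90, "Executive"), (60, "IT")]
employees = [
    (100, "Steven", "King", "SKING", 90),
    (101, "Neena", "Kochhar", "NKOCHHAR", 90),
    (103, "Alexander", "Hunold", "AHUNOLD", 60),
]


def dept_with_employees(dept_name):
    """Build one nested document per department with the given name."""
    documents = []
    for dept_id, name in departments:
        if name != dept_name:
            continue
        documents.append({
            "departmentId": dept_id,
            "name": name,
            # Plays the role of the correlated JSON_ARRAYAGG subquery.
            "employees": [
                {
                    "employeeId": emp_id,
                    "firstName": first,
                    "lastName": last,
                    "emailAddress": email,
                }
                for emp_id, first, last, email, emp_dept in employees
                if emp_dept == dept_id
            ],
        })
    return documents


docs = dept_with_employees("Executive")
print(json.dumps(docs[0], indent=2))
```

Doing this inside the database, as the SQL above does, returns one finished document per department instead of shipping every employee row to the client for assembly.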
JSON Indexes
• Functional Index
• Index specific fields within a JSON document
• Search Index
• Universal Index for all fields
• Supports value, range and full text
• GeoSpatial Index
• JSON_VALUE returns GeoJSON as SDO_GEOMETRY object
CREATE SEARCH INDEX po_search_index
ON j_purchaseorder (po_document)
FOR JSON;
CREATE UNIQUE INDEX po_ponum_index
ON j_purchaseorder po
(po.po_document.PONumber);
JSON Dataguide - Schema Discovery
Discover metadata for JSON
• Generate JSON Schema or
• Generate relational schema
Derived relational views
• Declarative procedures to construct a relational view over a JSON fragment
Derived virtual columns
• Generated for singleton JSON keys
• Automatically generated for new keys
Can be used with external data
select dbms_json.get_index_dataguide('REVIEWS',
'JTEXT',
dbms_json.FORMAT_HIERARCHICAL,
dbms_json.PRETTY)
from dual
{
"type" : "object",
"o:length" : 8192,
"properties" :
{
"text" :
{
"type" : "string",
"o:length" : 1024,
"o:preferred_column_name" : "JTEXT$text"
},
"stars" :
{
"type" : "number",
"o:length" : 4,
"o:preferred_column_name" : "JTEXT$stars"
}
}
}
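The idea behind schema discovery can be sketched in a few lines of Python: scan the documents, and for each key record which JSON types were seen and how long the longest value is. This is a simplified illustration, not the Dataguide algorithm. It looks only at top-level keys, assumes every document is a JSON object, and reports exact serialised lengths rather than the o:length values shown above. The function names and sample reviews are hypothetical.

```python
import json


def json_type(value):
    """Name the JSON type of a parsed Python value."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return "null"


def discover(documents):
    """Summarise top-level keys: the types seen and the longest serialised value."""
    guide = {}
    for text in documents:
        for key, value in json.loads(text).items():
            entry = guide.setdefault(key, {"types": set(), "max_length": 0})
            entry["types"].add(json_type(value))
            entry["max_length"] = max(entry["max_length"], len(json.dumps(value)))
    return guide


# Hypothetical review documents.
reviews = [
    '{"text": "Great sushi", "stars": 5}',
    '{"text": "Slow service but good food", "stars": 2, "useful": 1}',
]
guide = discover(reviews)
for key in sorted(guide):
    print(key, sorted(guide[key]["types"]), guide[key]["max_length"])
```

A key that appears in only some documents, such as "useful" here, still shows up in the summary, which is what makes this kind of discovery useful for schemaless data.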
Oracle Database – Converged Data
MongoDB
• Data pipelines to move data out for OLTP, analytics, search, …
• Multiple databases to maintain, patch
• Multiple datasets to backup, administer
• Integration soaks up 30% of project costs
Oracle Database
• Converged Architecture
• All your data managed together
[Diagram: separate stores (OracleDB, mongoDB, ElasticSearch, …) for relational, full text, … data, contrasted with a single Oracle Database 20c holding relational, full text, … data]
Oracle Database - Converged Data
Microservices with Multitenant
• Multitenant allows creating a PDB for each microservice
• Each PDB offers isolation and can be scaled independently
• But still preserves unified administration at the CDB level
Multimodel AND Polyglot
• Each PDB can be used as a multimodel or specialized store
[Diagram: Oracle Database 20c with relational and multimodel stores, each serving a microservice]
Workshop Overview
• Loading, indexing and searching for recommendations in Text and JSON
• On GitHub: Oracle Learning Library
https://oracle.github.io/learning-library/data-management-library/database/json/freetier/
Workshop Overview
• Aim: Create a simple search microservice
• Find restaurant reviews in the local area that mention particular terms
• e.g. "Show me all the reviews for restaurants with zip code '9160%' which mention 'great sushi'"
• Workshop will work with the "Yelp" dataset and develop a REST-based microservice with minimal coding
• Everything will be done using online cloud services: no client tools, no IDE needed
Overview : YELP dataset
• Yelp is a website for US business reviews (shops, restaurants, gyms, etc.)
• Dataset is publicly available for educational, research and personal use
• JSON format
• > 8 million reviews
• > 200,000 businesses
• ~ 2 million users
Workshop Steps
• Upload JSON files to object storage
• Create tables with JSON columns in database
• Copy JSON data from object storage to database tables
• Create indexes on JSON data
• Run queries against JSON data
• Create simple REST API on queries via Oracle Application Express (APEX)
Lab 2: Upload to Oracle Object Storage
• Could upload directly from developer's PC to database
• Requires Oracle Instant Client on PC and download of wallet
• We're doing everything on the cloud so first we need to upload our files to the "cloud file system": Oracle Object Storage.
1. Download the Yelp dataset
2. Create object storage 'bucket'
3. Drag-and-drop our files to the bucket
Lab 3: Loading Oracle Database 19c from Object Storage
• We will provision an instance of Autonomous Transaction Processing – an Autonomous Database
• In the ATP database we create simple tables with a JSON column in each
• We then load each table from object storage.
• So the steps are:
1. Provision an Autonomous Transaction Processing Database
2. Connect to SQL Developer Web and create a new user
3. Log in to SDW as the new user and create our tables
e.g. create table businesses ( jtext clob constraint busjson check (jtext is json) );
4. Use DBMS_CLOUD.COPY_DATA to load the tables from object storage
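The effect of the check (jtext is json) constraint during a load can be pictured with the Python sketch below, which splits input lines into accepted and rejected rows. It rests on two assumptions made for this sketch: that the input holds one JSON document per line, and that only objects and arrays count as documents. It uses Python's strict parser, so it only approximates the database's own validation rules; the function name and sample lines are hypothetical.

```python
import json


def load_json_lines(lines):
    """Split input lines into (accepted, rejected), mimicking a well-formedness check."""
    accepted, rejected = [], []
    for line_no, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # skip blank lines
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            # Malformed JSON: the row would violate the constraint.
            rejected.append((line_no, text))
            continue
        if isinstance(parsed, (dict, list)):
            accepted.append(text)
        else:
            # Bare scalars are treated as non-documents (an assumption of this sketch).
            rejected.append((line_no, text))
    return accepted, rejected


# Hypothetical input: two well-formed lines would be ideal, but one is truncated.
lines = [
    '{"business_id": "b1", "name": "Sushi Place"}',
    "",
    '{"business_id": "b2", "name": ',
    "42",
]
accepted, rejected = load_json_lines(lines)
print(len(accepted), "accepted;", [line_no for line_no, _ in rejected], "rejected")
```

Keeping the rejected line numbers makes it easy to go back to the source file and repair the offending records before reloading.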
Lab 4: Creating indexes and basic queries
• Indexes for JSON take two forms
• FUNCTIONAL indexes to index specific values
• SEARCH indexes to index all values, provide full-text search and optional dataguide
• Step 1: Create functional indexes
• Used when joining tables
• Step 2: Create SEARCH index
• Queries can be run from SQL or from SODA (Simple Oracle Document Access)
• We won't cover SODA here
• Step 3: Queries
• Create various queries of increasing complexity until we've satisfied our aim:
• "Find all businesses in ZIP codes 8911% which mention 'sushi'"
Full Text Queries - 1
• Simple query on one table:
select r.jtext.user_id, r.jtext.text
from reviews r
where json_textcontains(jtext, '$.text', 'great sushi')
Full Text Queries - 2
• Join between two tables:
select u.jtext.name, r.jtext.text
from reviews r, users u
where json_textcontains(r.jtext, '$.text', 'sushi')
and u.jtext.user_id = r.jtext.user_id;
Full Text Queries - 3
• Full query to join three tables, with column aliases
select u.jtext.name username,
b.jtext.name businessname,
r.jtext.stars rating,
b.jtext.postal_code zip,
r.jtext.text review_text
from reviews r, users u, businesses b
where json_textcontains(r.jtext, '$.text', 'sushi')
and u.jtext.user_id = r.jtext.user_id
and b.jtext.business_id = r.jtext.business_id
and b.jtext.postal_code like '8911%';
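The logic of the three-table query can be traced in plain Python: filter reviews on a search term, look up the matching user and business by id, and keep only businesses whose postal code starts with the given prefix. This is a sketch of the query's semantics over in-memory data, not of how the database executes it; the sample documents and the search function are hypothetical, and the term match here is a simple whole-word comparison rather than a real full-text search.

```python
# Hypothetical documents standing in for the users, businesses and reviews tables.
users = [{"user_id": "u1", "name": "Ann"}, {"user_id": "u2", "name": "Bob"}]
businesses = [
    {"business_id": "b1", "name": "Sushi Place", "postal_code": "89119"},
    {"business_id": "b2", "name": "Noodle Bar", "postal_code": "10001"},
]
reviews = [
    {"user_id": "u1", "business_id": "b1", "stars": 5, "text": "Great sushi and sake"},
    {"user_id": "u2", "business_id": "b2", "stars": 4, "text": "Decent sushi"},
    {"user_id": "u2", "business_id": "b1", "stars": 2, "text": "Slow service"},
]


def search(term, zip_prefix):
    """Return one row per review that mentions `term` for a business in the zip area."""
    # Lookup tables by id play the role that the functional indexes play for the joins.
    users_by_id = {u["user_id"]: u for u in users}
    businesses_by_id = {b["business_id"]: b for b in businesses}
    rows = []
    for review in reviews:
        if term.lower() not in review["text"].lower().split():
            continue
        user = users_by_id.get(review["user_id"])
        business = businesses_by_id.get(review["business_id"])
        if user is None or business is None:
            continue  # inner join: drop reviews without a matching user or business
        if not business["postal_code"].startswith(zip_prefix):
            continue
        rows.append({
            "username": user["name"],
            "businessname": business["name"],
            "rating": review["stars"],
            "zip": business["postal_code"],
            "review_text": review["text"],
        })
    return rows


for row in search("sushi", "8911"):
    print(row)
```

The keys of each result row match the column aliases in the SQL above, so the output has the same shape a REST handler built on that query would return.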
Lab 5: Creating a RESTful interface
• The final step in our microservices project is to add a REST front-end
• Oracle Application Express (APEX) makes this almost trivially simple
• Step 1: Create an APEX workspace
• Step 2: Create a RESTful module, template and handler
• We will demonstrate the simple "query collection" interface and the PL/SQL procedure interface
Coming up at 12pm ET…
Maps and spatial analyses: How to use Them
with Jayant Sharma & Nick Salem
Breaktime!
Please complete the feedback form for the previous session
Join us on the Database@Home slack channel for more conversations and to answer your questions on the database and labs
https://bit.ly/dbhome-slack

Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 

Database@Home - Data Driven : Loading, Indexing, and Searching with Text and JSON

  • 1. Loading, Indexing and Searching for Recommendations with Text and JSON. Roger Ford, Principal Product Manager, Text & JSON. Starts at 11am ET. Copyright © 2020, Oracle and/or its affiliates.
  • 2. About your presenter: Roger Ford, Principal Product Manager. Roger Ford is a principal product manager at Oracle Corporation and has been with the company since version 5, in 1987. Roger manages a portfolio of products including Oracle Text, JSON in the database and the Database Scheduler. Currently he is spending most of his time planning the release of a new Autonomous JSON product. When not at work, Roger is usually found working on, or racing, his Caterham sports car. He helps manage the Caterham Graduates Racing Club, one of the biggest car racing clubs in the world.
  • 4. The case for Full Text Indexing
      • "Keyword" searching is a familiar concept from internet search engines and web stores
      • Yet many companies have massive amounts of valuable information in document formats which are currently not searchable
          • Office documents: MS Word, PDF, PowerPoint
          • Email archives
          • "How to" documents and training materials
      • The database can index these documents and provide full-text content searching
          • No need for a separate search engine
          • Combine relational and content searching via SQL
      • Full text search is available in all versions and editions of the database, and is not a chargeable option
  • 5. The case for JSON development
      • Traditional SQL development requires a pre-defined schema
          • Schema changes can be a major task
          • Limited number of columns available (1000 in 19c)
      • JSON development is increasingly popular
          • Schema-less, agile development style
          • Schema evolution is far less costly
          • Non-relational data model: hierarchical document/object model
          • Data access via APIs in many languages
          • Key/value document collection API using QBE (query by example)
      • Textual values in JSON can be full-text indexed
  • 6. Full Text Indexes
      • For text in VARCHAR2 and CLOB columns, e.g. a CLOB column called mycol:
          create index myind on mytab(mycol) indextype is ctxsys.context;
      • Query it with:
          select * from mytab where contains(mycol, 'hello world') > 0;
      • For a JSON column jtext, with textual attribute 'description':
          create search index myind on mytab (jtext) for json;
      • Query it with:
          select * from mytab where json_textcontains(jtext, '$.description', 'hello world');
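As a minimal end-to-end sketch of the CONTEXT index described on this slide, the statements below build a small table, index it and rank the matches. The table name docs, its columns and the sample rows are hypothetical and chosen only for illustration.

```sql
-- Hypothetical demo table; "docs" and its columns are illustrative names.
create table docs (
  id   number primary key,
  body clob
);

insert into docs values (1, 'Hello world, this is a short note');
insert into docs values (2, 'Nothing relevant in this document');
commit;

-- A CONTEXT index is not maintained transactionally by default, so in this
-- sketch it is created after the rows have been committed.
create index docs_body_idx on docs (body) indextype is ctxsys.context;

-- The third argument to CONTAINS is a label that ties the predicate
-- to the SCORE() operator, which returns a relevance value per row.
select id, score(1) as relevance
from   docs
where  contains(body, 'hello world', 1) > 0
order  by score(1) desc;
```

Rows added after the index is built would normally need an index synchronization before they become searchable, which is why the sketch loads the data first.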
  • 7. Machine learning with a full-text index
      • Oracle Text integrates tightly with Oracle Machine Learning (formerly Advanced Analytics)
      • Clustering
          • Group documents based on shared attributes (words, phrases, themes, etc.)
      • Classification
          • Train the system to recognize particular topic areas
      • Sentiment Analysis
          • Identify documents which are positive, negative or neutral in tone
          • Optionally specify a topic for sentiment analysis
          • e.g. Are these restaurant reviews positive about the service (as compared to, say, the food quality)?
  • 8. Oracle Database with SQL/JSON Support: bridges the gap between SQL and NoSQL databases
      • JSON documents can be stored, queried, searched, updated and generated in Oracle Database
          • as VARCHAR2, CLOB or BLOB datatypes since 12c
          • as a JSON datatype with an efficient binary format in the next major release
      • Combines the schema flexibility of JSON with the strengths of the relational model in one database system
      • Supports the SQL/JSON standard: an extension of SQL to query JSON
      • Supports partial update of JSON
      • ACID transaction model applied to JSON: no data loss
  • 9. Oracle as a Document Store. [Diagram: JSON data such as { "doctype" : "JSON", "count" : 100 } held in document collections; text and JSON documents stored and managed using Oracle Database.]
  • 10. Oracle Database - JSON document store. All the power of SQL. All the flexibility of schemaless development. [Diagram: JSON applications developed using SODA APIs; JSON documents stored and managed using Oracle Database; SQL-based reporting and analytical operations on JSON documents; Oracle Database 20c.]
  • 11. JSON Query – Oracle Simplified Syntax
      • Oracle supports a non-standard simplified syntax
      • Field access:
          select j.PO_DOCUMENT
          from J_PURCHASEORDER j
          where j.PO_DOCUMENT.PONumber = 1600
          /
      • Collection unnesting:
          select j.*
          from CUSTOMER c,
               JSON_TABLE (c.jcol.orders.lineitems[*]
                 COLUMNS (lineid, quantity, prodid, upc, comments)) j
          /
      • JSON generation:
          select JSON {cid, firstname, lastname, street, country, zip}
          from CUSTOMERS
  • 12. JSON Query – SQL/JSON
      • SQL/JSON standard
          • Joint standard with IBM
          • SQL extended with new operators for JSON, e.g. JSON_VALUE
          • All operators use the JSON Path language for intra-document navigation
      • Example:
          select JSON_VALUE(PO_DOCUMENT, '$.LineItems[0].Part.UnitPrice'
                   returning NUMBER(5,3))
          from J_PURCHASEORDER p
          where JSON_VALUE(PO_DOCUMENT, '$.PONumber'
                   returning NUMBER(10)) = 1600
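To show how the other SQL/JSON operators combine with JSON_VALUE, here is a hedged sketch that unnests an array with JSON_TABLE and filters documents with JSON_EXISTS. It reuses the J_PURCHASEORDER table from the slide; the field names ItemNumber, Part.Description and Quantity are assumptions about the document layout rather than something the slides state.

```sql
-- One output row per element of the LineItems array.
-- Field names inside LineItems are assumed for illustration.
select json_value(p.po_document, '$.PONumber' returning number(10)) as po_number,
       li.item_no,
       li.description,
       li.quantity
from   j_purchaseorder p,
       json_table(p.po_document, '$.LineItems[*]'
         columns (
           item_no     number        path '$.ItemNumber',
           description varchar2(100) path '$.Part.Description',
           quantity    number        path '$.Quantity'
         )) li
-- JSON_EXISTS keeps only documents with at least one line item
-- whose quantity is greater than 5.
where  json_exists(p.po_document, '$.LineItems[*]?(@.Quantity > 5)');
```

All three operators take a JSON Path expression, which is the point the slide makes about a single navigation language inside the document.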
  • 13. JSON Generation - Embedding arrays in documents
      • Query:
          select JSON_OBJECT(
                   'departmentId' is d.DEPARTMENT_ID,
                   'name' is d.DEPARTMENT_NAME,
                   'employees' is (
                     select JSON_ARRAYAGG(
                              JSON_OBJECT(
                                'employeeId' is EMPLOYEE_ID,
                                'firstName' is FIRST_NAME,
                                'lastName' is LAST_NAME,
                                'emailAddress' is EMAIL
                              )
                            )
                     from HR.EMPLOYEES e
                     where e.DEPARTMENT_ID = d.DEPARTMENT_ID
                   )
                 ) DEPT_WITH_EMPLOYEES
          from HR.DEPARTMENTS d
          where DEPARTMENT_NAME = 'Executive'
          /
      • Result:
          DEPT_WITH_EMPLOYEES
          --------------------------------------------------------------------------------
          {
            "departmentId": 90,
            "name": "Executive",
            "employees": [
              { "employeeId": 100, "firstName": "Steven", "lastName": "King", "emailAddress": "SKING" },
              { "employeeId": 101, "firstName": "Neena", "lastName": "Kochhar", "emailAddress": "NKOCHHAR" },
              { "employeeId": 102, "firstName": "Lex", "lastName": "De Haan", "emailAddress": "LDEHAAN" }
            ]
          }
      • Wildcard form:
          select JSON_OBJECT(*) from EMP;
  • 14. JSON Indexes
      • Functional Index
          • Index specific fields within a JSON document
      • Search Index
          • Universal index for all fields
          • Supports value, range and full-text searches
      • GeoSpatial Index
          • JSON_VALUE returns GeoJSON as an SDO_GEOMETRY object
      • Examples:
          CREATE SEARCH INDEX po_search_index ON j_purchaseorder (po_document) FOR JSON;
          CREATE UNIQUE INDEX po_ponum_index ON j_purchaseorder po (po.po_document.PONumber);
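A functional index can also be written with an explicit JSON_VALUE expression instead of dot notation. The sketch below assumes the J_PURCHASEORDER table from the slides; the field '$.Reference', the index name and the literal in the query are made up for illustration.

```sql
-- Function-based index on a single scalar field.
-- '$.Reference' is an assumed field name, not taken from the slides.
create index po_reference_idx
  on j_purchaseorder (
    json_value(po_document, '$.Reference'
               returning varchar2(64) error on error)
  );

-- A query whose JSON_VALUE expression matches the indexed expression
-- (same path and RETURNING type) is a candidate for using the index.
select po_document
from   j_purchaseorder
where  json_value(po_document, '$.Reference'
                  returning varchar2(64) error on error) = 'ABULL-20140421';
```

This form gives control over the returned datatype and error handling, at the cost of having to repeat the same expression in queries.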
  • 15. JSON Dataguide - Schema Discovery
      • Discover metadata for JSON
          • Generate JSON Schema or
          • Generate relational schema
      • Derived relational views
          • Declarative procedures to construct a relational view over a JSON fragment
      • Derived virtual columns
          • Generated for singleton JSON keys
          • Automatically generated for new keys
      • Can be used with external data
      • Example:
          select dbms_json.get_index_DataGuide('REVIEWS',
                   'JTEXT',
                   dbms_json.FORMAT_HIERARCHICAL,
                   dbms_json.PRETTY)
          from dual

          {
            "type" : "object",
            "o:length" : 8192,
            "properties" : {
              "text" : {
                "type" : "string",
                "o:length" : 1024,
                "o:preferred_column_name" : "JTEXT$text"
              },
              "stars" : {
                "type" : "number",
                "o:length" : 4,
                "o:preferred_column_name" : "JTEXT$stars"
              }
            }
          }
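The slide reads the dataguide that a search index maintains. As a complementary sketch, the JSON_DATAGUIDE aggregate function computes a dataguide directly from the documents; the REVIEWS table and JTEXT column are the ones used in the workshop, and the optional format arguments in the second query are assumed to be available in the database release in use (they were added after 12.2).

```sql
-- JSON_DATAGUIDE is an aggregate function: it scans the documents
-- themselves and does not rely on a search index being present.
select json_dataguide(jtext) as flat_dataguide
from   reviews;

-- Hierarchical, pretty-printed variant; assumes a release that accepts
-- the optional format and flag arguments.
select json_dataguide(jtext,
                      dbms_json.format_hierarchical,
                      dbms_json.pretty) as hierarchical_dataguide
from   reviews;
```

On a table the size of the Yelp reviews this is a full scan, so the index-based dataguide shown on the slide is likely the cheaper option once a search index exists.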
  • 16. Oracle Database – Converged Data
      • MongoDB
          • Data pipelines to move data out for OLTP, analytics, search, …
          • Multiple databases to maintain and patch
          • Multiple datasets to back up and administer
          • Integration soaks up 30% of project costs
      • Oracle Database
          • Converged architecture
          • All your data managed together
      • [Diagram: separate stores (OracleDB, mongoDB, ElasticSearch, …) for relational, full-text and other data, compared with a single Oracle Database 20c holding relational, full-text and other data.]
  • 17. Oracle Database - Converged Data
      • Microservices with Multitenant
          • Multitenant allows creating a PDB for each microservice
          • Each PDB offers isolation and can be scaled independently
          • Unified administration is still preserved at the CDB level
      • Multimodel AND Polyglot
          • Each PDB can be used as a multimodel or specialized store
      • [Diagram: Oracle Database 20c hosting relational and multimodel stores, each serving a microservice.]
  • 18. Workshop Overview
      • Loading, indexing and searching for recommendations in Text and JSON
      • On GitHub: Oracle Learning Library https://oracle.github.io/learning-library/data-management-library/database/json/freetier/
  • 19. Workshop Overview
      • Aim: Create a simple search microservice
          • Find restaurant reviews in the local area that mention particular terms
          • e.g. "Show me all the reviews for restaurants with zip code '9160%' which mention 'great sushi'"
      • The workshop will work with the "Yelp" dataset and develop a REST-based microservice with minimal coding
      • Everything will be done using online cloud services: no client tools, no IDE needed
  • 20. Overview: Yelp dataset
      • Yelp is a website for US business reviews (shops, restaurants, gyms, etc.)
      • Dataset is publicly available for educational, research and personal use
          • JSON format
          • > 8 million reviews
          • > 200,000 businesses
          • ~ 2 million users
  • 21. Workshop Steps
      • Upload JSON files to object storage
      • Create tables with JSON columns in the database
      • Copy JSON data from object storage to database tables
      • Create indexes on JSON data
      • Run queries against JSON data
      • Create a simple REST API on the queries via Oracle Application Express (APEX)
  • 22. Lab 2: Upload to Oracle Object Storage
      • We could upload directly from the developer's PC to the database
          • Requires Oracle Instant Client on the PC and download of a wallet
      • We're doing everything in the cloud, so first we need to upload our files to the "cloud file system": Oracle Object Storage
          1. Download the Yelp dataset
          2. Create an object storage 'bucket'
          3. Drag-and-drop our files to the bucket
  • 23. Lab 3: Loading Oracle Database 19c from Object Storage
      • We will provision an instance of Autonomous Transaction Processing (ATP), an Autonomous Database
      • In the ATP database we create simple tables with a JSON column in each
      • We then load each table from object storage
      • So the steps are:
          1. Provision an Autonomous Transaction Processing database
          2. Connect to SQL Developer Web (SDW) and create a new user
          3. Log in to SDW as the new user and create our tables, e.g.
               create table businesses ( jtext clob constraint busjson check (jtext is json) );
          4. Use DBMS_CLOUD.COPY_DATA to load the tables from object storage
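Step 4 might look roughly like the following PL/SQL sketch. Everything that is not named on the slide is an assumption: the credential name YELP_CRED, the placeholder user name and token, the placeholder object storage URL, the file name review.json, and the choice of the tab character as field delimiter (used here on the assumption that each line of the file is one compact JSON document, so no unescaped tab occurs inside it).

```sql
-- Store the object storage credentials once (placeholder values).
begin
  dbms_cloud.create_credential(
    credential_name => 'YELP_CRED',
    username        => '<cloud user name>',
    password        => '<auth token>');
end;
/

-- Load one JSON document per line into the single JTEXT column.
begin
  dbms_cloud.copy_data(
    table_name      => 'REVIEWS',
    credential_name => 'YELP_CRED',
    -- Placeholder URL: substitute the real region, namespace and bucket.
    file_uri_list   => 'https://objectstorage.<region>.oraclecloud.com/n/<namespace>/b/<bucket>/o/review.json',
    -- Widen the field so that long documents are not truncated.
    field_list      => 'jtext CHAR(32000)',
    -- Tab (hex 9) as the field delimiter, so each line is read as one field.
    format          => json_object('delimiter' value 'X''9'''));
end;
/
```

If the bucket is public or accessed through a pre-authenticated URL, the credential step may be unnecessary; the workshop instructions are the authority on which variant applies.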
  • 24. Lab 4: Creating indexes and basic queries
      • Indexes for JSON take two forms
          • FUNCTIONAL indexes to index specific values
          • SEARCH indexes to index all values, provide full-text search and an optional dataguide
      • Step 1: Create functional indexes
          • Used when joining tables
      • Step 2: Create SEARCH index
      • Queries can be run from SQL or from SODA (Simple Oracle Document Access)
          • We won't cover SODA here
      • Step 3: Queries
          • Create various queries of increasing complexity until we've satisfied our aim:
          • "Find all businesses in ZIP codes 8911% which mention 'sushi'"
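Steps 1 and 2 could be sketched as below, assuming the REVIEWS, USERS and BUSINESSES tables each have a JTEXT column with an IS JSON check constraint, as in Lab 3. The index names are hypothetical, and the choice of which keys to index is inferred from the join conditions used in the workshop queries.

```sql
-- Step 1: functional indexes on the keys used to join the tables.
-- Dot notation requires the IS JSON constraint on the column.
create index rev_user_idx     on reviews r    (r.jtext.user_id);
create index rev_business_idx on reviews r    (r.jtext.business_id);
create index usr_user_idx     on users u      (u.jtext.user_id);
create index bus_business_idx on businesses b (b.jtext.business_id);

-- Step 2: search index on the reviews, which enables
-- json_textcontains queries against the review text.
create search index rev_search_idx on reviews (jtext) for json;
```

Only the reviews table needs a search index for the stated aim, since the full-text condition applies to the review text alone.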
  • 25. Full Text Queries - 1
      • Simple query on one table:
          select r.jtext.user_id, r.jtext.text
          from reviews r
          where json_textcontains(jtext, '$.text', 'great sushi')
  • 26. Full Text Queries - 2
      • Join between two tables:
          select u.jtext.name, r.jtext.text
          from reviews r, users u
          where json_textcontains(r.jtext, '$.text', 'sushi')
          and u.jtext.user_id = r.jtext.user_id;
  • 27. Full Text Queries - 3
      • Full query to join three tables, with column aliases:
          select u.jtext.name username,
                 b.jtext.name businessname,
                 r.jtext.stars rating,
                 b.jtext.postal_code zip,
                 r.jtext.text reviewtext
          from reviews r, users u, businesses b
          where json_textcontains(r.jtext, '$.text', 'sushi')
          and u.jtext.user_id = r.jtext.user_id
          and b.jtext.business_id = r.jtext.business_id
          and b.jtext.postal_code like '8911%';
  • 28. Lab 5: Creating a RESTful interface
      • The final step in our microservices project is to add a REST front-end
      • Oracle Application Express (APEX) makes this almost trivially simple
      • Step 1: Create an APEX workspace
      • Step 2: Create a RESTful module, template and handler
      • We will demonstrate the simple "query collection" interface and the PL/SQL procedure interface
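The lab builds the module, template and handler through the APEX user interface. For readers who prefer a script, a comparable service can be sketched with the ORDS PL/SQL API, assuming the schema has already been REST-enabled. The module name, base path and URI pattern below are hypothetical, and the sketch also assumes that the search string passed to json_textcontains may be supplied as a bind variable.

```sql
begin
  -- Module: a named group of REST endpoints under one base path.
  ords.define_module(
    p_module_name => 'yelp.search',
    p_base_path   => '/yelp/');

  -- Template: the URI pattern; :zip and :term become bind variables.
  ords.define_template(
    p_module_name => 'yelp.search',
    p_pattern     => 'reviews/:zip/:term');

  -- Handler: a GET that returns the query result as a JSON collection.
  ords.define_handler(
    p_module_name => 'yelp.search',
    p_pattern     => 'reviews/:zip/:term',
    p_method      => 'GET',
    p_source_type => ords.source_type_collection_feed,
    p_source      => q'[
      select u.jtext.name        username,
             b.jtext.name        businessname,
             r.jtext.stars       rating,
             b.jtext.postal_code zip,
             r.jtext.text        reviewtext
      from   reviews r, users u, businesses b
      where  json_textcontains(r.jtext, '$.text', :term)
      and    u.jtext.user_id = r.jtext.user_id
      and    b.jtext.business_id = r.jtext.business_id
      and    b.jtext.postal_code like :zip || '%'
    ]');

  commit;
end;
/
```

A GET request whose path ends in yelp/reviews/8911/sushi would then correspond to the aim stated in Lab 4; the exact host and schema prefix of the URL depend on the ORDS configuration.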
  • 31. Breaktime! Coming up at 12pm ET: Maps and spatial analyses: How to use them, with Jayant Sharma & Nick Salem. Please complete the feedback form for the previous session. Join us on the Database@Home Slack channel for more conversations and to get answers to your questions on the database and labs: https://bit.ly/dbhome-slack