SlideShare uma empresa Scribd logo
1 de 23
Deepak Krishnan | Consultant - Data Scientist
❏ Expert on various Big Data and Machine Learning initiatives
❏ Experienced in schema design for Big Data storage systems
Praveen Rajasekhar | Director - Business Development
❏ <bio to be updated>
❏ <bio to be updated>
Speakers
Agenda
❏ Problem Statement
❏ Solution
❏ Summary
❏ Questions?
Problem Statement
User Experience
User Engagement
Search
Solutio
n
Solution
Solution
❏ Identify key operands & operators within natural language query
❏ Convert them into a series of connected expressions
❏ Dynamically build a query which runs against MongoDB instance
❏ Aggregate search results
[Revised]
Solution
[Revised]
Example:
Tokenizer
❏ Acts an FSA to access inverted index
❏ Emits annotations whenever a buffer matches an operator
❏ Ability to identify common data types such as date, time etc.
❏ Emits the matched expressions as a sequential stream of annotations
[Revised]
Tokenizer
Expression Parser
❏ Generated using parser generator
❏ Supports conjunction, disjunction, negation operators
❏ Responsible for taking in a stream of annotations and reducing it
❏ Creates the equivalent MongoDB query during reduction process
[Revised]
Expression Parser
Example: Show me Java or PHP openings
This will be reduced by
EXPR OR_OPERATOR EXPR
which has an RHS that will convert this to an OR query in MongoDB
External Knowledge Bases
❏ Integrated into the expression parser for data intelligence
❏ The application uses NLP date parsers, ConceptNet (knowledge bases)
❏ Improved data intelligence
[Revised]
Summary
Search API
❏ Acts as natural language quering modules
❏ Acts as a RESTful API endpoint to which clients can connect to via HTTP
Tokenizer
❏ Passes the stream of tokens to an expression parser
Summary
[Revised]
Expression Parser
❏ Uses series of tokens to make transitions in a finite state machine
❏ Ingestion of the tokens into the expression parser is based on a sliding
window model where the window size is dynamic
Summary
[Revised]
MongoDB Expertise at QBurst
MongoDB Expertise at QBurst
❏ Consulting – Strategy & Planning
❏ Solutions Architecting
❏ Design & Implementation
❏ Big Data Analytics & Integration
❏ Social Media Analytics & Solutions
❏ IoT Storage, Processing, and Prediction Solutions
Questions?
Thank You
Email: info@qburst.com
www.qburst.com
USA | UK | Poland | UAE | India | Singapore | Australia

Mais conteúdo relacionado

Mais procurados

Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...
Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...
Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...
StreamNative
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
Simplilearn
 

Mais procurados (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
AWS Real-Time Event Processing
AWS Real-Time Event ProcessingAWS Real-Time Event Processing
AWS Real-Time Event Processing
 
Introduction to Map Reduce
Introduction to Map ReduceIntroduction to Map Reduce
Introduction to Map Reduce
 
Warehouse Storage Managment System Database Schema
Warehouse Storage Managment System Database SchemaWarehouse Storage Managment System Database Schema
Warehouse Storage Managment System Database Schema
 
MongoDB Administration 101
MongoDB Administration 101MongoDB Administration 101
MongoDB Administration 101
 
Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...
Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...
Security and Multi-Tenancy with Apache Pulsar in Yahoo! (Verizon Media) - Pul...
 
kafka
kafkakafka
kafka
 
Apache Sedona: how to process petabytes of agronomic data with Spark
Apache Sedona: how to process petabytes of agronomic data with SparkApache Sedona: how to process petabytes of agronomic data with Spark
Apache Sedona: how to process petabytes of agronomic data with Spark
 
Spark Summit EU talk by Ross Lawley
Spark Summit EU talk by Ross LawleySpark Summit EU talk by Ross Lawley
Spark Summit EU talk by Ross Lawley
 
Stanford CS347 Guest Lecture: Apache Spark
Stanford CS347 Guest Lecture: Apache SparkStanford CS347 Guest Lecture: Apache Spark
Stanford CS347 Guest Lecture: Apache Spark
 
Scaling Apache Pulsar to 10 Petabytes/Day
Scaling Apache Pulsar to 10 Petabytes/DayScaling Apache Pulsar to 10 Petabytes/Day
Scaling Apache Pulsar to 10 Petabytes/Day
 
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a ServiceZeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
 
Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)
 
Hive: Loading Data
Hive: Loading DataHive: Loading Data
Hive: Loading Data
 
Understanding and tuning WiredTiger, the new high performance database engine...
Understanding and tuning WiredTiger, the new high performance database engine...Understanding and tuning WiredTiger, the new high performance database engine...
Understanding and tuning WiredTiger, the new high performance database engine...
 
How Uber scaled its Real Time Infrastructure to Trillion events per day
How Uber scaled its Real Time Infrastructure to Trillion events per dayHow Uber scaled its Real Time Infrastructure to Trillion events per day
How Uber scaled its Real Time Infrastructure to Trillion events per day
 
Enhancing Spark SQL Optimizer with Reliable Statistics
Enhancing Spark SQL Optimizer with Reliable StatisticsEnhancing Spark SQL Optimizer with Reliable Statistics
Enhancing Spark SQL Optimizer with Reliable Statistics
 
How Financial Services Organizations Use MongoDB
How Financial Services Organizations Use MongoDBHow Financial Services Organizations Use MongoDB
How Financial Services Organizations Use MongoDB
 

Semelhante a Running Natural Language Queries on MongoDB

Using Compass to Diagnose Performance Problems in Your Cluster
Using Compass to Diagnose Performance Problems in Your ClusterUsing Compass to Diagnose Performance Problems in Your Cluster
Using Compass to Diagnose Performance Problems in Your Cluster
MongoDB
 
Using Compass to Diagnose Performance Problems
Using Compass to Diagnose Performance Problems Using Compass to Diagnose Performance Problems
Using Compass to Diagnose Performance Problems
MongoDB
 
Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...
DataWorks Summit
 
How to Migrate Applications Off a Mainframe
How to Migrate Applications Off a MainframeHow to Migrate Applications Off a Mainframe
How to Migrate Applications Off a Mainframe
VMware Tanzu
 

Semelhante a Running Natural Language Queries on MongoDB (20)

[DSC Europe 23] Djordje Grozdic - Transforming Business Process Automation wi...
[DSC Europe 23] Djordje Grozdic - Transforming Business Process Automation wi...[DSC Europe 23] Djordje Grozdic - Transforming Business Process Automation wi...
[DSC Europe 23] Djordje Grozdic - Transforming Business Process Automation wi...
 
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
Discovering Emerging Tech through Graph Analysis - Henry Hwangbo @ GraphConne...
 
Resume
ResumeResume
Resume
 
Which Questions We Should Have
Which Questions We Should HaveWhich Questions We Should Have
Which Questions We Should Have
 
MongoDB.local Sydney: An Introduction to Document Databases with MongoDB
MongoDB.local Sydney: An Introduction to Document Databases with MongoDBMongoDB.local Sydney: An Introduction to Document Databases with MongoDB
MongoDB.local Sydney: An Introduction to Document Databases with MongoDB
 
20160317 - PAZUR - PowerBI & R
20160317  - PAZUR - PowerBI & R20160317  - PAZUR - PowerBI & R
20160317 - PAZUR - PowerBI & R
 
Using Compass to Diagnose Performance Problems in Your Cluster
Using Compass to Diagnose Performance Problems in Your ClusterUsing Compass to Diagnose Performance Problems in Your Cluster
Using Compass to Diagnose Performance Problems in Your Cluster
 
Using Compass to Diagnose Performance Problems
Using Compass to Diagnose Performance Problems Using Compass to Diagnose Performance Problems
Using Compass to Diagnose Performance Problems
 
DataMass Summit - Machine Learning for Big Data in SQL Server
DataMass Summit - Machine Learning for Big Data  in SQL ServerDataMass Summit - Machine Learning for Big Data  in SQL Server
DataMass Summit - Machine Learning for Big Data in SQL Server
 
Optimize the Globalization Process with Cloudwords
Optimize the Globalization Process with CloudwordsOptimize the Globalization Process with Cloudwords
Optimize the Globalization Process with Cloudwords
 
How to create custom dashboards in Elastic Search / Kibana with Performance V...
How to create custom dashboards in Elastic Search / Kibana with Performance V...How to create custom dashboards in Elastic Search / Kibana with Performance V...
How to create custom dashboards in Elastic Search / Kibana with Performance V...
 
Continuous delivery for machine learning
Continuous delivery for machine learningContinuous delivery for machine learning
Continuous delivery for machine learning
 
Ssas dmx ile kurum içi verilerin i̇şlenmesi
Ssas dmx ile kurum içi verilerin i̇şlenmesiSsas dmx ile kurum içi verilerin i̇şlenmesi
Ssas dmx ile kurum içi verilerin i̇şlenmesi
 
Red Hat Summit Connect 2023 - Redis Enterprise, the engine of Generative AI
Red Hat Summit Connect 2023 - Redis Enterprise, the engine of Generative AIRed Hat Summit Connect 2023 - Redis Enterprise, the engine of Generative AI
Red Hat Summit Connect 2023 - Redis Enterprise, the engine of Generative AI
 
Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...Software engineering practices for the data science and machine learning life...
Software engineering practices for the data science and machine learning life...
 
How to Migrate Applications Off a Mainframe
How to Migrate Applications Off a MainframeHow to Migrate Applications Off a Mainframe
How to Migrate Applications Off a Mainframe
 
Microservices as an evolutionary architecture: lessons learned
Microservices as an evolutionary architecture: lessons learnedMicroservices as an evolutionary architecture: lessons learned
Microservices as an evolutionary architecture: lessons learned
 
PostgreSQL as a Strategic Tool
PostgreSQL as a Strategic ToolPostgreSQL as a Strategic Tool
PostgreSQL as a Strategic Tool
 
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
 
MongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDB
MongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDBMongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDB
MongoDB World 2019: Near Real-Time Analytical Data Hub with MongoDB
 

Mais de MongoDB

Mais de MongoDB (20)

MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 
MongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDB
MongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDBMongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDB
MongoDB .local Paris 2020: Les bonnes pratiques pour sécuriser MongoDB
 

Último

Último (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

Running Natural Language Queries on MongoDB

  • 1.
  • 2. Deepak Krishnan | Consultant - Data Scientist ❏ Expert on various Big Data and Machine Learning initiatives ❏ Experienced in schema design for Big Data storage systems Praveen Rajasekhar | Director - Business Development ❏ <bio to be updated> ❏ <bio to be updated> Speakers
  • 3. Agenda ❏ Problem Statement ❏ Solution ❏ Summary ❏ Questions?
  • 10. Solution ❏ Identify key operands & operators within natural language query ❏ Convert them into a series of connected expressions ❏ Dynamically build a query which runs against MongoDB instance ❏ Aggregate search results [Revised]
  • 13. ❏ Acts an FSA to access inverted index ❏ Emits annotations whenever a buffer matches an operator ❏ Ability to identify common data types such as date, time etc. ❏ Emits the matched expressions as a sequential stream of annotations [Revised] Tokenizer
  • 14. Expression Parser ❏ Generated using parser generator ❏ Supports conjunction, disjunction, negation operators ❏ Responsible for taking in a stream of annotations and reducing it ❏ Creates the equivalent MongoDB query during reduction process [Revised]
  • 15. Expression Parser Example: Show me Java or PHP openings This will be reduced by EXPR OR_OPERATOR EXPR which has an RHS that will convert this to an OR query in MongoDB
  • 16. External Knowledge Bases ❏ Integrated into the expression parser for data intelligence ❏ The application uses NLP date parsers, ConceptNet (knowledge bases) ❏ Improved data intelligence [Revised]
  • 18. Search API ❏ Acts as natural language quering modules ❏ Acts as a RESTful API endpoint to which clients can connect to via HTTP Tokenizer ❏ Passes the stream of tokens to an expression parser Summary [Revised]
  • 19. Expression Parser ❏ Uses series of tokens to make transitions in a finite state machine ❏ Ingestion of the tokens into the expression parser is based on a sliding window model where the window size is dynamic Summary [Revised]
  • 21. MongoDB Expertise at QBurst ❏ Consulting – Strategy & Planning ❏ Solutions Architecting ❏ Design & Implementation ❏ Big Data Analytics & Integration ❏ Social Media Analytics & Solutions ❏ IoT Storage, Processing, and Prediction Solutions
  • 23. Thank You Email: info@qburst.com www.qburst.com USA | UK | Poland | UAE | India | Singapore | Australia

Notas do Editor

  1. Modern-day software ecosystem is moving towards a better user experience. This has become critical for user retention. Organizations have realized that, more the user interacts with their application, higher the chances to appreciate the business value of the application. One of the most wanted feature of any user-centric application is the "Search" functionality.
  2. Modern-day software ecosystem is moving towards a better user experience. This has become critical for user retention. Organizations have realized that, more the user interacts with their application, higher the chances to appreciate the business value of the application. One of the most wanted feature of any user-centric application is the "Search" functionality.
  3. Modern-day software ecosystem is moving towards a better user experience. This has become critical for user retention. Organizations have realized that, more the user interacts with their application, higher the chances to appreciate the business value of the application. One of the most wanted feature of any user-centric application is the "Search" functionality.
  4. Our solution is designed to identify key operands and operators within a natural language query and convert them into a series of connected expressions; which are used to dynamically build a query which runs against a MongoDB instance. The final search results are aggregated and shown as standard results.
  5. Our solution is designed to identify key operands and operators within a natural language query and convert them into a series of connected expressions; which are used to dynamically build a query which runs against a MongoDB instance. The final search results are aggregated and shown as standard results.
  6. The tokenizer is by itself an FSA that has access to the inverted index Tokenizer emits annotations whenever its buffer matches an operator or an entry in the inverted index Tokenizer also has the ability to identify common data types such as date, time etc Tokenizer emits the matched expressions as a sequential stream of annotations
  7. The tokenizer is by itself an FSA that has access to the inverted index Tokenizer emits annotations whenever its buffer matches an operator or an entry in the inverted index Tokenizer also has the ability to identify common data types such as date, time etc Tokenizer emits the matched expressions as a sequential stream of annotations
  8. The expression parser is by itself generated by a parser generator It supports conjunction, disjunction, negation operators and has an associated precedence assigned The parser generator is responsible for taking in a stream of annotations and reducing it. During the reduction process, the expression parser creates the equivalent MongoDB query The process is similar to parsing an expression using a CFG and then at each step reducing the expression to a value (which in this case is the MongoDB query)
  9. -There are some external knowledge bases and parsers integrated into the Expression parser to make it more intelligent -We use custom NLP date parsers, knowledge bases such as conceptnet to increase the intelligence
  10. The expression parser by itself is a state machine that uses these series of tokens to make transitions in a finite state machine. Some tokens do cause the finite state machine to reach a final state, in which case expressions are built and stored. The ingestion of the tokens into the expression parser is based on a sliding window model where the window size is dynamic.