Simpler Semantic Search
Ted Sullivan
Senior Solutions Architect
lucidworks.com
Building Better Search Applications
Search is about Technology & Language

• These are difficult but also different problems
• Solving the “language problem” requires that we understand how language is used in search
• We understand language at the semantic level, where “meaning” or intent lives
• Search engines deal with language at the syntactic level
• Most problems relating to search quality stem from this basic “disconnect”: the “what” vs “what words” dichotomy
Technology – Horizontal Concerns
Search applications share these requirements with other
information retrieval systems

• Performance – returning results in HTT (Human Tolerable Time)
• Scalability – being able to search “billions and billions” of documents while
serving thousands or tens of thousands of users at a time.
• Reliability – fault tolerance, fail-over, redundancy
• Maintainability – easy to upgrade, search index can be kept current
in the face of rapidly changing content.
• Usability – user experience is critical to success. For UI and UX, mobile
technology is a game changer here!
Language – Vertical Concerns
These requirements are more specific to search systems.

• Accuracy – returning the “correct” results.
• Precision – few false positives
• Recall – few false negatives
• Relevance – returning the “best” results at the top
Returning the wrong results very fast is not necessarily a good thing.
Returning too many results can affect performance.
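As a worked illustration of precision and recall, here is a minimal sketch. The document IDs and relevance judgments are made up for the example; the function name `precision_recall` is an assumption, not part of any search library.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Precision: what share of the returned documents are relevant (few false positives).
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: what share of the relevant documents were returned (few false negatives).
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical judgments: documents 1-4 are relevant, the engine returned 2, 3, 4, 7 and 9.
precision, recall = precision_recall(retrieved=[2, 3, 4, 7, 9], relevant=[1, 2, 3, 4])
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.60 recall=0.75
```

Returning every document in the index would push recall to 1.0 while precision collapses, which is one way to see why "too many results" is a quality problem as well as a performance one.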
Time flies like an arrow
Fruit flies like a banana
Our mental image for the second sentence depends
on how we “parse” it. It depends on what the subject
noun or noun phrase is.
The subject can be “fruit” or “fruit flies”. This
decision changes the verb which is either “flies”
or “like” respectively.
[Fruit] [flies] like a banana (subject “fruit”, verb “flies”)
[Fruit flies] [like] a banana (subject “fruit flies”, verb “like”)
We can do this because we know that both “fruit” and
“fruit flies” represent single concepts – even though
“fruit flies” is two words – i.e. a “noun phrase”.
Search algorithms and semantics
Tokenization plus vector mathematics (TF/IDF or one of its cousins) – “bag-of-words”
Algorithmic tweaks – enhanced bag-of-words:

1. Some fields are more relevant than others
2. Hitting on more terms in the query is better than
hitting on fewer (token scores are summed)
3. The nearer the query terms are to each other in the
document the better – same order as query is best
4. Getting 0 results provides no feedback – OR is safer
than AND (we already have a “fuzzy” AND through bullet 2)
Problem: Search engines don’t understand semantics
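The bag-of-words idea can be sketched in a few lines. This is a deliberately simplified TF-IDF, not Lucene's actual scoring formula; the function names, the `+ 1.0` smoothing and the three sample documents are assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def tf_idf_scores(query, docs):
    """Score each document by summing the TF-IDF weights of the query terms it contains."""
    doc_counts = [Counter(tokenize(d)) for d in docs]
    n_docs = len(docs)
    scores = []
    for counts in doc_counts:
        score = 0.0
        for term in tokenize(query):
            df = sum(1 for c in doc_counts if term in c)  # document frequency
            if df == 0 or term not in counts:
                continue
            idf = math.log(n_docs / df) + 1.0  # rarer terms weigh more
            score += counts[term] * idf        # token scores are summed
        scores.append(score)
    return scores


docs = [
    "time flies like an arrow",
    "fruit flies like a banana",
    "a fruit basket with a banana",
]
scores = tf_idf_scores("fruit flies", docs)
print([round(s, 2) for s in scores])  # [1.41, 2.81, 1.41]
```

The second document wins because it hits both terms, but the first and third tie: to the engine, "time flies" and "a fruit basket" are equally good partial matches for "fruit flies", because it sees two independent tokens rather than one noun phrase.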
Better Search: Detecting Noun Phrases
Can algorithms be used to detect noun phrases?
Yes, but not perfectly, and they may need too much
CPU at query time.
Another way is to use knowledge bases. That is a lot of
extra work, but in some cases we already have
one: the search index itself!
Better Search: Detecting Noun Phrases
The basic technique is called “autophrasing” –
recognizing when more than one word
represents just one thing.
Autophrasing – uses an extra knowledge-base
file “autophrases.txt”
Query Autofiltering – uses the phrases that are
stored as metadata values in the index.
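A minimal sketch of the autophrasing idea, assuming a greedy longest-match over a phrase list and an underscore as the joining character. The function name `autophrase` and the sample phrases are illustrative only; this is not the code of the actual Solr token filter.

```python
def autophrase(tokens, phrases, joiner="_"):
    """Replace known multi-word phrases with single tokens, longest match first."""
    phrase_set = {tuple(p.lower().split()) for p in phrases}
    max_len = max((len(p) for p in phrase_set), default=1)
    out, i = [], 0
    while i < len(tokens):
        # Try the longest candidate starting at position i, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = tuple(t.lower() for t in tokens[i:i + n])
            if candidate in phrase_set:
                out.append(joiner.join(candidate))
                i += n
                break
        else:
            # No phrase starts here: emit the single token unchanged (lower-cased).
            out.append(tokens[i].lower())
            i += 1
    return out


# Stand-in for the contents of a phrase file such as autophrases.txt.
phrases = ["new york", "new york city", "big apple", "empire state"]
print(autophrase("I heart the Big Apple".split(), phrases))
# ['i', 'heart', 'the', 'big_apple']
print(autophrase("This document is about new york city".split(), phrases))
# ['this', 'document', 'is', 'about', 'new_york_city']
```

Because each phrase becomes one token, a query for "new york city" can no longer match a document on "new" or "city" alone.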
Multi-term Synonym Problem
The subject was inspired by an old JIRA ticket: LUCENE-1622

“if multi-word synonyms are indexed together with the original
token stream (at overlapping positions), then a query for a partial
synonym sequence (e.g., ‘big’ in the synonym ‘big apple’ for
‘new york city’) causes the document to match”
(or “apple”, which will hit on my blog post if you crawl lucidworks.com!)
Sausagization
From Mike McCandless's blog, Changing Bits: “Lucene's TokenStreams are actually graphs!”
• This means certain phrase queries should match but don't (e.g.: "hotspot is down"), and other phrase
queries shouldn't match but do (e.g.: "fast hotspot fi").
• Other cases do work correctly (e.g.: "fast hotspot"). We refer to this "lossy serialization" as sausagization,
because the incoming graph is unexpectedly turned from a correct word lattice into an incorrect sausage.
• This limitation is challenging to fix: it requires changing the index format (and Codec APIs) to store an
additional int position length per position, and then fixing positional queries to respect this value.
http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html
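The three phrase cases quoted above can be reproduced with a small sketch. It assumes the indexed sentence is "fast wi fi network is down" and that a synonym rule maps "wi fi network" to "hotspot"; both are assumptions chosen to be consistent with the quoted phrases. The matcher is a toy, not Lucene's PhraseQuery.

```python
# Tokens as (term, start_position, position_length). The synonym "hotspot"
# should span the three positions occupied by "wi fi network".
graph = [
    ("fast", 0, 1),
    ("wi", 1, 1), ("hotspot", 1, 3),
    ("fi", 2, 1),
    ("network", 3, 1),
    ("is", 4, 1),
    ("down", 5, 1),
]


def phrase_matches(phrase, tokens, respect_length):
    """True if the phrase terms chain up, each starting where the previous one ends."""
    terms = phrase.split()

    def step(index, position):
        if index == len(terms):
            return True
        for term, start, length in tokens:
            if term == terms[index] and (position is None or start == position):
                # A "sausage" index forgets the length and treats every token as 1 wide.
                end = start + (length if respect_length else 1)
                if step(index + 1, end):
                    return True
        return False

    return step(0, None)


for phrase in ["hotspot is down", "fast hotspot fi", "fast hotspot"]:
    sausage = phrase_matches(phrase, graph, respect_length=False)
    lattice = phrase_matches(phrase, graph, respect_length=True)
    print(f"{phrase!r}: sausage={sausage} graph={lattice}")
# 'hotspot is down': sausage=False graph=True
# 'fast hotspot fi': sausage=True graph=False
# 'fast hotspot': sausage=True graph=True
```

The only difference between the two columns is whether the position length survives indexing, which is the extra integer per position that the quoted post says the index format would need to store.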
Multi-term Synonym Demo
autophrases.txt
new york
new york state
empire state
new york city
new york new york
big apple
ny ny
city of new york
state of new york
ny state
synonyms.txt
new_york => new_york_state,new_york_city,big_apple,new_york_new_york,ny_ny,nyc,empire_state,ny_state,state_of_new_york
new_york_state,empire_state,ny_state,state_of_new_york
new_york_city,big_apple,new_york_new_york,ny_ny,nyc,city_of_new_york
Multi-term Synonym Demo
This document is about new york state.
This document is about new york city.
There is a lot going on in NYC.
I heart the big apple.
The empire state is a great state.
New York, New York is a hellova town.
I am a native of the great state of New York.
Queries: New York | New York City | New York State (request handlers compared: /select and /autophrase)
Query: Empire State (request handlers compared: /select and /autophrase)
Query Autofiltering
Content Tagging and Intelligent Query
Filtering. Using the search index itself
as the knowledge source:
[Diagram labels: Content, Content Tagging, Search Index, Query, Auto Filtering, The Answer]
Lucene FieldCache “In Action”
Standard “Inverted Index” (Lucene itself):
• Show all documents that have this term value in this field
• Used to get initial set of search result IDs
Uninverted or Forward Index (FieldCache):
• Show all term values that have been indexed in this field
• Can look up the term value for a doc ID
• Used to facet and get display values for doc IDs.
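The difference between the two structures can be sketched with plain dictionaries. This is a conceptual illustration only, assuming single-valued fields; the names `inverted` and `forward` are made up and do not correspond to Lucene classes.

```python
from collections import Counter, defaultdict

# A few field values borrowed from the demo documents later in the deck.
docs = {
    95: {"color": "grey", "brand": "J Crew"},
    154: {"color": "white", "brand": "Joe Boxer"},
    17: {"color": "white", "brand": "Fruit of the Loom"},
}

inverted = defaultdict(set)  # (field, value) -> doc IDs: "which documents have this term?"
forward = defaultdict(dict)  # field -> {doc ID -> value}: "which term does this document have?"
for doc_id, fields in docs.items():
    for field, value in fields.items():
        inverted[(field, value)].add(doc_id)
        forward[field][doc_id] = value

# Inverted index: get the initial set of result IDs.
matching = sorted(inverted[("color", "white")])
print(matching)  # [17, 154]

# Forward index: facet on another field and fetch display values for those IDs.
facets = Counter(forward["brand"][doc_id] for doc_id in matching)
print(dict(facets))  # {'Fruit of the Loom': 1, 'Joe Boxer': 1}

# Forward index: list every value indexed in a field.
print(sorted(set(forward["color"].values())))  # ['grey', 'white']
```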
Query Autofiltering Implementation
Use Lucene FieldCache to build a map of field values
to field names (of string fields)
Add synonym mappings from synonyms.txt and
stemming to this value(s) -> field(s) map
Use this map to discover noun phrases in the query
that correspond to field values in the index – longest
contiguous phrase wins
Build filter or boost queries based on these
discovered mappings
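The steps above can be sketched as follows. A plain dictionary stands in for the FieldCache, a small synonym table stands in for both synonyms.txt and stemming, and all names (`build_value_map`, `autofilter`) and sample records are assumptions for illustration, not the component's actual code.

```python
def build_value_map(docs, synonyms):
    """Map each lower-cased field value (and its synonyms) to (field, value) pairs."""
    value_map = {}
    for doc in docs:
        for field, value in doc.items():
            value_map.setdefault(value.lower(), set()).add((field, value))
    for alias, canonical in synonyms.items():
        if canonical.lower() in value_map:
            value_map[alias.lower()] = set(value_map[canonical.lower()])
    return value_map


def autofilter(query, value_map):
    """Return (filter queries, leftover terms); the longest contiguous phrase wins."""
    tokens = query.lower().split()
    filters, leftover, i = [], [], 0
    while i < len(tokens):
        for n in range(len(tokens) - i, 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in value_map:
                for field, value in sorted(value_map[phrase]):
                    filters.append(f'{field}:"{value}"')
                i += n
                break
        else:
            leftover.append(tokens[i])  # not a known field value: keep as a free-text term
            i += 1
    return filters, leftover


# Hypothetical index content and synonym table.
docs = [
    {"color": "red", "product_type": "socks"},
    {"brand": "Red Lion", "product_type": "socks"},
    {"color": "white", "product_type": "dress shirt"},
]
synonyms = {"scarlet": "red", "dress shirts": "dress shirt"}
value_map = build_value_map(docs, synonyms)

print(autofilter("red socks", value_map))
# (['color:"red"', 'product_type:"socks"'], [])
print(autofilter("Red Lion socks", value_map))
# (['brand:"Red Lion"', 'product_type:"socks"'], [])
```

Trying the longest phrase first is what keeps "Red Lion" from being read as the color red followed by an unknown word. A value that occurs in more than one field would yield one filter per field here; how such ambiguity should be resolved is left open in this sketch.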
QueryAutoFilteringComponent
Solr SearchComponent
github: https://github.com/LucidWorks/query-autofiltering-component
JIRA: SOLR-7539
<requestHandler name="/autofilter" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">text</str>
  </lst>
  <arr name="first-components">
    <str>queryAutofiltering</str>
  </arr>
</requestHandler>
<searchComponent name="queryAutofiltering"
                 class="org.apache.solr.handler.component.QueryAutoFilteringComponent" />
Query Autofiltering Demo
Hypothetical eCommerce app for a fictional department store
• Metadata has Noun Phrases!
<doc>
<field name="id">95</field>
<field name="product_type">sweat shirt</field>
<field name="product_category">shirt</field>
<field name="style">V neck</field>
<field name="style">short sleeve</field>
<field name="brand">J Crew</field>
<field name="color">grey</field>
<field name="material">cotton</field>
<field name="consumer_type">womens</field>
</doc>
<doc>
<field name="id">154</field>
<field name="product_type">crew socks</field>
<field name="product_category">socks</field>
<field name="color">white</field>
<field name="brand">Joe Boxer</field>
<field name="consumer_type">mens</field>
</doc>
<doc>
<field name="id">17</field>
<field name="product_type">boxer shorts</field>
<field name="product_category">underwear</field>
<field name="color">white</field>
<field name="brand">Fruit of the Loom</field>
<field name="consumer_type">mens</field>
</doc>
Query Autofiltering – Basic Behavior
q = red socks -> fq=color:red&fq=product_type:socks
or bq=(color:red AND product_type:socks)^20
q = Red Lion socks -> fq=brand:"Red Lion"&fq=product_type:socks
q = scarlet Chaise Lounge -> color:red AND product_type:"Lounge Chair"
q = white dress shirts -> color:white AND product_type:"dress shirt"
Dealing With “Unstructured” Text
The term itself is evidence that we think of language as
unstructured, when we know that it actually is not. It has to have
structure, or we couldn’t communicate very well.
“The Lady Is A Tramp” vs “Lady And The Tramp”
Dealing with unstructured text means better handling of phrases.
Little words, like “if”, can carry big meaning!
Classification Technologies
Machine Learning
• Automated vs Semi-Automated
Natural Language Processing (NLP)
• Parts Of Speech
Taxonomy / Ontology
• Relationships
• Handles Phrases naturally
• Knows what is what and what is related to what!
Ontologies Designed for Search
Category Nodes – ‘parent’ nodes
that can have child nodes,
including:
• Sub Categories
• Evidence Nodes
Evidence Nodes – tend to be leaf
nodes (with no children) and contain
key terms (synonyms)
• May contain “rules”, e.g. (if the text contains term a and
term b but not term c)
• Evidence Nodes can have more than one
category node parent
Hits on Evidence Nodes add to the cumulative score of a Category Node.
Scores can be diluted as they traverse the graph – so that the nearest
category gets the strongest ‘vote’.
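A minimal sketch of this scoring scheme, assuming single-word key terms, no rules, and a dilution factor of 0.5 per step up the tree. The class names, the factor and the sample key terms are all assumptions made for the example.

```python
class CategoryNode:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


class EvidenceNode:
    def __init__(self, key_terms, parents):
        self.key_terms = {t.lower() for t in key_terms}
        self.parents = parents  # an evidence node may have several category parents


def classify(text, evidence_nodes, dilution=0.5):
    """Sum evidence hits per category; each step up the tree multiplies the vote by `dilution`."""
    words = set(text.lower().split())
    scores = {}
    for node in evidence_nodes:
        hits = len(node.key_terms & words)
        if not hits:
            continue
        for category in node.parents:
            weight = float(hits)
            while category is not None:
                scores[category.name] = scores.get(category.name, 0.0) + weight
                weight *= dilution  # the nearest category gets the strongest vote
                category = category.parent
    return scores


root = CategoryNode("Fortune 100 Companies")
health = CategoryNode("Health Care", parent=root)
pharma = CategoryNode("Pharmaceuticals", parent=health)
evidence = [EvidenceNode(["drug", "vaccine", "prescription"], parents=[pharma])]

scores = classify("new vaccine and drug trial results", evidence)
print(scores)
# {'Pharmaceuticals': 2.0, 'Health Care': 1.0, 'Fortune 100 Companies': 0.5}
```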
Fortune 100 Companies
Energy
Financial Services
• Investment Banks
• Commercial Banks
Health Care
• Health Insurance
• HMO
• Medical Devices
• Pharmaceuticals
Hospitality
Manufacturing
• Aircraft
• Automobiles
• Electrical Equipment
Corporations
• US
• British
• Chinese
• French
• German
• Japanese
• Russian
• +
The Basic Search “Use Case”
Traditional – brief display: snippeting, hyperlinks and paging
• Faceted Navigation
• Highlighting
• Need To RETHINK for Mobile!!!
Query Formulation –> Result Inspection –> Query Refinement
Shortening The Loop
Query Suggestion (aka autocomplete,
typeahead)
• “Predictive” search
• Single field restriction
Recommendation
• Query – result – click – store – aggregate
• Boosting results or Suggesting queries
Best Bets (Query Elevation) – i.e. Punting
• Spotlighting
• Making it dynamic
Faceting
• Takes advantage of classification tagging
• Can be used to generate multi-field
phrases for suggestion
Inferential Search
• “I’m Feeling Lucky”
• Query Autofiltering
Enhanced Search: Pipelines
Document and Query Pre-Processing
Internal to Solr:
• Update Request Processor
• Data Import Handler (DIH)
• Search Component Chain
Big Data = Big Problem or just a Big Opportunity:
• Hadoop – Solr
• Spark – Solr
• Morphlines
External to Solr:
• Custom ETL + SolrJ Integration
• Apache UIMA *
• DIH Client (SOLR-7188)
• Lucidworks Fusion
• Modular Informatic Designs framework
(coming soon to Open Source?)
Index Pipelines – Good Ole ETL + ______
Annotations!

Subject - Verb - Object
Entity Extractors – Identify Subject
and Object (noun phrases)
Annotations – mark locations of
entities in document
Discover Facts from Semantic Patterns
• $Person joined $Company
• $Drug is used to treat $Disease
• $Company acquired $Company
• $Person wrote $Song
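Once entities are annotated, matching patterns like these is mostly bookkeeping. The sketch below assumes an entity extractor has already run and produced a list that alternates plain text with `(entity_type, text)` annotations; that representation, the function name and the sample names are hypothetical and unrelated to UIMA's actual data model.

```python
PATTERNS = [
    ("Person", "joined", "Company"),
    ("Drug", "is used to treat", "Disease"),
    ("Company", "acquired", "Company"),
    ("Person", "wrote", "Song"),
]


def extract_facts(segments, patterns=PATTERNS):
    """Find subject-verb-object triples in a list of strings and (type, text) annotations."""
    facts = []
    for i in range(len(segments) - 2):
        subj, verb, obj = segments[i:i + 3]
        # A fact needs an annotated subject, plain connecting text, and an annotated object.
        if not (isinstance(subj, tuple) and isinstance(verb, str) and isinstance(obj, tuple)):
            continue
        for subj_type, verb_phrase, obj_type in patterns:
            if subj[0] == subj_type and verb.strip().lower() == verb_phrase and obj[0] == obj_type:
                facts.append((subj[1], verb_phrase, obj[1]))
    return facts


# Hypothetical annotated text: "Acme acquired Globex. Alice joined Acme."
segments = [
    ("Company", "Acme"), " acquired ", ("Company", "Globex"), ". ",
    ("Person", "Alice"), " joined ", ("Company", "Acme"), ".",
]
facts = extract_facts(segments)
print(facts)
# [('Acme', 'acquired', 'Globex'), ('Alice', 'joined', 'Acme')]
```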
Watson used IBM’s (now Apache’s) UIMA
(+40,000 PCs)
Jeopardy is a “guess the subject, given the object
and verb, posed as a question” game
Who Needs Query Pipelines?
Who, What, Where, When:
• Security Filtering - Entitlements
• Dynamic Boost Block based on Preferences, Search History
• Geo Filtering – IP to geolocation
• Content Spotlighting based on time, place and search history
• Query Introspection – Infer User Intent
Lucidworks Fusion: Pipelines Proliferate
Documents and Queries are dynamic Metadata Objects
• PipelineDocument and QueryRequestAndResponse, respectively
Lots of Stages – more coming with every release
• Metadata -> metadata – lookup, clone, map, join
• Content -> metadata – extract, transform, classify
Index Pipelines: One-Way
Query Pipelines: Round-Trip
• Both pre- and post-Query filtering opportunities
[Diagram: Connector or Query -> Stage -> Stage -> Stage -> Stage -> Solr Cloud]
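The stage-chaining idea can be sketched generically. This is not the Fusion API: the stage functions, the request dictionary, and the `acl` and `brand` field names are all hypothetical, chosen only to show how each stage enriches a query before it reaches Solr.

```python
def security_filter(request):
    """Restrict results to the user's entitlements (hypothetical `acl` field)."""
    fq = request.get("fq", []) + [f"acl:{request['group']}"]
    return {**request, "fq": fq}


def preference_boost(request):
    """Boost a preferred brand if the request carries one (hypothetical `brand` field)."""
    brand = request.get("preferred_brand")
    if brand is None:
        return request
    return {**request, "bq": f'brand:"{brand}"^5'}


def run_pipeline(request, stages):
    """Pass the request through each stage in order and return the enriched result."""
    for stage in stages:
        request = stage(request)
    return request


params = run_pipeline(
    {"q": "socks", "group": "staff", "preferred_brand": "Joe Boxer"},
    [security_filter, preference_boost],
)
print(params["fq"], params["bq"])
# ['acl:staff'] brand:"Joe Boxer"^5
```

Because every stage takes a request and returns a request, stages can be added, removed or reordered without touching the others.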
Thank you!
lucidworks.com
Ted Sullivan
Senior Solutions Architect

Intro to Vectorization Concepts - GaTech cse6242Josh Patterson
 
Full text search
Full text searchFull text search
Full text searchdeleteman
 
The well tempered search application
The well tempered search applicationThe well tempered search application
The well tempered search applicationTed Sullivan
 
Relevancy and synonyms - ApacheCon NA 2013 - Portland, Oregon, USA
Relevancy and synonyms - ApacheCon NA 2013 - Portland, Oregon, USARelevancy and synonyms - ApacheCon NA 2013 - Portland, Oregon, USA
Relevancy and synonyms - ApacheCon NA 2013 - Portland, Oregon, USALeonardo Dias
 
Elasticsearch and Spark
Elasticsearch and SparkElasticsearch and Spark
Elasticsearch and SparkAudible, Inc.
 
Intro to Elasticsearch
Intro to ElasticsearchIntro to Elasticsearch
Intro to ElasticsearchClifford James
 
Advanced full text searching techniques using Lucene
Advanced full text searching techniques using LuceneAdvanced full text searching techniques using Lucene
Advanced full text searching techniques using LuceneAsad Abbas
 
Pycon ke word vectors
Pycon ke   word vectorsPycon ke   word vectors
Pycon ke word vectorsOsebe Sammi
 
Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...
Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...
Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...Lucidworks
 
Object-Oriented Programming in Java (Module 1)
Object-Oriented Programming in Java (Module 1)Object-Oriented Programming in Java (Module 1)
Object-Oriented Programming in Java (Module 1)muhammadmubinmacadad2
 

Semelhante a Webinar: Simpler Semantic Search with Solr (20)

Using topic modelling frameworks for NLP and semantic search
Using topic modelling frameworks for NLP and semantic searchUsing topic modelling frameworks for NLP and semantic search
Using topic modelling frameworks for NLP and semantic search
 
Data Science - Part XI - Text Analytics
Data Science - Part XI - Text AnalyticsData Science - Part XI - Text Analytics
Data Science - Part XI - Text Analytics
 
A Multifaceted Look At Faceting - Ted Sullivan, Lucidworks
A Multifaceted Look At Faceting - Ted Sullivan, LucidworksA Multifaceted Look At Faceting - Ted Sullivan, Lucidworks
A Multifaceted Look At Faceting - Ted Sullivan, Lucidworks
 
The search engine index
The search engine indexThe search engine index
The search engine index
 
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and VocabulariesHaystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
Haystack 2018 - Algorithmic Extraction of Keywords Concepts and Vocabularies
 
Search explained T3DD15
Search explained T3DD15Search explained T3DD15
Search explained T3DD15
 
The need for sophistication in modern search engine implementations
The need for sophistication in modern search engine implementationsThe need for sophistication in modern search engine implementations
The need for sophistication in modern search engine implementations
 
Distributed Natural Language Processing Systems in Python
Distributed Natural Language Processing Systems in PythonDistributed Natural Language Processing Systems in Python
Distributed Natural Language Processing Systems in Python
 
Text Mining
Text MiningText Mining
Text Mining
 
Intro to Vectorization Concepts - GaTech cse6242
Intro to Vectorization Concepts - GaTech cse6242Intro to Vectorization Concepts - GaTech cse6242
Intro to Vectorization Concepts - GaTech cse6242
 
Ir 03
Ir   03Ir   03
Ir 03
 
Full text search
Full text searchFull text search
Full text search
 
The well tempered search application
The well tempered search applicationThe well tempered search application
The well tempered search application
 
Relevancy and synonyms - ApacheCon NA 2013 - Portland, Oregon, USA
Relevancy and synonyms - ApacheCon NA 2013 - Portland, Oregon, USARelevancy and synonyms - ApacheCon NA 2013 - Portland, Oregon, USA
Relevancy and synonyms - ApacheCon NA 2013 - Portland, Oregon, USA
 
Elasticsearch and Spark
Elasticsearch and SparkElasticsearch and Spark
Elasticsearch and Spark
 
Intro to Elasticsearch
Intro to ElasticsearchIntro to Elasticsearch
Intro to Elasticsearch
 
Advanced full text searching techniques using Lucene
Advanced full text searching techniques using LuceneAdvanced full text searching techniques using Lucene
Advanced full text searching techniques using Lucene
 
Pycon ke word vectors
Pycon ke   word vectorsPycon ke   word vectors
Pycon ke word vectors
 
Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...
Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...
Building Smarter Search Applications Using Built-In Knowledge Graphs and Quer...
 
Object-Oriented Programming in Java (Module 1)
Object-Oriented Programming in Java (Module 1)Object-Oriented Programming in Java (Module 1)
Object-Oriented Programming in Java (Module 1)
 

Mais de Lucidworks

Search is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce StrategySearch is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce StrategyLucidworks
 
Drive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in SalesforceDrive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in SalesforceLucidworks
 
How Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant ProductsHow Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant ProductsLucidworks
 
Lucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product DiscoveryLucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product DiscoveryLucidworks
 
Connected Experiences Are Personalized Experiences
Connected Experiences Are Personalized ExperiencesConnected Experiences Are Personalized Experiences
Connected Experiences Are Personalized ExperiencesLucidworks
 
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...Lucidworks
 
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...Lucidworks
 
Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020Lucidworks
 
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Lucidworks
 
AI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and RosetteAI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and RosetteLucidworks
 
The Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentThe Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentLucidworks
 
Webinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - EuropeWebinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - EuropeLucidworks
 
Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19Lucidworks
 
Applying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 ResearchApplying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 ResearchLucidworks
 
Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1Lucidworks
 
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyWebinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyLucidworks
 
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Lucidworks
 
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceApply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceLucidworks
 
Webinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise SearchWebinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise SearchLucidworks
 
Why Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondWhy Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondLucidworks
 

Mais de Lucidworks (20)

Search is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce StrategySearch is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce Strategy
 
Drive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in SalesforceDrive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in Salesforce
 
How Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant ProductsHow Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant Products
 
Lucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product DiscoveryLucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
 
Connected Experiences Are Personalized Experiences
Connected Experiences Are Personalized ExperiencesConnected Experiences Are Personalized Experiences
Connected Experiences Are Personalized Experiences
 
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
 
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
 
Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020
 
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
 
AI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and RosetteAI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and Rosette
 
The Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentThe Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual Moment
 
Webinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - EuropeWebinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - Europe
 
Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19
 
Applying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 ResearchApplying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 Research
 
Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1
 
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyWebinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
 
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
 
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceApply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
 
Webinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise SearchWebinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise Search
 
Why Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondWhy Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and Beyond
 

Último

Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AIABDERRAOUF MEHENNI
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 

Último (20)

Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 

Webinar: Simpler Semantic Search with Solr

  • 1.
  • 2. Ted Sullivan Simpler Semantic Search lucidworks.com Senior Solutions Architect
  • 3. Building Better Search Applications
 Search is about Technology & Language
 • These are difficult but also different problems
 • Solving the “language problem” requires that we understand how language is used in search
 • We understand language at the semantic level, where “meaning” or intent lives
 • Search engines deal with language at the syntactic level
 • Most problems relating to search quality stem from this basic “disconnect”: the “what” vs “what words” dichotomy
  • 4. Technology – Horizontal Concerns
 Search applications share these requirements with other information retrieval systems
 • Performance – returning results in HTT (Human Tolerable Time)
 • Scalability – being able to search “billions and billions” of documents while serving thousands or tens of thousands of users at a time
 • Reliability – fault tolerance, fail-over, redundancy
 • Maintainability – easy to upgrade; the search index can be kept current in the face of rapidly changing content
 • Usability – user experience is critical to success. For UI and UX, mobile technology is a game changer here!
  • 5. Language – Vertical Concerns
 These requirements are more specific to search systems.
 • Accuracy – returning the “correct” results
 • Precision – few false positives
 • Recall – few false negatives
 • Relevance – returning the “best” results at the top
 Returning the wrong results very fast is not necessarily a good thing. Returning too many results can affect performance.
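The precision and recall definitions above can be made concrete with a few lines of Python. This is only an illustrative sketch: the function name `precision_recall` and the document IDs are made up for the example and do not come from the slides.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a single query.

    precision = share of returned documents that are relevant (few false positives)
    recall    = share of relevant documents that were returned (few false negatives)
    """
    returned, relevant = set(returned), set(relevant)
    true_positives = len(returned & relevant)
    precision = true_positives / len(returned) if returned else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result set: four documents returned, three are actually relevant.
p, r = precision_recall(returned=["d1", "d2", "d3", "d4"],
                        relevant=["d1", "d2", "d5"])
print(f"precision={p:.2f} recall={r:.2f}")
```

Returning more documents can only keep recall the same or raise it, but it usually lowers precision, which is one way to read the slide's warning about returning too many results.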
  • 6. Time flies like an arrow. Fruit flies like a banana. Our mental image for the second sentence depends on how we “parse” it. It depends on what the subject noun or noun phrase is.
  • 7. The subject can be “fruit” or “fruit flies”. This decision changes the verb, which is either “flies” or “like” respectively: [Fruit] flies like a banana vs. [Fruit flies] like a banana.
  • 8. We can do this because we know that both “fruit” and “fruit flies” represent single concepts, even though “fruit flies” is two words, i.e. a “noun phrase”: [Fruit] flies like a banana vs. [Fruit flies] like a banana.
  • 9. Search algorithms and semantics
 Tokenization plus vector mathematics (TF/IDF or one of its cousins): “bag-of-words”
 Algorithmic tweaks (enhanced bag-of-words):
 1. Some fields are more relevant than others
 2. Hitting on more terms in the query is better than hitting on fewer (token scores are summed)
 3. The nearer the query terms are to each other in the document the better; same order as the query is best
 4. Getting 0 results provides no feedback, so OR is safer than AND (item 2 already gives us a “fuzzy” AND)
 Problem: search engines don’t understand semantics
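To see why a bag-of-words scorer misses the noun phrase, here is a deliberately simplified TF-IDF sketch with OR semantics and summed token scores (item 2 above). It is not Lucene's actual similarity formula; the `tokenize` and `tf_idf_scores` helpers and the third document are assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Lower-case and split on whitespace; a stand-in for a real analyzer.
    return text.lower().replace(".", "").replace(",", "").split()


def tf_idf_scores(query, docs):
    """Score each document as the sum of tf * idf over the query terms it contains."""
    bags = [Counter(tokenize(d)) for d in docs]
    n = len(docs)
    scores = []
    for bag in bags:
        score = 0.0
        for term in tokenize(query):
            df = sum(1 for b in bags if term in b)   # document frequency
            if df and term in bag:
                score += bag[term] * math.log(1 + n / df)
        scores.append(score)
    return scores


docs = ["Time flies like an arrow",
        "Fruit flies like a banana",
        "I like fruit"]
scores = tf_idf_scores("fruit flies", docs)
for doc, score in zip(docs, scores):
    print(f"{score:.3f}  {doc}")
```

The document about fruit flies ranks first, but the arrow sentence and the fruit sentence still match and tie with each other, because each query word is scored independently of the concept it belongs to.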
  • 10. Better Search: Detecting Noun Phrases
 Can algorithms be used to detect noun phrases? Yes, but not perfectly, and they may need too much CPU at query time.
 Another way is to use knowledge bases. That is a lot of extra work, but in some cases we already have one: the search index itself!
  • 11. Better Search: Detecting Noun Phrases
 The basic technique is called “autophrasing”: recognizing when more than one word represents just one thing.
 • Autophrasing – uses an extra knowledge-base file, “autophrases.txt”
 • Query Autofiltering – uses the phrases that are stored as metadata values in the index
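A minimal sketch of the autophrasing idea, assuming a greedy longest-match over an in-memory phrase list (standing in for “autophrases.txt”) and an underscore as the joining character. The function name `autophrase` is hypothetical, and this is not the actual Solr filter implementation.

```python
def autophrase(tokens, phrases, sep="_"):
    """Collapse known multi-word phrases into single tokens, longest match first."""
    phrase_set = {tuple(p.lower().split()) for p in phrases}
    max_len = max((len(p) for p in phrase_set), default=1)
    out, i = [], 0
    while i < len(tokens):
        # Try the longest candidate window first, down to two words.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            if tuple(tokens[i:i + n]) in phrase_set:
                out.append(sep.join(tokens[i:i + n]))
                i += n
                break
        else:
            out.append(tokens[i])   # no phrase starts here; keep the single word
            i += 1
    return out


print(autophrase("fruit flies like a banana".split(), ["fruit flies"]))
print(autophrase("new york state".split(), ["new york", "new york state"]))
```

Once “fruit flies” is a single token, the engine can treat it as one thing at both index and query time instead of two unrelated words.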
  • 12. Multi-term Synonym Problem
 Subject was inspired by an old JIRA ticket, LUCENE-1622:
 “if multi-word synonyms are indexed together with the original token stream (at overlapping positions), then a query for a partial synonym sequence (e.g., ‘big’ in the synonym ‘big apple’ for ‘new york city’) causes the document to match”
 (or “apple”, which will hit on my blog post if you crawl lucidworks.com!)
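The partial-match problem quoted above can be reproduced with a toy index. This sketch ignores token positions entirely and only shows how the individual words of an injected synonym become searchable on their own; the helper names and the two sample documents are assumptions for illustration.

```python
from collections import defaultdict


def contains_phrase(tokens, phrase_tokens):
    n = len(phrase_tokens)
    return any(tokens[i:i + n] == phrase_tokens for i in range(len(tokens) - n + 1))


def index_with_synonyms(docs, synonyms):
    """Build term -> doc IDs postings, indexing each synonym's words next to the original text."""
    postings = defaultdict(set)
    for doc_id, text in enumerate(docs):
        tokens = text.lower().split()
        for tok in tokens:
            postings[tok].add(doc_id)
        for phrase, alternatives in synonyms.items():
            if contains_phrase(tokens, phrase.split()):
                for alt in alternatives:
                    for word in alt.split():      # each word is indexed separately
                        postings[word].add(doc_id)
    return postings


docs = ["i moved to new york city", "i ate a big apple pie"]
postings = index_with_synonyms(docs, {"new york city": ["big apple"]})
print(sorted(postings["big"]))
```

Document 0 never contains the word “big”, yet it matches a query for it, which is the false positive the ticket describes.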
  • 13. Sausagization
 From Mike McCandless's blog, Changing Bits: “Lucene's TokenStreams are actually graphs!”
 • This means certain phrase queries should match but don't (e.g.: "hotspot is down"), and other phrase queries shouldn't match but do (e.g.: "fast hotspot fi").
 • Other cases do work correctly (e.g.: "fast hotspot"). We refer to this "lossy serialization" as sausagization, because the incoming graph is unexpectedly turned from a correct word lattice into an incorrect sausage.
 • This limitation is challenging to fix: it requires changing the index format (and Codec APIs) to store an additional int position length per position, and then fixing positional queries to respect this value.
 http://blog.mikemccandless.com/2012/04/lucenes-tokenstreams-are-actually.html
  • 14. Multi-term Synonym Demo
 autophrases.txt:
 new york
 new york state
 empire state
 new york city
 new york new york
 big apple
 ny ny
 city of new york
 state of new york
 ny state
 synonyms.txt:
 new_york => new_york_state, new_york_city, big_apple, new_york_new_york, ny_ny, nyc, empire_state, ny_state, state_of_new_york
 new_york_state, empire_state, ny_state, state_of_new_york
 new_york_city, big_apple, new_york_new_york, ny_ny, nyc, city_of_new_york
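A rough sketch of what autophrasing does with a phrase list like the one above: the longest known phrase starting at each word is collapsed into one underscore-joined token, which can then line up with single-token entries in synonyms.txt. This is plain Python for illustration only; the real filter runs inside the Lucene analysis chain, and the `autophrase` function and its whitespace tokenization are assumptions of this sketch.

```python
AUTOPHRASES = [
    "new york", "new york state", "empire state", "new york city",
    "new york new york", "big apple", "ny ny", "city of new york",
    "state of new york", "ny state",
]


def autophrase(text, phrases=AUTOPHRASES):
    """Collapse the longest known phrase at each position into one token."""
    phrase_set = {tuple(p.split()) for p in phrases}
    max_len = max(len(p) for p in phrase_set)
    words = text.lower().split()  # naive tokenization: lowercase, split on whitespace
    out, i = [], 0
    while i < len(words):
        # Try the longest candidate first so "new york city" beats "new york".
        for n in range(min(max_len, len(words) - i), 1, -1):
            candidate = tuple(words[i:i + n])
            if candidate in phrase_set:
                out.append("_".join(candidate))
                i += n
                break
        else:
            # No multi-word phrase starts here: keep the single word.
            out.append(words[i])
            i += 1
    return out
```

For example, `autophrase("This document is about new york city")` ends with the single token `new_york_city` instead of three separate words, so a query for "big" or "apple" alone no longer hits a document that only mentions the Big Apple as a phrase.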
  • 15. Multi-term Synonym Demo This document is about new york state. This document is about new york city. There is a lot going on in NYC. I heart the big apple. The empire state is a great state. New York, New York is a hellova town. I am a native of the great state of New York. New York New York City New York State /select /autophrase
  • 16. Multi-term Synonym Demo This document is about new york state. This document is about new york city. There is a lot going on in NYC. I heart the big apple. The empire state is a great state. New York, New York is a hellova town. I am a native of the great state of New York. Empire State /select /autophrase
  • 17. Query Autofiltering Content Tagging and Intelligent Query Filtering. Using the search index itself as the knowledge source.
 [Diagram labels: Content, Content Tagging, Search Index, Query, Auto Filtering, The Answer]
  • 18. Lucene FieldCache “In Action” Standard “Inverted Index” (Lucene itself): • Show all documents that have this term value in this field • Used to get initial set of search result IDs Uninverted or Forward Index (FieldCache): • Show all term values that have been indexed in this field • Can lookup term value for a doc ID • Used to facet and get display values for doc IDs.
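The two views can be sketched with ordinary dictionaries. This is only an illustration of the idea, not the Lucene FieldCache API; the sample documents reuse a few field values from the demo documents in this deck, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Toy documents: doc ID -> field values (assumed sample data).
DOCS = {
    95: {"product_type": "sweat shirt", "color": "grey"},
    154: {"product_type": "crew socks", "color": "white"},
    17: {"product_type": "boxer shorts", "color": "white"},
}


def build_inverted(docs, field):
    """Inverted index: term value -> set of doc IDs that contain it."""
    inverted = defaultdict(set)
    for doc_id, fields in docs.items():
        inverted[fields[field]].add(doc_id)
    return inverted


def build_forward(docs, field):
    """Uninverted (forward) index: doc ID -> term value."""
    return {doc_id: fields[field] for doc_id, fields in docs.items()}


def facet_counts(forward, doc_ids):
    """Facet a result set by looking up each doc ID's value in the forward index."""
    return Counter(forward[doc_id] for doc_id in doc_ids)
```

The inverted view answers "which documents are white?", while the forward view answers "what colors exist, and what color is document 95?", which is the direction faceting and query autofiltering need.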
  • 19. Query Autofiltering Implementation Use Lucene FieldCache to build a map of field values to field names (of string fields) Add synonym mappings from synonyms.txt and stemming to this value(s) -> field(s) map Use this map to discover noun phrases in the query that correspond to field values in the index – longest contiguous phrase wins Build filter or boost queries based on these discovered mappings
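The steps above can be sketched in a few lines of Python. This is not the component's actual code: the `INDEXED` values, the `SYNONYMS` table, the trailing-"s" stemmer and all function names are assumptions made for illustration, and values are kept lowercase for simplicity.

```python
# Field -> values, standing in for what the FieldCache would expose (assumed data).
INDEXED = {
    "color": ["red", "white", "grey"],
    "product_type": ["socks", "dress shirt", "lounge chair"],
    "brand": ["red lion", "j crew"],
}
# Stand-in for synonyms.txt: synonym -> indexed value (assumed data).
SYNONYMS = {"scarlet": "red", "chaise lounge": "lounge chair"}


def build_value_map(indexed, synonyms):
    """Map every known value (and its synonyms) to a (field, value) pair."""
    value_map = {}
    for field, values in indexed.items():
        for value in values:
            value_map[value.lower()] = (field, value)
    for synonym, canonical in synonyms.items():
        if canonical in value_map:
            value_map[synonym] = value_map[canonical]
    return value_map


def lookup(phrase, value_map):
    """Exact match first, then a naive plural stem (strip one trailing 's')."""
    if phrase in value_map:
        return value_map[phrase]
    if phrase.endswith("s"):
        return value_map.get(phrase[:-1])
    return None


def autofilter(query, value_map):
    """Scan left to right; the longest contiguous phrase that maps to a field wins."""
    words = query.lower().split()
    filters, leftover, i = [], [], 0
    while i < len(words):
        for n in range(len(words) - i, 0, -1):
            hit = lookup(" ".join(words[i:i + n]), value_map)
            if hit:
                field, value = hit
                filters.append(f'{field}:"{value}"')
                i += n
                break
        else:
            leftover.append(words[i])  # no field value found for this word
            i += 1
    return filters, leftover
```

Each entry in `filters` could be sent as its own `fq` parameter, or the entries could be joined with AND into a single boosted `bq`; words left in `leftover` would stay in the free-text query.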
  • 20. QueryAutoFilteringComponent Solr SearchComponent
 github: https://github.com/LucidWorks/query-autofiltering-component
 JIRA: SOLR-7539
 <requestHandler name="/autofilter" class="solr.SearchHandler">
   <lst name="defaults">
     <str name="echoParams">explicit</str>
     <int name="rows">10</int>
     <str name="df">text</str>
   </lst>
   <arr name="first-components">
     <str>queryAutofiltering</str>
   </arr>
 </requestHandler>
 <searchComponent name="queryAutofiltering" class="org.apache.solr.handler.component.QueryAutoFilteringComponent" />
  • 21. Query Autofiltering Demo Hypothetical eCommerce App for a Fictional department store • Metadata has Noun Phrases!
 <doc>
   <field name="id">95</field>
   <field name="product_type">sweat shirt</field>
   <field name="product_category">shirt</field>
   <field name="style">V neck</field>
   <field name="style">short sleeve</field>
   <field name="brand">J Crew</field>
   <field name="color">grey</field>
   <field name="material">cotton</field>
   <field name="consumer_type">womens</field>
 </doc>
 <doc>
   <field name="id">154</field>
   <field name="product_type">crew socks</field>
   <field name="product_category">socks</field>
   <field name="color">white</field>
   <field name="brand">Joe Boxer</field>
   <field name="consumer_type">mens</field>
 </doc>
 <doc>
   <field name="id">17</field>
   <field name="product_type">boxer shorts</field>
   <field name="product_category">underwear</field>
   <field name="color">white</field>
   <field name="brand">Fruit of the Loom</field>
   <field name="consumer_type">mens</field>
 </doc>
  • 22. Query Autofiltering – Basic Behavior
 q = red socks -> fq=color:red&fq=product_type:socks or bq=(color:red AND product_type:socks)^20
 q = Red Lion socks -> fq=brand:"Red Lion"&fq=product_type:socks
 q = scarlet Chaise Lounge -> color:red AND product_type:"Lounge Chair"
 q = white dress shirts -> color:white AND product_type:"dress shirt"
  • 23. Dealing With “Unstructured” Text This term ITSELF is evidence that we think of language as unstructured when we know that it actually is not - It HAS to have structure or we couldn’t communicate very well. “The Lady Is A Tramp” vs “Lady And The Tramp” Dealing with unstructured text means better handling of phrases. Little words – like “if” can have big meaning!
  • 24. Classification Technologies
 Machine Learning • Automated vs Semi-Automated
 Natural Language Processing (NLP) • Parts Of Speech
 Taxonomy / Ontology • Relationships • Handles Phrases naturally • Knows what is what and what is related to what!
  • 25. Ontologies Designed for Search
 Category Nodes – ‘parent’ nodes that can have child nodes, including: • Sub Categories • Evidence Nodes
 Evidence Nodes – tend to be leaf nodes (with no children) and contain keyterms (synonyms) • May contain “rules”, e.g. (if contains term a and term b but not term c) • Evidence Nodes can have more than one category node parent
 Hits on Evidence Nodes add to the cumulative score of a Category Node. Scores can be diluted as they traverse the graph – so that the nearest category gets the strongest ‘vote’.
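A minimal sketch of that scoring idea, under stated assumptions: the node names, keyterms and the `DECAY` factor of 0.5 are made up for illustration, keyterm matching is a naive substring test, and evidence "rules" are left out.

```python
from collections import defaultdict

# Child node -> parent category nodes (hypothetical ontology fragment).
PARENTS = {
    "ev_pharma": ["Pharmaceuticals"],
    "Pharmaceuticals": ["Health Care"],
    "ev_hmo": ["HMO"],
    "HMO": ["Health Insurance"],
    "Health Insurance": ["Health Care"],
}
# Evidence (leaf) nodes and their keyterms (assumed).
EVIDENCE = {
    "ev_pharma": {"drug", "clinical trial"},
    "ev_hmo": {"hmo", "managed care"},
}
DECAY = 0.5  # assumed dilution per hop up the graph


def score_categories(text):
    """Sum evidence hits into category scores, diluting the vote at each hop."""
    text = text.lower()
    scores = defaultdict(float)
    for node, keyterms in EVIDENCE.items():
        hits = sum(1 for term in keyterms if term in text)
        if not hits:
            continue
        # The nearest category gets the full vote; each ancestor gets a diluted one.
        frontier = [(parent, float(hits)) for parent in PARENTS.get(node, [])]
        while frontier:
            category, vote = frontier.pop()
            scores[category] += vote
            frontier.extend((p, vote * DECAY) for p in PARENTS.get(category, []))
    return dict(scores)
```

A document that mentions both keyterms of the pharma evidence node scores 2.0 for "Pharmaceuticals" and only 1.0 for the more distant "Health Care", so the nearest category carries the strongest vote.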
  • 26. Fortune 100 Companies Energy • Financial Services • Investment Banks • Commercial Banks Health Care • Health Insurance • HMO • Medical Devices • Pharmaceuticals Hospitality Manufacturing • Aircraft • Automobiles • Electrical Equipment Corporations • US • British • Chinese • French • German • Japanese • Russian • +
  • 27. Fortune 100 Companies Energy • Financial Services • Investment Banks • Commercial Banks Health Care • Health Insurance • HMO • Medical Devices • Pharmaceuticals Hospitality Manufacturing • Aircraft • Automobiles • Electrical Equipment Corporations • US • British • Chinese • French • German • Japanese • Russian • +
  • 28. The Basic Search “Use Case” Traditional - Brief display – snippeting, hyperlinks and paging • Faceted Navigation • Highlighting • Need To RETHINK for Mobile!!!
 Query Formulation –> Result Inspection –> Query Refinement
  • 29. Shortening The Loop Query Suggestion (aka autocomplete, typeahead) • “Predictive” search • Single field restriction Recommendation • Query – result – click – store – aggregate • Boosting results or Suggesting queries Best Bets (Query Elevation) – i.e. Punting • Spotlighting • Making it dynamic Faceting • Takes advantage of classification tagging • Can be used to generate multi-field phrases for suggestion Inferential Search • “I’m Feeling Lucky” • Query Autofiltering
  • 30. Enhanced Search: Pipelines Document and Query Pre-Processing
 Internal to Solr: • Update Request Processor • Data Import Handler (DIH) • Search Component Chain
 Big Data = Big Problem or just a Big Opportunity: • Hadoop – Solr • Spark – Solr • Morphlines
 External to Solr: • Custom ETL + SolrJ Integration • Apache UIMA * • DIH Client (SOLR-7188) • Lucidworks Fusion • Modular Informatic Designs framework (coming soon to Open Source?)
  • 31. Index Pipelines – Good Ole ETL + ______ Annotations!
 Subject - Verb - Object
 Entity Extractors – Identify Subject and Object (noun phrases)
 Annotations – mark locations of entities in document
 Discover Facts from Semantic Patterns • $Person joined $Company • $Drug is used to treat $Disease • $Company acquired $Company • $Person wrote $Song
 Watson used IBM’s (now Apache’s) UIMA (+40,000 PCs)
 Jeopardy is a “guess subject given object and verb - posed as a question” game
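As a rough illustration of how annotations feed pattern-based fact discovery, the sketch below tags entities from a small lookup table and then checks whether the text between two adjacent annotations matches a known verb phrase. It does not use UIMA; the entity names, the `ENTITIES` gazetteer and both functions are invented for this example.

```python
import re

# Hypothetical gazetteer: surface form -> entity type (made-up names).
ENTITIES = {
    "Alice Smith": "Person",
    "Acme Corp": "Company",
    "Globex": "Company",
    "aspirin": "Drug",
    "headache": "Disease",
}
# Semantic patterns as (subject type, verb phrase, object type).
PATTERNS = [
    ("Person", "joined", "Company"),
    ("Drug", "is used to treat", "Disease"),
    ("Company", "acquired", "Company"),
]


def annotate(text):
    """Mark entity locations as (start, end, type, surface), sorted by offset."""
    annotations = []
    for surface, entity_type in ENTITIES.items():
        for match in re.finditer(re.escape(surface), text):
            annotations.append((match.start(), match.end(), entity_type, surface))
    return sorted(annotations)


def extract_facts(text):
    """Emit (subject, verb, object) when adjacent annotations fit a pattern."""
    annotations = annotate(text)
    facts = []
    for (_, end1, type1, name1), (start2, _, type2, name2) in zip(annotations, annotations[1:]):
        between = text[end1:start2].strip()
        for subject_type, verb, object_type in PATTERNS:
            if (type1, between, type2) == (subject_type, verb, object_type):
                facts.append((name1, verb, name2))
    return facts
```

The annotations carry character offsets, so the verb is found simply by reading the span between a subject annotation and the object annotation that follows it.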
  • 32. Who Needs Query Pipelines? Who, What, Where, When: • Security Filtering - Entitlements • Dynamic Boost Block based on Preferences, Search History • Geo Filtering – IP to geolocation • Content Spotlighting based on time, place and search history • Query Introspection – Infer User Intent
  • 33. Lucidworks Fusion: Pipelines Proliferate
 Documents and Queries are dynamic Metadata Objects • PipelineDocument and QueryRequestAndResponse, respectively
 Lots of Stages – more coming with every release • Metadata -> metadata – lookup, clone, map, join • Content -> metadata – extract, transform, classify
 Index Pipelines: One-Way
 Query Pipelines: Round-Trip • Both pre- and post-Query filtering opportunities
 [Diagram: Connector or Query -> Stage -> Stage -> Stage -> Stage -> Solr Cloud]
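The round-trip shape of a query pipeline can be sketched as a chain of functions over a mutable request and response. This is not the Fusion API: the stage names, the `acl` and `brand` fields, the request keys and the `fake_search` stand-in for Solr are all assumptions for illustration.

```python
def entitlement_filter(request):
    """Pre-query stage: restrict results to the user's groups (hypothetical 'acl' field)."""
    groups = " OR ".join(sorted(request["user_groups"]))
    request["fq"].append(f"acl:({groups})")
    return request


def preference_boost(request):
    """Pre-query stage: boost brands the user prefers."""
    for brand in request.get("preferred_brands", []):
        request["bq"].append(f'brand:"{brand}"^5')
    return request


def spotlight(response):
    """Post-query stage: promote the top document to a spotlight slot."""
    response["spotlight"] = response["docs"][:1]
    return response


def fake_search(request):
    """Stand-in for the search engine call; returns canned documents."""
    return {"request": request, "docs": [{"id": "95"}, {"id": "154"}]}


def run_query_pipeline(request, pre_stages, search, post_stages):
    """Round trip: pre-query stages, then the search call, then post-query stages."""
    for stage in pre_stages:
        request = stage(request)
    response = search(request)
    for stage in post_stages:
        response = stage(response)
    return response
```

An index pipeline would be the one-way half of this: only the chain of stages applied to a document before it is sent to the index.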