"Terms of Endearment": The ElasticSearch query language explained
Clinton Gormley (DRTECH, @clintongormley), YAPC::EU 2011
We can search for "DELETE QUERY" and find "deleteByQuery",
but you can only find what is stored in the database.
Normalise (analyse) both values and search terms:

    value:        "deleteByQuery"  ⇒  'delete', 'by', 'query', 'deletebyquery'
    search term:  "DELETE QUERY"   ⇒  'delete', 'query'
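A minimal sketch of the idea, not ElasticSearch's real analysis chain: the `analyse` helper below is hypothetical and only splits on non-alphanumerics and camelCase boundaries, lowercases, and keeps the whole lowercased word as an extra term. A search matches here when every search term is among the indexed terms.

```perl
use strict;
use warnings;

# Toy analyser (an assumption for illustration only).
sub analyse {
    my ($text) = @_;
    my @terms;
    for my $word ( grep { length $_ } split /[^A-Za-z0-9]+/, $text ) {
        # Split camelCase into parts; a run of capitals stays together.
        my @parts = map { lc $_ } ( $word =~ /[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])/g );
        push @terms, @parts;
        push @terms, lc $word if @parts > 1;    # also keep the whole word
    }
    return @terms;
}

my @indexed = analyse('deleteByQuery');    # delete, by, query, deletebyquery
my @wanted  = analyse('DELETE QUERY');     # delete, query

my %index   = map { $_ => 1 } @indexed;
my $missing = grep { !$index{$_} } @wanted;
print $missing == 0 ? "found\n" : "not found\n";    # found
```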
What is stored in ElasticSearch?
Document (fields ⇒ values), with field types:

    {                                       # object
        tweet  => "Perl is GREAT!",         # string
        posted => "2011-08-15",             # date
        user   => {                         # nested object
            name  => "Clinton Gormley",     # string
            email => 'drtech@cpan.org',     # string
        },
        tags   => [ "perl", "opinion" ],    # array of enums
        posts  => 2,                        # integer
    }
Nested objects flattened:

    {
        tweet        => "Perl is GREAT!",
        posted       => "2011-08-15",
        'user.name'  => "Clinton Gormley",
        'user.email' => 'drtech@cpan.org',
        tags         => [ "perl", "opinion" ],
        posts        => 2,
    }
Values analyzed into terms:

    {
        tweet        => [ 'perl', 'great' ],
        posted       => [ Date(2011-08-15) ],
        'user.name'  => [ 'clinton', 'gormley' ],
        'user.email' => [ 'drtech', 'cpan.org' ],
        tags         => [ 'perl', 'opinion' ],
        posts        => [ 2 ],
    }
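The flattening step can be sketched in a few lines. This is an illustration under assumptions, not ElasticSearch's implementation: `flatten` is a hypothetical helper that walks nested hashes and joins the keys with dots, leaving arrays and scalars as they are.

```perl
use strict;
use warnings;

# Flatten nested hashes into dotted field names such as user.name.
sub flatten {
    my ( $doc, $prefix ) = @_;
    my %flat;
    for my $key ( keys %$doc ) {
        my $name  = defined $prefix ? "$prefix.$key" : $key;
        my $value = $doc->{$key};
        if ( ref $value eq 'HASH' ) {
            %flat = ( %flat, flatten( $value, $name ) );
        }
        else {
            $flat{$name} = $value;
        }
    }
    return %flat;
}

my %flat = flatten( {
    tweet => 'Perl is GREAT!',
    user  => { name => 'Clinton Gormley', email => 'drtech@cpan.org' },
    tags  => [ 'perl', 'opinion' ],
} );

print "$_\n" for sort keys %flat;    # tags, tweet, user.email, user.name
```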
In MySQL:

    database  ⇒  many tables
    table     ⇒  many rows, one schema
    row       ⇒  many columns

In ElasticSearch:

    index     ⇒  many types
    type      ⇒  many documents, one mapping
    document  ⇒  many fields
Create index with mappings

    $es->create_index(
        index    => 'twitter',
        mappings => {
            tweet => {
                properties => {
                    title   => { type => 'string' },
                    created => { type => 'date' },
                },
            },
        },
    );
Add a mapping

    $es->put_mapping(
        index   => 'twitter',
        type    => 'user',
        mapping => {
            properties => {
                name    => { type => 'string' },
                created => { type => 'date' },
            },
        },
    );
Can add to existing mapping
Cannot change mapping for field
Core field types

    { type => 'string' }
    # other types: byte | short | integer | long | double | float,
    # date, ip addr, geolocation, boolean, binary (as base 64)

The index option:

    index => 'analyzed'        # 'Foo Bar'  ⇒  [ 'foo', 'bar' ]
    index => 'not_analyzed'    # 'Foo Bar'  ⇒  [ 'Foo Bar' ]
    index => 'no'              # 'Foo Bar'  ⇒  [ ]

Analyzer, boost and _all:

    {
        type           => 'string',
        index          => 'analyzed',
        analyzer       => 'default',    # or set index_analyzer and search_analyzer separately
        boost          => 2,
        include_in_all => 1,            # 1 | 0
    }
Built in analyzers
Simple
Whitespace
Stop
Keyword
Language
Snowball
Custom
Input: The Brown-Cow's Part_No. #A.BC123-456 joe@bloggs.com

    keyword:            The Brown-Cow's Part_No. #A.BC123-456 joe@bloggs.com
    whitespace:         The, Brown-Cow's, Part_No., #A.BC123-456, joe@bloggs.com
    simple:             the, brown, cow, s, part, no, a, bc, joe, bloggs, com
    standard:           brown, cow's, part_no, a.bc123, 456, joe, bloggs.com
    snowball (English): brown, cow, part_no, a.bc123, 456, joe, bloggs.com
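The three simplest analyzers above can be approximated with one-liners. This is a rough sketch that reproduces the keyword, whitespace and simple rows for this input; it assumes "simple" means lowercasing and splitting on anything that is not a letter, and it does not attempt the standard or snowball analyzers.

```perl
use strict;
use warnings;

# Toy stand-ins for three built in analyzers (assumptions, not the real code).
my %analyzer = (
    keyword    => sub { ( $_[0] ) },                       # whole value, untouched
    whitespace => sub { split ' ', $_[0] },                # split on whitespace only
    simple     => sub { grep { length $_ } split /[^a-z]+/, lc $_[0] },
);

my $text = q{The Brown-Cow's Part_No. #A.BC123-456 joe@bloggs.com};

for my $name (qw( keyword whitespace simple )) {
    print "$name: ", join( ', ', $analyzer{$name}->($text) ), "\n";
}
```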
Token filters
ASCII Folding
Length
Lowercase
NGram
Edge NGram
Porter Stem
Shingle
Stop
Word Delimiter
KStem
Snowball
Phonetic
Synonym
Compound Word
Reverse
Elision
Truncate
Unique
Custom Analyzer

    $es->create_index(
        index    => 'twitter',
        settings => {
            analysis => {
                analyzer => {
                    ascii_html => {
                        type        => 'custom',
                        tokenizer   => 'standard',
                        filter      => [ qw( standard lowercase asciifolding stop ) ],
                        char_filter => ['html_strip'],
                    },
                },
            },
        },
    );
Searching

    # one index, one type
    $result = $es->search(
        index => 'twitter',
        type  => 'tweet',
    );

    # several indices and types
    $result = $es->search(
        index => [ 'twitter', 'facebook' ],
        type  => [ 'tweet', 'post' ],
    );

    # all indices, all types
    $result = $es->search();
Searching

    # with a Query DSL query
    $result = $es->search(
        index => 'twitter',
        type  => 'tweet',
        query => { text => { _all => 'foo' } },
    );

    # with queryb (b == ElasticSearch::SearchBuilder)
    $result = $es->search(
        index  => 'twitter',
        type   => 'tweet',
        queryb => 'foo',
    );

    # with sorting and paging
    $result = $es->search(
        index => 'twitter',
        type  => 'tweet',
        query => { text => { _all => 'foo' } },
        sort  => [ { '_score' => 'desc' } ],
        from  => 0,
        size  => 10,
    );
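What sort, from and size do together can be sketched without a server. The hits and the `page` helper below are hypothetical; the sketch just orders hits by _score descending and returns the window starting at from with at most size entries.

```perl
use strict;
use warnings;

# Hypothetical hits, as if returned unsorted by a query.
my @hits = (
    { id => 'a', _score => 0.2 },
    { id => 'b', _score => 0.9 },
    { id => 'c', _score => 0.5 },
);

# Sort by _score descending, then return the window [from, from + size).
sub page {
    my ( $hits, $from, $size ) = @_;
    my @sorted = sort { $b->{_score} <=> $a->{_score} } @$hits;
    return () if $from > $#sorted;
    my $last = $from + $size - 1;
    $last = $#sorted if $last > $#sorted;
    return @sorted[ $from .. $last ];
}

print join( ', ', map { $_->{id} } page( \@hits, 0, 2 ) ), "\n";    # b, c
print join( ', ', map { $_->{id} } page( \@hits, 2, 2 ) ), "\n";    # a
```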
Query DSL

Queries vs Filters

    Queries               Filters
    relevance scoring     no scoring
    slower                faster
    no caching            cacheable

Use filters for anything that doesn't affect the relevance score!
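Why "no scoring" makes filters cacheable can be shown with a toy: a filter's result is just a set of matching documents, so it can be stored and reused. The documents, the `term_filter` helper and the cache key below are all assumptions made for illustration; ElasticSearch's own filter cache works differently in detail.

```perl
use strict;
use warnings;

my @docs = (
    { id => 1, tag => 'perl' },
    { id => 2, tag => 'ruby' },
    { id => 3, tag => 'perl' },
);

my %cache;        # filter description => array ref of matching ids
my $scans = 0;    # how many times the documents were actually scanned

# Return the ids whose field equals the value, memoised per (field, value).
sub term_filter {
    my ( $field, $value ) = @_;
    my $key = "$field\0$value";
    $cache{$key} //= do {
        $scans++;
        [ map { $_->{id} } grep { $_->{$field} eq $value } @docs ];
    };
    return @{ $cache{$key} };
}

my @first  = term_filter( tag => 'perl' );    # scans the documents
my @second = term_filter( tag => 'perl' );    # served from the cache
print "ids: @second, scans: $scans\n";        # ids: 1 3, scans: 1
```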
Query only

    Query DSL:
        $es->search( query => { text => { title => 'perl' } } );
    SearchBuilder:
        $es->search( queryb => { title => 'perl' } );

Filter only

    Query DSL:
        $es->search( query => {
            constant_score => {
                filter => { term => { tag => 'perl' } },
            },
        });
    SearchBuilder:
        $es->search( queryb => {
            -filter => { tag => 'perl' },
        });

Query and filter

    Query DSL:
        $es->search( query => {
            filtered => {
                query  => { text => { title => 'perl' } },
                filter => { term => { tag => 'perl' } },
            },
        });
    SearchBuilder:
        $es->search( queryb => {
            title   => 'perl',
            -filter => { tag => 'perl' },
        });

Terms of endearment - the ElasticSearch Query DSL explained

  • 98. Filters: equality
      Query DSL:
        { term  => { tags => 'perl' }}
        { terms => { tags => ['perl','ruby'] }}
      SearchBuilder:
        { tags => 'perl' }
        { tags => ['perl','ruby'] }
  • 99. Filters: range
      Query DSL:
        { range => { date => { gte => '2010-11-01', lt => '2010-12-01' }}}
      SearchBuilder:
        { date => { gte => '2010-11-01', lt => '2010-12-01' }}
  • 100. Filters: range (many values)
      Query DSL:
        { numeric_range => { date => { gte => '2010-11-01', lt => '2010-12-01' }}}
      SearchBuilder:
        { date => { '>=' => '2010-11-01', '<' => '2010-12-01' }}
  • 101. Filters: and | or | not
      Query DSL:
        { and => [ {term=>{X=>1}}, {term=>{Y=>2}} ]}
        { or  => [ {term=>{X=>1}}, {term=>{Y=>2}} ]}
        { not => { or => [ {term=>{X=>1}}, {term=>{Y=>2}} ] }}
      SearchBuilder:
        { X => 1, Y => 2 }                # and
        [ X => 1, Y => 2 ]                # or
        { -not => { X => 1, Y => 2 } }    # not (and)
        { -not => [ X => 1, Y => 2 ] }    # not (or)
  • 102. Filters: exists | missing
      Query DSL:
        { exists  => { field => 'title' }}
        { missing => { field => 'title' }}
      SearchBuilder:
        { -exists  => 'title' }
        { -missing => 'title' }
  • 103. Filter example
      SearchBuilder:
        { -filter => [
            featured => 1,
            {
                created_at => { gt   => '2011-08-01' },
                status     => { '!=' => 'pending' },
            },
        ]}
  • 104. Filter example
      Query DSL:
        { constant_score => {
            filter => {
                or => [
                    { term => { featured => 1 }},
                    { and => [
                        { not   => { term => { status => 'pending' }}},
                        { range => { created_at => { gt => '2011-08-01' }}},
                    ]},
                ],
            },
        }}
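To make the and | or | not semantics concrete, here is a minimal sketch that evaluates a small subset of the filter DSL against in-memory documents. The `matches` helper and the sample documents are hypothetical; ranges use string comparison, which is only adequate for ISO dates like the ones on the slides.

```perl
use strict;
use warnings;

# Evaluate a tiny subset of the filter DSL (term, range, and, or, not).
sub matches {
    my ( $filter, $doc ) = @_;
    my ( $type, $body ) = %$filter;    # each filter is a single-key hash
    if ( $type eq 'term' ) {
        my ( $field, $value ) = %$body;
        return defined $doc->{$field} && $doc->{$field} eq $value;
    }
    if ( $type eq 'range' ) {
        my ( $field, $r ) = %$body;
        my $v = $doc->{$field};
        return 0 unless defined $v;
        return 0 if defined $r->{gt}  && !( $v gt $r->{gt} );
        return 0 if defined $r->{gte} && !( $v ge $r->{gte} );
        return 0 if defined $r->{lt}  && !( $v lt $r->{lt} );
        return 0 if defined $r->{lte} && !( $v le $r->{lte} );
        return 1;
    }
    if ( $type eq 'not' ) {
        return !matches( $body, $doc );
    }
    if ( $type eq 'and' ) {
        for my $f (@$body) { return 0 unless matches( $f, $doc ) }
        return 1;
    }
    if ( $type eq 'or' ) {
        for my $f (@$body) { return 1 if matches( $f, $doc ) }
        return 0;
    }
    die "unsupported filter: $type";
}

# The filter from the example above: featured, or (not pending and recent).
my $filter = {
    'or' => [
        { term => { featured => 1 } },
        { 'and' => [
            { 'not' => { term => { status => 'pending' } } },
            { range => { created_at => { gt => '2011-08-01' } } },
        ] },
    ],
};

my @docs = (
    { id => 1, featured => 1, status => 'pending', created_at => '2011-01-01' },
    { id => 2, featured => 0, status => 'active',  created_at => '2011-08-15' },
    { id => 3, featured => 0, status => 'pending', created_at => '2011-08-15' },
);

my @ids = map { $_->{id} } grep { matches( $filter, $_ ) } @docs;
print "matching ids: @ids\n";    # matching ids: 1 2
```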
  • 105 to 148. (Lists whose text was lost in extraction; the surviving fragments are: nested, query, prefix, type, range, prefix, fuzzy, ids.)
  • 153. Text/Analyzed Queries analyzed ⇒ text query using search_analyzer
  • 154. Text-Query Family
      Query DSL:
        { text => { title => 'great perl' }}
      SearchBuilder:
        { title => 'great perl' }
  • 155. Text-Query Family: long form
      Query DSL:
        { text => { title => { query => 'great perl' }}}
      SearchBuilder:
        { title => { '=' => { query => 'great perl' }}}
  • 156. Text-Query Family: operator
      Query DSL:
        { text => { title => { query => 'great perl', operator => 'and' }}}
      SearchBuilder:
        { title => { '=' => { query => 'great perl', operator => 'and' }}}
  • 157. Text-Query Family: fuzziness
      Query DSL:
        { text => { title => { query => 'great perl', fuzziness => 0.5 }}}
      SearchBuilder:
        { title => { '=' => { query => 'great perl', fuzziness => 0.5 }}}
  • 158 to 160. Text-Query Family: phrase
      Query DSL:
        { text => { title => { query => 'great perl',    type => 'phrase' }}}
        { text => { title => { query => 'perl is great', type => 'phrase' }}}
      SearchBuilder:
        { title => { '==' => { query => 'great perl' }}}
        { title => { '==' => { query => 'perl is great' }}}
  • 161. Text-Query Family: phrase with slop
      Query DSL:
        { text => { title => { query => 'perl great', type => 'phrase', slop => 3 }}}
      SearchBuilder:
        { title => { '==' => { query => 'perl great', slop => 3 }}}
  • 162. Text-Query Family: phrase_prefix
      Query DSL:
        { text => { title => { query => 'perl is gr', type => 'phrase_prefix' }}}
      SearchBuilder:
        { title => { '^' => { query => 'perl is gr' }}}
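The difference between the operator and phrase variants can be sketched with two toy matchers. These helpers are hypothetical and deliberately simplified: the analysis is just lowercasing and splitting on non-word characters (no stop words), 'or' is assumed as the default operator, and slop, fuzziness and scoring are not modelled.

```perl
use strict;
use warnings;

# Toy analysis: lowercase and split on non-word characters.
sub terms { grep { length $_ } split /\W+/, lc $_[0] }

# Boolean text match: 'or' needs any term, 'and' needs every term.
sub text_match {
    my ( $value, $query, %opt ) = @_;
    my %in_value = map { $_ => 1 } terms($value);
    my @q        = terms($query);
    my $hits     = grep { $in_value{$_} } @q;
    return ( $opt{operator} // 'or' ) eq 'and' ? $hits == @q : $hits > 0;
}

# Phrase match: the query terms must appear adjacent and in order.
sub phrase_match {
    my ( $value, $query ) = @_;
    my @v = terms($value);
    my @q = terms($query);
    for my $start ( 0 .. @v - @q ) {
        my $ok = 1;
        for my $i ( 0 .. $#q ) {
            if ( $v[ $start + $i ] ne $q[$i] ) { $ok = 0; last }
        }
        return 1 if $ok;
    }
    return 0;
}

my $title = 'Perl is GREAT!';
print "and:    ", ( text_match( $title, 'great perl', operator => 'and' ) ? 'yes' : 'no' ), "\n";    # yes
print "phrase: ", ( phrase_match( $title, 'great perl' )    ? 'yes' : 'no' ), "\n";                  # no
print "phrase: ", ( phrase_match( $title, 'perl is great' ) ? 'yes' : 'no' ), "\n";                  # yes
```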
  • 163. Query string / Field: Lucene Query Syntax aware
        "perl is great"~5 AND author:clint* -deleted
  • 164 to 165. Query string / Field: syntax errors
        AND perl is great" author : clint* -
      See ElasticSearch::QueryParser
  • 166. Combining: Bool
      Query DSL:
        { bool => {
            must     => [ { term => { foo => 1 }}, ... ],
            must_not => [ { term => { bar => 1 }}, ... ],
            should   => [ { term => { X => 2 }}, { term => { Y => 2 }}, ... ],
            minimum_number_should_match => 1,
        }}
  • 167. Combining: Bool
      SearchBuilder:
        {
            foo => 1,
            bar => { '!=' => 1 },
            -or => [ X => 2, Y => 2 ],
        }

        { -bool => {
            must     => { foo => 1 },
            must_not => { bar => 1 },
            should   => [ { X => 2 }, { Y => 2 } ],
            minimum_number_should_match => 1,
        }}
  • 168. Combining: DisMax
      Query DSL:
        { dis_max => {
            queries => [
                { term => { foo => 1 }},
                { term => { bar => 1 }},
            ],
        }}
      SearchBuilder:
        { -dis_max => [
            { term => { foo => 1 }},
            { term => { bar => 1 }},
        ]}
  • 169. Bool: combines scores
         DisMax: uses highest score from all matching clauses
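In arithmetic terms the contrast looks like this. The clause scores are made-up numbers, and the sketch reduces each query to its core idea (sum versus max); real scoring involves further factors that are left out here.

```perl
use strict;
use warnings;
use List::Util qw( sum max );

# Hypothetical scores of the clauses that matched one document.
my @clause_scores = ( 0.5, 0.25 );

my $bool_score    = sum(@clause_scores);    # bool: scores are combined
my $dis_max_score = max(@clause_scores);    # dis_max: best clause wins

print "bool: $bool_score, dis_max: $dis_max_score\n";    # bool: 0.75, dis_max: 0.5
```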
  • 172 to 173. Boosting: at index time (per field)
        { properties => {
            content => { type => 'string' },
            title   => { type => 'string', boost => 2 },
        }}
  • 174. Boosting: at index time (per document, from a field)
        {
            properties => {
                content => { type => 'string' },
                title   => { type => 'string', boost => 2 },
                rank    => { type => 'integer' },
            },
            _boost => { name => 'rank', null_value => 1.0 },
        }
  • 175. Boosting: at search time
      Query DSL:
        { bool => { should => [
            { text => { content => 'perl' }},
            { text => { title   => 'perl' }},
        ]}}
      SearchBuilder:
        { content => 'perl', title => 'perl' }
  • 176 to 177. Boosting: at search time (long form, then with boost)
      Query DSL:
        { bool => { should => [
            { text => { content => 'perl' }},
            { text => { title   => { query => 'perl', boost => 2 }}},
        ]}}
      SearchBuilder:
        {
            content => 'perl',
            title   => { '=' => { query => 'perl', boost => 2 }},
        }
  • 178. Boosting: custom_score
      Query DSL:
        { custom_score => {
            query  => { text => { title => 'perl' }},
            script => "_score * foo /doc['rank'].value",
        }}
      SearchBuilder:
        { -custom_score => {
            query  => { title => 'perl' },
            script => "_score * foo /doc['rank'].value",
        }}
  • 179. Query example
      SearchBuilder:
        {
            -or => [
                title   => { '=' => { query => 'custom score', boost => 2 }},
                content => 'custom score',
            ],
            -filter => {
                repo       => 'elasticsearch/elasticsearch',
                created_at => { '>=' => '2011-07-01', '<' => '2011-08-01' },
                -or        => [
                    creator_id  => 123,
                    assignee_id => 123,
                ],
                labels => ['bug','breaking'],
            },
        }
  • 180. Query example
      Query DSL:
        { query => { filtered => {
            query => {
                bool => {
                    should => [
                        { text => { content => "custom score" }},
                        { text => { title => { boost => 2, query => "custom score" }}},
                    ],
                },
            },
            filter => {
                and => [
                    { or => [
                        { term => { creator_id  => 123 }},
                        { term => { assignee_id => 123 }},
                    ]},
                    { terms => { labels => ["bug", "breaking"] }},
                    { term  => { repo => "elasticsearch/elasticsearch" }},
                    { numeric_range => { created_at => {
                        gte => "2011-07-01",
                        lt  => "2011-08-01",
                    }}},
                ],
            },
        }}}