SlideShare a Scribd company logo
1 of 19
Download to read offline
elasticsearch

ADVANCED FEATURES IN PRACTICE

          @JSUCHAL
         #RUBYSLAVA
elasticwhat?

 based on Apache Lucene
 REST API
 Data & API in JSON
 Schema-free
 Real time
 Distributed
 Advanced functionality
Quickstart

1. Download & extract from
   http://www.elasticsearch.org/download/
2. $ bin/elasticsearch –f
3. There is no step 3.
Quickstart - index

$    curl -XPOST 'http://localhost:9200/rubyslava/talks/1' -d '{
      "title" : "elasticsearch - advanced features in practice",
      "presenter" : "jsuchal",
      "presented_at" : "2011-09-22T19:00:00",
      "message" : "hopefully clear",
      "tags" : ["elasticsearch", "rocks"]
}'

=> {"ok":true,"_index":"rubyslava","_type":"talks","_id":"1","_version":1}
Quickstart - search
$ curl -XPOST 'http://localhost:9200/rubyslava/talks/_search?q=jsuchal&pretty‘

=>
{
     "took" : 2,
     "timed_out" : false,
     "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
     },
     "hits" : {
        "total" : 1,
        "max_score" : 0.054244425,
        "hits" : [ {
          "_index" : "rubyslava",
          "_type" : "talks",
          "_id" : "1",
          "_score" : 0.054244425, "_source" : {
            "title" : "elasticsearch - advanced features in practice",
            "presenter" : "jsuchal",
            "presented_at" : "2011-09-22T19:00:00",
            "message" : "hopefully clear",
            "tags" : ["elasticsearch", "rocks"]
          }
        } ]
     }
}
Advanced features

 Search
   analyzer, stemming, ngrams, ascii folding & custom analyzers

   boosting, fragment highlighting, fuzzy search

   …

 Facets
 Percolate
 Scroll
 and more…
The Case Study

 Find suspicious government contracts
   using heuristics
       IT contract where price > 1M euro
       Supplier company age < 3 months

     using crowdsourcing
 Data
   Central government contract repositories

   www.crz.gov.sk, zmluvy.egov.sk

   ~70K contracts in 8 months

   100+ GB pdf/doc/scan
The Solution




Faceted search
The Solution

 Faceted search
   Search
        e.g. Find all contracts by Orange Slovakia
    Analyze
      e.g. Which department has most contracts
       with Orange Slovakia?
      e.g. What is the contract price distribution
       for Orange Slovakia?
     …

    Define penalty heuristics
Facets

 Types
   term, range, histogram, statistical, geo distance



$ curl -XPOST 'http://localhost:9200/rubyslava/talks/_search?pretty' -d '{
    "query" : {
        "match_all" : { }
    },
    "facets" : {
        "tags_facet" : {
            "terms" : {
                 "field" : "tags",
                 "size" : 10
            }
        }
    }
}'
Facets - results

{
    "took" : 2,
    …
    "hits" : { … },
    "facets" : {
      "tags_facet" : {
        "_type" : "terms",
        "missing" : 0,
        "total" : 2,
        "other" : 0,
        "terms" : [ {
          "term" : "rocks",
          "count" : 1
        }, {
          "term" : "elasticsearch",
          "count" : 1
        } ]
      }
    }
}
Facets - advanced

 Problem
   Generate options for facets with some
    selected restrictions
 Solution
   global facet

   facet_filter
  {
      "facets" : {
          "<FACET NAME>" : {
              "<FACET TYPE>" : {
                  ...
              },
              "global" : true,
              "facet_filter" : {
                  "term" : { “supplier.untouched" : “Orange Slovakia, a.s."}
              }
          }
      }
  }
Percolate

 Problem
   New contract/document added, which heuristics does it
    match?
 Solution
  1. Save heuristics/searches in percolator index
  2. Percolate new documents
Percolate

$ curl -XPUT 'localhost:9200/_percolator/rubyslava/heuristic-1' -d '{
    "query" : {
        "term" : {
            "tags" : "rubyslava"
        }
    }
}'

$ curl -XPOST 'http://localhost:9200/rubyslava/talks/_percolate' -d '{
   "doc" : {
     "tags" : ["rubyslava", "rocks", "too"]
   }
}‘

=> {"ok":true,"matches":["heuristic-1"]}
Scroll

 Problem
   New heuristic added and matches many (1K+) documents

   Add heuristic to all matching documents

   + Offset performance problem known in RDBMS

 Solution
   Use async background job

   Scroll through results (a.k.a. cursor)
Scroll

$ curl -XGET 'http://localhost:9200/rubyslava/talks/_search?scroll=5m&pretty' -d '{
    "query": {
        "match_all" : {}
    }
}
‘

=>
{
  "_scroll_id" : "cXVlcnlUaGVuRmV0Y2g7NTs5MzpQYmlzX3I2VFJRS0dRSEhGX2t6TTRROzk0OlBiaXNfcjZUUlFLR1
FISEZfa3pNNFE7OTU6UGJpc19yNlRSUUtHUUhIRl9rek00UTs5MjpQYmlzX3I2VFJRS0dRSEhGX2t6TTRROzkxOlBiaXNfcj
ZUUlFLR1FISEZfa3pNNFE7MDs=",
  "took" : 2,
  …
  "hits" : […],
}

$ curl -XGET 'http://localhost:9200/_search/scroll?scroll=5m&scroll_id=cXVlcnlUaGVuRmV0Y2g…'

=> more results & repeat
Ruby Scroll API

 Mimics find_each in ActiveRecord

 def find_each(query, &block)
   scroll_id = nil
   processed = 0
   begin
     unless scroll_id
       result = initiate_scroll(query)
       scroll_id = result.scroll_id
     else
       result = scroll(scroll_id)
     end
     result.hits.each do |document|
       yield document
     end
     processed += result.hits.size
   end while processed < result.hits.total
 end
Tutorials & Guides

 http://www.slideshare.net/clintongormley/cool-
  bonsai-cool-an-introduction-to-elasticsearch
 http://www.slideshare.net/clintongormley/terms-
  of-endearment-the-elasticsearch-query-dsl-
  explained
 http://www.elasticsearch.org/guide/

More Related Content

What's hot

Terms of endearment - the ElasticSearch Query DSL explained
Terms of endearment - the ElasticSearch Query DSL explainedTerms of endearment - the ElasticSearch Query DSL explained
Terms of endearment - the ElasticSearch Query DSL explainedclintongormley
 
Getting started with Elasticsearch and .NET
Getting started with Elasticsearch and .NETGetting started with Elasticsearch and .NET
Getting started with Elasticsearch and .NETTomas Jansson
 
ElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseRobert Lujo
 
Elasticsearch & "PeopleSearch"
Elasticsearch & "PeopleSearch"Elasticsearch & "PeopleSearch"
Elasticsearch & "PeopleSearch"George Stathis
 
Data Exploration with Elasticsearch
Data Exploration with ElasticsearchData Exploration with Elasticsearch
Data Exploration with ElasticsearchAleksander Stensby
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchSperasoft
 
Elasticsearch in 15 Minutes
Elasticsearch in 15 MinutesElasticsearch in 15 Minutes
Elasticsearch in 15 MinutesKarel Minarik
 
Elasticsearch And Ruby [RuPy2012]
Elasticsearch And Ruby [RuPy2012]Elasticsearch And Ruby [RuPy2012]
Elasticsearch And Ruby [RuPy2012]Karel Minarik
 
Elastify you application: from SQL to NoSQL in less than one hour!
Elastify you application: from SQL to NoSQL in less than one hour!Elastify you application: from SQL to NoSQL in less than one hour!
Elastify you application: from SQL to NoSQL in less than one hour!David Pilato
 
ElasticSearch - DevNexus Atlanta - 2014
ElasticSearch - DevNexus Atlanta - 2014ElasticSearch - DevNexus Atlanta - 2014
ElasticSearch - DevNexus Atlanta - 2014Roy Russo
 
Elasticsearch (Rubyshift 2013)
Elasticsearch (Rubyshift 2013)Elasticsearch (Rubyshift 2013)
Elasticsearch (Rubyshift 2013)Karel Minarik
 
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UNSolr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UNLucidworks
 
ElasticSearch AJUG 2013
ElasticSearch AJUG 2013ElasticSearch AJUG 2013
ElasticSearch AJUG 2013Roy Russo
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchJason Austin
 
Elastic search apache_solr
Elastic search apache_solrElastic search apache_solr
Elastic search apache_solrmacrochen
 
Distributed percolator in elasticsearch
Distributed percolator in elasticsearchDistributed percolator in elasticsearch
Distributed percolator in elasticsearchmartijnvg
 
Hands On Spring Data
Hands On Spring DataHands On Spring Data
Hands On Spring DataEric Bottard
 

What's hot (20)

Terms of endearment - the ElasticSearch Query DSL explained
Terms of endearment - the ElasticSearch Query DSL explainedTerms of endearment - the ElasticSearch Query DSL explained
Terms of endearment - the ElasticSearch Query DSL explained
 
Getting started with Elasticsearch and .NET
Getting started with Elasticsearch and .NETGetting started with Elasticsearch and .NET
Getting started with Elasticsearch and .NET
 
ElasticSearch - index server used as a document database
ElasticSearch - index server used as a document databaseElasticSearch - index server used as a document database
ElasticSearch - index server used as a document database
 
Elasticsearch & "PeopleSearch"
Elasticsearch & "PeopleSearch"Elasticsearch & "PeopleSearch"
Elasticsearch & "PeopleSearch"
 
Data Exploration with Elasticsearch
Data Exploration with ElasticsearchData Exploration with Elasticsearch
Data Exploration with Elasticsearch
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Introduction to solr
Introduction to solrIntroduction to solr
Introduction to solr
 
Elasticsearch in 15 Minutes
Elasticsearch in 15 MinutesElasticsearch in 15 Minutes
Elasticsearch in 15 Minutes
 
Elasticsearch And Ruby [RuPy2012]
Elasticsearch And Ruby [RuPy2012]Elasticsearch And Ruby [RuPy2012]
Elasticsearch And Ruby [RuPy2012]
 
Elastify you application: from SQL to NoSQL in less than one hour!
Elastify you application: from SQL to NoSQL in less than one hour!Elastify you application: from SQL to NoSQL in less than one hour!
Elastify you application: from SQL to NoSQL in less than one hour!
 
ElasticSearch - DevNexus Atlanta - 2014
ElasticSearch - DevNexus Atlanta - 2014ElasticSearch - DevNexus Atlanta - 2014
ElasticSearch - DevNexus Atlanta - 2014
 
Elasticsearch (Rubyshift 2013)
Elasticsearch (Rubyshift 2013)Elasticsearch (Rubyshift 2013)
Elasticsearch (Rubyshift 2013)
 
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UNSolr vs. Elasticsearch,  Case by Case: Presented by Alexandre Rafalovitch, UN
Solr vs. Elasticsearch, Case by Case: Presented by Alexandre Rafalovitch, UN
 
ElasticSearch AJUG 2013
ElasticSearch AJUG 2013ElasticSearch AJUG 2013
ElasticSearch AJUG 2013
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Elastic search apache_solr
Elastic search apache_solrElastic search apache_solr
Elastic search apache_solr
 
Distributed percolator in elasticsearch
Distributed percolator in elasticsearchDistributed percolator in elasticsearch
Distributed percolator in elasticsearch
 
Hands On Spring Data
Hands On Spring DataHands On Spring Data
Hands On Spring Data
 
Elasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetupElasticsearch Introduction at BigData meetup
Elasticsearch Introduction at BigData meetup
 

Viewers also liked

Scaling Elasticsearch at Synthesio
Scaling Elasticsearch at SynthesioScaling Elasticsearch at Synthesio
Scaling Elasticsearch at SynthesioFred de Villamil
 
Use Cases for Elastic Search Percolator
Use Cases for Elastic Search PercolatorUse Cases for Elastic Search Percolator
Use Cases for Elastic Search PercolatorMaxim Shelest
 
An Introduction to Elastic Search.
An Introduction to Elastic Search.An Introduction to Elastic Search.
An Introduction to Elastic Search.Jurriaan Persyn
 
ElasticSearch Basic Introduction
ElasticSearch Basic IntroductionElasticSearch Basic Introduction
ElasticSearch Basic IntroductionMayur Rathod
 
Impetus White Paper- Handling Data Corruption in Elasticsearch
Impetus White Paper- Handling  Data Corruption  in ElasticsearchImpetus White Paper- Handling  Data Corruption  in Elasticsearch
Impetus White Paper- Handling Data Corruption in ElasticsearchImpetus Technologies
 
Configuring elasticsearch for performance and scale
Configuring elasticsearch for performance and scaleConfiguring elasticsearch for performance and scale
Configuring elasticsearch for performance and scaleBharvi Dixit
 
Practical Elasticsearch - real world use cases
Practical Elasticsearch - real world use casesPractical Elasticsearch - real world use cases
Practical Elasticsearch - real world use casesItamar
 
From Lucene to Elasticsearch, a short explanation of horizontal scalability
From Lucene to Elasticsearch, a short explanation of horizontal scalabilityFrom Lucene to Elasticsearch, a short explanation of horizontal scalability
From Lucene to Elasticsearch, a short explanation of horizontal scalabilityStéphane Gamard
 
Elasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational databaseElasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational databaseKristijan Duvnjak
 
Elastic search overview
Elastic search overviewElastic search overview
Elastic search overviewABC Talks
 
ELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log systemELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log systemAvleen Vig
 
Elasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsElasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsAlaa Elhadba
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneRahul Jain
 
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on DockerRunning High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on DockerSematext Group, Inc.
 

Viewers also liked (17)

Scaling Elasticsearch at Synthesio
Scaling Elasticsearch at SynthesioScaling Elasticsearch at Synthesio
Scaling Elasticsearch at Synthesio
 
Use Cases for Elastic Search Percolator
Use Cases for Elastic Search PercolatorUse Cases for Elastic Search Percolator
Use Cases for Elastic Search Percolator
 
An Introduction to Elastic Search.
An Introduction to Elastic Search.An Introduction to Elastic Search.
An Introduction to Elastic Search.
 
ElasticSearch Basic Introduction
ElasticSearch Basic IntroductionElasticSearch Basic Introduction
ElasticSearch Basic Introduction
 
Impetus White Paper- Handling Data Corruption in Elasticsearch
Impetus White Paper- Handling  Data Corruption  in ElasticsearchImpetus White Paper- Handling  Data Corruption  in Elasticsearch
Impetus White Paper- Handling Data Corruption in Elasticsearch
 
The Commando Devops
The Commando DevopsThe Commando Devops
The Commando Devops
 
Elasticsearch Introduction
Elasticsearch IntroductionElasticsearch Introduction
Elasticsearch Introduction
 
Configuring elasticsearch for performance and scale
Configuring elasticsearch for performance and scaleConfiguring elasticsearch for performance and scale
Configuring elasticsearch for performance and scale
 
Practical Elasticsearch - real world use cases
Practical Elasticsearch - real world use casesPractical Elasticsearch - real world use cases
Practical Elasticsearch - real world use cases
 
From Lucene to Elasticsearch, a short explanation of horizontal scalability
From Lucene to Elasticsearch, a short explanation of horizontal scalabilityFrom Lucene to Elasticsearch, a short explanation of horizontal scalability
From Lucene to Elasticsearch, a short explanation of horizontal scalability
 
Elasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational databaseElasticsearch as a search alternative to a relational database
Elasticsearch as a search alternative to a relational database
 
Elastic search overview
Elastic search overviewElastic search overview
Elastic search overview
 
ELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log systemELK: Moose-ively scaling your log system
ELK: Moose-ively scaling your log system
 
Elasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsElasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & Aggregations
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of Lucene
 
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on DockerRunning High Performance and Fault Tolerant Elasticsearch Clusters on Docker
Running High Performance and Fault Tolerant Elasticsearch Clusters on Docker
 

Similar to elasticsearch - advanced features in practice

Real-time search in Drupal with Elasticsearch @Moldcamp
Real-time search in Drupal with Elasticsearch @MoldcampReal-time search in Drupal with Elasticsearch @Moldcamp
Real-time search in Drupal with Elasticsearch @MoldcampAlexei Gorobets
 
Montreal Elasticsearch Meetup
Montreal Elasticsearch MeetupMontreal Elasticsearch Meetup
Montreal Elasticsearch MeetupLoïc Bertron
 
Peggy elasticsearch應用
Peggy elasticsearch應用Peggy elasticsearch應用
Peggy elasticsearch應用LearningTech
 
Philipp Krenn | Make Your Data FABulous | Codemotion Madrid 2018
Philipp Krenn | Make Your Data FABulous | Codemotion Madrid 2018Philipp Krenn | Make Your Data FABulous | Codemotion Madrid 2018
Philipp Krenn | Make Your Data FABulous | Codemotion Madrid 2018Codemotion
 
Philipp Krenn "Make Your Data FABulous"
Philipp Krenn "Make Your Data FABulous"Philipp Krenn "Make Your Data FABulous"
Philipp Krenn "Make Your Data FABulous"Fwdays
 
Full-Text Search Explained - Philipp Krenn - Codemotion Rome 2017
Full-Text Search Explained - Philipp Krenn - Codemotion Rome 2017Full-Text Search Explained - Philipp Krenn - Codemotion Rome 2017
Full-Text Search Explained - Philipp Krenn - Codemotion Rome 2017Codemotion
 
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...Codemotion
 
d3sparql.js demo at SWAT4LS 2014 in Berlin
d3sparql.js demo at SWAT4LS 2014 in Berlind3sparql.js demo at SWAT4LS 2014 in Berlin
d3sparql.js demo at SWAT4LS 2014 in BerlinToshiaki Katayama
 
Real-time search in Drupal. Meet Elasticsearch
Real-time search in Drupal. Meet ElasticsearchReal-time search in Drupal. Meet Elasticsearch
Real-time search in Drupal. Meet ElasticsearchAlexei Gorobets
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchRuslan Zavacky
 
ELK - What's new and showcases
ELK - What's new and showcasesELK - What's new and showcases
ELK - What's new and showcasesAndrii Gakhov
 
DRUPAL AND ELASTICSEARCH
DRUPAL AND ELASTICSEARCHDRUPAL AND ELASTICSEARCH
DRUPAL AND ELASTICSEARCHDrupalCamp Kyiv
 
Anwendungsfaelle für Elasticsearch
Anwendungsfaelle für ElasticsearchAnwendungsfaelle für Elasticsearch
Anwendungsfaelle für ElasticsearchFlorian Hopf
 
Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...
Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...
Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...confluent
 
Elasticsearch intro output
Elasticsearch intro outputElasticsearch intro output
Elasticsearch intro outputTom Chen
 
Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Philips Kokoh Prasetyo
 
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...MongoDB
 
Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !Microsoft
 
OData: Universal Data Solvent or Clunky Enterprise Goo? (GlueCon 2015)
OData: Universal Data Solvent or Clunky Enterprise Goo? (GlueCon 2015)OData: Universal Data Solvent or Clunky Enterprise Goo? (GlueCon 2015)
OData: Universal Data Solvent or Clunky Enterprise Goo? (GlueCon 2015)Pat Patterson
 

Similar to elasticsearch - advanced features in practice (20)

Real-time search in Drupal with Elasticsearch @Moldcamp
Real-time search in Drupal with Elasticsearch @MoldcampReal-time search in Drupal with Elasticsearch @Moldcamp
Real-time search in Drupal with Elasticsearch @Moldcamp
 
Montreal Elasticsearch Meetup
Montreal Elasticsearch MeetupMontreal Elasticsearch Meetup
Montreal Elasticsearch Meetup
 
Peggy elasticsearch應用
Peggy elasticsearch應用Peggy elasticsearch應用
Peggy elasticsearch應用
 
Philipp Krenn | Make Your Data FABulous | Codemotion Madrid 2018
Philipp Krenn | Make Your Data FABulous | Codemotion Madrid 2018Philipp Krenn | Make Your Data FABulous | Codemotion Madrid 2018
Philipp Krenn | Make Your Data FABulous | Codemotion Madrid 2018
 
Philipp Krenn "Make Your Data FABulous"
Philipp Krenn "Make Your Data FABulous"Philipp Krenn "Make Your Data FABulous"
Philipp Krenn "Make Your Data FABulous"
 
Full-Text Search Explained - Philipp Krenn - Codemotion Rome 2017
Full-Text Search Explained - Philipp Krenn - Codemotion Rome 2017Full-Text Search Explained - Philipp Krenn - Codemotion Rome 2017
Full-Text Search Explained - Philipp Krenn - Codemotion Rome 2017
 
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
 
d3sparql.js demo at SWAT4LS 2014 in Berlin
d3sparql.js demo at SWAT4LS 2014 in Berlind3sparql.js demo at SWAT4LS 2014 in Berlin
d3sparql.js demo at SWAT4LS 2014 in Berlin
 
Real-time search in Drupal. Meet Elasticsearch
Real-time search in Drupal. Meet ElasticsearchReal-time search in Drupal. Meet Elasticsearch
Real-time search in Drupal. Meet Elasticsearch
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to Elasticsearch
 
ELK - What's new and showcases
ELK - What's new and showcasesELK - What's new and showcases
ELK - What's new and showcases
 
DRUPAL AND ELASTICSEARCH
DRUPAL AND ELASTICSEARCHDRUPAL AND ELASTICSEARCH
DRUPAL AND ELASTICSEARCH
 
Anwendungsfaelle für Elasticsearch
Anwendungsfaelle für ElasticsearchAnwendungsfaelle für Elasticsearch
Anwendungsfaelle für Elasticsearch
 
Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...
Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...
Closing the Loop in Extended Reality with Kafka Streams and Machine Learning ...
 
Elastic Search
Elastic SearchElastic Search
Elastic Search
 
Elasticsearch intro output
Elasticsearch intro outputElasticsearch intro output
Elasticsearch intro output
 
Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!
 
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y g...
 
Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !
 
OData: Universal Data Solvent or Clunky Enterprise Goo? (GlueCon 2015)
OData: Universal Data Solvent or Clunky Enterprise Goo? (GlueCon 2015)OData: Universal Data Solvent or Clunky Enterprise Goo? (GlueCon 2015)
OData: Universal Data Solvent or Clunky Enterprise Goo? (GlueCon 2015)
 

More from Jano Suchal

Slovensko.Digital: Čo ďalej?
Slovensko.Digital: Čo ďalej?Slovensko.Digital: Čo ďalej?
Slovensko.Digital: Čo ďalej?Jano Suchal
 
Improving code quality
Improving code qualityImproving code quality
Improving code qualityJano Suchal
 
Beyond search queries
Beyond search queriesBeyond search queries
Beyond search queriesJano Suchal
 
Rank all the things!
Rank all the things!Rank all the things!
Rank all the things!Jano Suchal
 
Rank all the (geo) things!
Rank all the (geo) things!Rank all the (geo) things!
Rank all the (geo) things!Jano Suchal
 
Ako si vybrať programovácí jazyk alebo framework?
Ako si vybrať programovácí jazyk alebo framework?Ako si vybrať programovácí jazyk alebo framework?
Ako si vybrať programovácí jazyk alebo framework?Jano Suchal
 
Bonetics: Mastering Puppet Workshop
Bonetics: Mastering Puppet WorkshopBonetics: Mastering Puppet Workshop
Bonetics: Mastering Puppet WorkshopJano Suchal
 
Peter Mihalik: Puppet
Peter Mihalik: PuppetPeter Mihalik: Puppet
Peter Mihalik: PuppetJano Suchal
 
Tomáš Čorej: Configuration management & CFEngine3
Tomáš Čorej: Configuration management & CFEngine3Tomáš Čorej: Configuration management & CFEngine3
Tomáš Čorej: Configuration management & CFEngine3Jano Suchal
 
Ako si vybrať programovací jazyk a framework?
Ako si vybrať programovací jazyk a framework?Ako si vybrať programovací jazyk a framework?
Ako si vybrať programovací jazyk a framework?Jano Suchal
 
SQL: Query optimization in practice
SQL: Query optimization in practiceSQL: Query optimization in practice
SQL: Query optimization in practiceJano Suchal
 
Garelic: Google Analytics as App Performance monitoring
Garelic: Google Analytics as App Performance monitoringGarelic: Google Analytics as App Performance monitoring
Garelic: Google Analytics as App Performance monitoringJano Suchal
 
Miroslav Šimulčík: Temporálne databázy
Miroslav Šimulčík: Temporálne databázyMiroslav Šimulčík: Temporálne databázy
Miroslav Šimulčík: Temporálne databázyJano Suchal
 
Vojtech Rinik: Internship v USA - moje skúsenosti
Vojtech Rinik: Internship v USA - moje skúsenostiVojtech Rinik: Internship v USA - moje skúsenosti
Vojtech Rinik: Internship v USA - moje skúsenostiJano Suchal
 
Profiling and monitoring ruby & rails applications
Profiling and monitoring ruby & rails applicationsProfiling and monitoring ruby & rails applications
Profiling and monitoring ruby & rails applicationsJano Suchal
 
Aký programovací jazyk a framework si vybrať a prečo?
Aký programovací jazyk a framework si vybrať a prečo?Aký programovací jazyk a framework si vybrať a prečo?
Aký programovací jazyk a framework si vybrať a prečo?Jano Suchal
 
Petr Joachim: Redis na Super.cz
Petr Joachim: Redis na Super.czPetr Joachim: Redis na Super.cz
Petr Joachim: Redis na Super.czJano Suchal
 
Metaprogramovanie #1
Metaprogramovanie #1Metaprogramovanie #1
Metaprogramovanie #1Jano Suchal
 

More from Jano Suchal (20)

Slovensko.Digital: Čo ďalej?
Slovensko.Digital: Čo ďalej?Slovensko.Digital: Čo ďalej?
Slovensko.Digital: Čo ďalej?
 
Datanest 3.0
Datanest 3.0Datanest 3.0
Datanest 3.0
 
Improving code quality
Improving code qualityImproving code quality
Improving code quality
 
Beyond search queries
Beyond search queriesBeyond search queries
Beyond search queries
 
Rank all the things!
Rank all the things!Rank all the things!
Rank all the things!
 
Rank all the (geo) things!
Rank all the (geo) things!Rank all the (geo) things!
Rank all the (geo) things!
 
Ako si vybrať programovácí jazyk alebo framework?
Ako si vybrať programovácí jazyk alebo framework?Ako si vybrať programovácí jazyk alebo framework?
Ako si vybrať programovácí jazyk alebo framework?
 
Bonetics: Mastering Puppet Workshop
Bonetics: Mastering Puppet WorkshopBonetics: Mastering Puppet Workshop
Bonetics: Mastering Puppet Workshop
 
Peter Mihalik: Puppet
Peter Mihalik: PuppetPeter Mihalik: Puppet
Peter Mihalik: Puppet
 
Tomáš Čorej: Configuration management & CFEngine3
Tomáš Čorej: Configuration management & CFEngine3Tomáš Čorej: Configuration management & CFEngine3
Tomáš Čorej: Configuration management & CFEngine3
 
Ako si vybrať programovací jazyk a framework?
Ako si vybrať programovací jazyk a framework?Ako si vybrať programovací jazyk a framework?
Ako si vybrať programovací jazyk a framework?
 
SQL: Query optimization in practice
SQL: Query optimization in practiceSQL: Query optimization in practice
SQL: Query optimization in practice
 
Garelic: Google Analytics as App Performance monitoring
Garelic: Google Analytics as App Performance monitoringGarelic: Google Analytics as App Performance monitoring
Garelic: Google Analytics as App Performance monitoring
 
Miroslav Šimulčík: Temporálne databázy
Miroslav Šimulčík: Temporálne databázyMiroslav Šimulčík: Temporálne databázy
Miroslav Šimulčík: Temporálne databázy
 
Vojtech Rinik: Internship v USA - moje skúsenosti
Vojtech Rinik: Internship v USA - moje skúsenostiVojtech Rinik: Internship v USA - moje skúsenosti
Vojtech Rinik: Internship v USA - moje skúsenosti
 
Profiling and monitoring ruby & rails applications
Profiling and monitoring ruby & rails applicationsProfiling and monitoring ruby & rails applications
Profiling and monitoring ruby & rails applications
 
Aký programovací jazyk a framework si vybrať a prečo?
Aký programovací jazyk a framework si vybrať a prečo?Aký programovací jazyk a framework si vybrať a prečo?
Aký programovací jazyk a framework si vybrať a prečo?
 
Čo po GAMČI?
Čo po GAMČI?Čo po GAMČI?
Čo po GAMČI?
 
Petr Joachim: Redis na Super.cz
Petr Joachim: Redis na Super.czPetr Joachim: Redis na Super.cz
Petr Joachim: Redis na Super.cz
 
Metaprogramovanie #1
Metaprogramovanie #1Metaprogramovanie #1
Metaprogramovanie #1
 

Recently uploaded

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 

Recently uploaded (20)

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 

elasticsearch - advanced features in practice

  • 1. elasticsearch ADVANCED FEATURES IN PRACTICE @JSUCHAL #RUBYSLAVA
  • 2. elasticwhat?  based on Apache Lucene  REST API  Data & API in JSON  Schema-free  Real time  Distributed  Advanced functionality
  • 3. Quickstart 1. Download & extract from http://www.elasticsearch.org/download/ 2. $ bin/elasticsearch –f 3. There is no step 3.
  • 4. Quickstart - index $ curl -XPOST 'http://localhost:9200/rubyslava/talks/1' -d '{ "title" : "elasticsearch - advanced features in practice", "presenter" : "jsuchal", "presented_at" : "2011-09-22T19:00:00", "message" : "hopefully clear", "tags" : ["elasticsearch", "rocks"] }' => {"ok":true,"_index":"rubyslava","_type":"talks","_id":"1","_version":1}
  • 5. Quickstart - search $ curl -XPOST 'http://localhost:9200/rubyslava/talks/_search?q=jsuchal&pretty‘ => { "took" : 2, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 1, "max_score" : 0.054244425, "hits" : [ { "_index" : "rubyslava", "_type" : "talks", "_id" : "1", "_score" : 0.054244425, "_source" : { "title" : "elasticsearch - advanced features in practice", "presenter" : "jsuchal", "presented_at" : "2011-09-22T19:00:00", "message" : "hopefully clear", "tags" : ["elasticsearch", "rocks"] } } ] } }
  • 6. Advanced features  Search  analyzer, stemming, ngrams, ascii folding & custom analyzers  boosting, fragment highlighting, fuzzy search  …  Facets  Percolate  Scroll  and more…
  • 7. The Case Study  Find suspicious government contracts  using heuristics  IT contract where price > 1M euro  Supplier company age < 3 months  using crowdsourcing  Data  Central government contract repositories  www.crz.gov.sk, zmluvy.egov.sk  ~70K contracts in 8 months  100+ GB pdf/doc/scan
  • 9.
  • 10. The Solution  Faceted search  Search  e.g. Find all contracts by Orange Slovakia  Analyze  e.g. Which department has most contracts with Orange Slovakia?  e.g. What is the contract price distribution for Orange Slovakia? …  Define penalty heuristics
  • 11. Facets  Types  term, range, histogram, statistical, geo distance $ curl -XPOST 'http://localhost:9200/rubyslava/talks/_search?pretty' -d '{ "query" : { "match_all" : { } }, "facets" : { "tags_facet" : { "terms" : { "field" : "tags", "size" : 10 } } } }'
  • 12. Facets - results { "took" : 2, … "hits" : { … }, "facets" : { "tags_facet" : { "_type" : "terms", "missing" : 0, "total" : 2, "other" : 0, "terms" : [ { "term" : "rocks", "count" : 1 }, { "term" : "elasticsearch", "count" : 1 } ] } } }
  • 13. Facets - advanced  Problem  Generate options for facets with some selected restrictions  Solution  global facet  facet_filter { "facets" : { "<FACET NAME>" : { "<FACET TYPE>" : { ... }, "global" : true, "facet_filter" : { "term" : { “supplier.untouched" : “Orange Slovakia, a.s."} } } } }
  • 14. Percolate  Problem  New contract/document added, which heuristics does it match?  Solution 1. Save heuristics/searches in percolator index 2. Percolate new documents
  • 15. Percolate $ curl -XPUT 'localhost:9200/_percolator/rubyslava/heuristic-1' -d '{ "query" : { "term" : { "tags" : "rubyslava" } } }' $ curl -XPOST 'http://localhost:9200/rubyslava/talks/_percolate' -d '{ "doc" : { "tags" : ["rubyslava", "rocks", "too"] } }‘ => {"ok":true,"matches":["heuristic-1"]}
  • 16. Scroll  Problem  New heuristic added and matches many (1K+) documents  Add heuristic to all matching documents  + Offset performance problem known in RDBMS  Solution  Use async background job  Scroll through results (a.k.a. cursor)
  • 17. Scroll $ curl -XGET 'http://localhost:9200/rubyslava/talks/_search?scroll=5m&pretty' -d '{ "query": { "match_all" : {} } } ‘ => { "_scroll_id" : "cXVlcnlUaGVuRmV0Y2g7NTs5MzpQYmlzX3I2VFJRS0dRSEhGX2t6TTRROzk0OlBiaXNfcjZUUlFLR1 FISEZfa3pNNFE7OTU6UGJpc19yNlRSUUtHUUhIRl9rek00UTs5MjpQYmlzX3I2VFJRS0dRSEhGX2t6TTRROzkxOlBiaXNfcj ZUUlFLR1FISEZfa3pNNFE7MDs=", "took" : 2, … "hits" : […], } $ curl -XGET 'http://localhost:9200/_search/scroll?scroll=5m&scroll_id=cXVlcnlUaGVuRmV0Y2g…' => more results & repeat
  • 18. Ruby Scroll API  Mimics find_each in ActiveRecord def find_each(query, &block) scroll_id = nil processed = 0 begin unless scroll_id result = initiate_scroll(query) scroll_id = result.scroll_id else result = scroll(scroll_id) end result.hits.each do |document| yield document end processed += result.hits.size end while processed < result.hits.total end
  • 19. Tutorials & Guides  http://www.slideshare.net/clintongormley/cool- bonsai-cool-an-introduction-to-elasticsearch  http://www.slideshare.net/clintongormley/terms- of-endearment-the-elasticsearch-query-dsl- explained  http://www.elasticsearch.org/guide/