ELASTICSEARCH INTRODUCTION
Kristijan Duvnjak & Mladen Maravić
Zagreb, 27.03.2015.
Elasticsearch as a search alternative to a relational database
PART 1
What is Elasticsearch?
What is Elasticsearch (ES)?
Document-oriented schema-free "database"
Built on top of Apache Lucene
Real-time search and data analytics
Full-text search
Distributed (horizontal scalability)
High availability
REST API
"Open Source (Apache 2) distributed RESTful search engine built on top of Lucene"
ES for relational database users...
Oracle       Elasticsearch
Database     Index
Partition    Shard
Table        Type
Row          Document
Column       Field
Schema       Mapping
Index        - (everything is indexed)
SQL          Query DSL
Clustering – single node cluster
Node = running instance of ES
Cluster = 1+ nodes with the same cluster.name
Every cluster has 1 master node
Clients talk to any node in the cluster
1 Cluster can have any number of indexes
About indexes & shards
All data is stored inside one or more indexes
Index has one or more shards (change requires reindexing)
One index is one folder somewhere on disk
Backup an index? Just tar/zip the folder...
Each shard is one full instance of Lucene
Each shard can have zero or more replicas (can be changed at any time)
[Diagram: an index and its shards]
Clustering – adding a second node
Example above:
3 indexes
Each index has one primary (P) and one replica (R) shard
Clustering – adding a third node
More primary shards:
faster indexing
more scale
More replicas:
faster searching
more failover
About documents...
Documents are JSON-based
Schema-free, but not necessarily!
If no schema:
ES guesses field type
and indexes it
With schema (or explicit mapping):
Mapping applies to specific document type (type is just a label)
Mapping defines the following for each field:
─ kind (string, number, date...)
─ to index or not
─ to store data or not
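The "no schema" case can be pictured as a small type-guessing function. This is only a sketch, not Elasticsearch's actual implementation: `guess_field_type` and `guess_mapping` are hypothetical names, and the rules (JSON integer to long, float to double, ISO-8601-looking string to date, any other string to string) are a simplified assumption in the style of ES 1.x mappings.

```python
from datetime import datetime


def guess_field_type(value):
    """Guess an ES 1.x-style field type from a parsed JSON value (simplified)."""
    # bool must be checked before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        try:
            # Assumption: only this one timestamp layout is treated as a date.
            datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
            return "date"
        except ValueError:
            return "string"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"unsupported value: {value!r}")


def guess_mapping(document):
    """Build a mapping-like dict with one guessed type per top-level field."""
    return {
        "properties": {
            name: {"type": guess_field_type(value)}
            for name, value in document.items()
        }
    }


comment = {
    "user_id": 1,
    "date": "2015-04-01T13:12:12",
    "comment": "What's so cool about Elasticsearch?",
}
mapping = guess_mapping(comment)
```

An explicit mapping replaces this guessing with field definitions you control, which matters when a guess would be wrong (for example, an account number that should stay a non-analyzed string).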
About documents...
Each document has an ID (auto-generated or manually assigned)
You can force placement of a document into a specific shard – routing!
Versioning is available – optimistic concurrency control!
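Routing boils down to "hash the routing value, take it modulo the number of primary shards". A minimal sketch of that idea follows; `shard_for` is a hypothetical helper, and `zlib.crc32` is only a stand-in for whatever hash function Elasticsearch uses internally. By default the routing value is the document ID.

```python
import zlib


def shard_for(routing_value, number_of_primary_shards):
    """Pick a shard as hash(routing) modulo the number of primary shards."""
    # crc32 is a stand-in hash: stable across runs, unlike Python's hash().
    digest = zlib.crc32(str(routing_value).encode("utf-8"))
    return digest % number_of_primary_shards


# Documents indexed with the same routing value land on the same shard.
same_shard = shard_for("account-42", 5) == shard_for("account-42", 5)

# Changing the shard count changes where existing documents "should" live.
placements_5 = [shard_for(f"account-{i}", 5) for i in range(100)]
placements_6 = [shard_for(f"account-{i}", 6) for i in range(100)]
moved = sum(1 for a, b in zip(placements_5, placements_6) if a != b)
```

The second half illustrates why changing the number of primary shards requires reindexing: with a different modulus, most documents would be looked up on a shard that does not hold them.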
Index details
inverted index
Documents:
  doc 1: Elasticsearch Server 1.0
  doc 2: Mastering Elasticsearch
  doc 3: Apache Solr 4 Cookbook

Term            Count   Document
1.0             1       <1>
4               1       <3>
apache          1       <3>
cookbook        1       <3>
elasticsearch   2       <1>,<2>
mastering       1       <2>
server          1       <1>
solr            1       <3>
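The table above can be reproduced with a few lines of code. This is a toy sketch, assuming the simplest possible analysis (lowercase, split on whitespace); `build_inverted_index` is a hypothetical name, and Lucene's real index structures are far more elaborate.

```python
from collections import defaultdict

titles = {
    1: "Elasticsearch Server 1.0",
    2: "Mastering Elasticsearch",
    3: "Apache Solr 4 Cookbook",
}


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of document IDs containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    # Sort terms so lookups and listings are deterministic.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


index = build_inverted_index(titles)
for term, doc_ids in index.items():
    # Columns: term, number of documents containing it, document IDs.
    print(f"{term:<16}{len(doc_ids):<8}{doc_ids}")
```

Searching for a term is then a dictionary lookup, e.g. `index["elasticsearch"]`, instead of a scan over every document.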
Indexing example
POST /blog/blog_comment?routing=1
{
  "user_id": 1,
  "date": "2015-04-01T13:12:12",
  "comment": "What’s so cool about Elasticsearch?"
}

GET /blog/_search
Response:
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "blog",
        "_type": "blog_comment",
        "_id": "AUzhH9M9HW_GzrF8oLAj",
        "_score": 1,
        "_source": {
          "user_id": 1,
          "date": "2015-04-01T13:12:12",
          "comment": "What’s so cool about Elasticsearch?"
        }
      }
    ]
  }
}

GET /blog/_mapping
Response:
{
  "blog": {
    "mappings": {
      "blog_comment": {
        "properties": {
          "comment": {
            "type": "string"
          },
          "date": {
            "type": "date",
            "format": "dateOptionalTime"
          },
          "user_id": {
            "type": "long"
          }
        }
      }
    }
  }
}
Storing data - indexing
data input: REST, Java API, Rivers*
data analysis: tokenizer and one or more filters
types of filters:
lowercase filter – lowercases all tokens
synonym filter – replaces one token with another based on synonym rules
language stemming filters – reduce tokens to their root or base form (the stem)
different data storing needs
string fields: analyzed or not_analyzed
the _all field
in-memory field data or doc values
segments, segment merging, throttling
routing, indexing with routing
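The analysis step from this slide (a tokenizer followed by one or more filters) can be sketched as a simple pipeline. Everything here is illustrative: the function names, the synonym rule and the suffix-stripping "stemmer" are assumptions, not Elasticsearch's analyzers.

```python
import re


def tokenize(text):
    """Split text into word tokens (a stand-in for a real tokenizer)."""
    return re.findall(r"\w+", text)


def lowercase_filter(tokens):
    return [t.lower() for t in tokens]


def synonym_filter(tokens, synonyms):
    """Replace a token with its synonym when a rule exists."""
    return [synonyms.get(t, t) for t in tokens]


def naive_stem_filter(tokens):
    """Strip a few English suffixes; real stemmers are far more careful."""
    stemmed = []
    for t in tokens:
        for suffix in ("ing", "es", "s"):
            # Keep at least three characters so short words stay intact.
            if t.endswith(suffix) and len(t) - len(suffix) >= 3:
                t = t[: -len(suffix)]
                break
        stemmed.append(t)
    return stemmed


def analyze(text, synonyms):
    """Tokenizer first, then each filter in order."""
    tokens = tokenize(text)
    tokens = lowercase_filter(tokens)
    tokens = synonym_filter(tokens, synonyms)
    return naive_stem_filter(tokens)


SYNONYMS = {"quick": "fast"}  # hypothetical synonym rule
terms = analyze("Quick searches", SYNONYMS)
```

The resulting terms are what ends up in the inverted index, so the order of the filters matters: here the synonym rule is written in lowercase because it runs after the lowercase filter.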
So, we can store documents and then what?!?
We query them!
All the usual stuff (think of WHERE in SQL)
Full text search with support for:
highlighting
stemming
ngrams & edge-ngrams
Aggregations: term facets, date histograms, ranges
Geo search: bounding box, distance, distance ranges, polygons
Percolators (or reverse-search!)
Query details
search types (query_then_fetch, query_and_fetch ...)
same type of analysis as indexing
explain plan
sorting, aggregating data with in-memory or on-disk values
search filters
Boolean
And/Or/Not
filter cache, BitSets
routing, searching with routing
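Why bitsets make cached filters cheap to combine: each filter result is one bit per document, and And/Or/Not become bitwise operations. The sketch below uses plain Python integers as bitsets; the documents and helper names are made up for illustration.

```python
def bitset_for(docs, predicate):
    """Return an int whose bit i is set when docs[i] matches the predicate."""
    bits = 0
    for position, doc in enumerate(docs):
        if predicate(doc):
            bits |= 1 << position
    return bits


def matching_positions(bits):
    """List the document positions whose bit is set."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


docs = [
    {"user_id": 1, "year": 2014},
    {"user_id": 1, "year": 2015},
    {"user_id": 2, "year": 2015},
    {"user_id": 3, "year": 2013},
]
all_docs = (1 << len(docs)) - 1

# Each filter is evaluated once; the resulting bitset can be cached and reused.
by_user_1 = bitset_for(docs, lambda d: d["user_id"] == 1)
in_2015 = bitset_for(docs, lambda d: d["year"] == 2015)

both = by_user_1 & in_2015            # And
either = by_user_1 | in_2015          # Or
not_user_1 = all_docs & ~by_user_1    # Not, masked to existing documents
```

Filters only answer yes or no per document and compute no relevance score, which is what makes their results representable as bits and cacheable.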
PBZ use case
turnovers by account: 600M documents, 200M/year
routing by account number
indexing performance, 30k-40k documents per second
DB performance in seconds, ES performance in ms (3500 queries/sec):
find last 100 turnovers for a given account number: < 50 ms
find last 100 turnovers for a given account number where description contains some words: < 100 ms
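For illustration, the two lookups above might be expressed as search bodies like the one built below. This is a sketch, not the actual PBZ queries: the field names `account_number`, `description` and `date` are hypothetical, and the `filtered` query is the ES 1.x-era form (later versions use a `bool` query with a `filter` clause instead).

```python
import json


def last_turnovers_query(account_number, words=None, size=100):
    """Build a search body for the newest turnovers of one account (ES 1.x style)."""
    # Exact match on the account: a filter, so no scoring is needed.
    account_filter = {"term": {"account_number": account_number}}
    if words:
        # Full-text match on the description; all words must be present.
        text_query = {"match": {"description": {"query": words, "operator": "and"}}}
    else:
        text_query = {"match_all": {}}
    return {
        "query": {"filtered": {"query": text_query, "filter": account_filter}},
        "sort": [{"date": {"order": "desc"}}],
        "size": size,
    }


body = last_turnovers_query("1234567890", words="rent april")
print(json.dumps(body, indent=2))
```

Since the data is routed by account number, the same value would also be passed as the `routing` parameter of the search request, so that only the shard holding that account is queried.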
PART 2
Cluster architecture
PBZ ES cluster architecture (cluster per datacenter)
Elasticsearch cluster:
  MASTER node 1, MASTER node 2, MASTER node 3
  CLIENT node 1, CLIENT node 2
  DATA node 1, DATA node 2
NETWORK DISPATCHER

PBZ ES cluster architecture (cluster per datacenter)
Elasticsearch cluster:
  MASTER node
  CLIENT node
  DATA node 1, DATA node 2
NETWORK DISPATCHER
Elasticsearch Administration
plugins
Marvel – monitoring console (GC, throttling, CPU, memory, heap, search/indexing statistics ...)
Sense – REST UI to Elasticsearch
custom plugins (JDBC rivers ...)
security
Apache Web server
Elasticsearch Shield
speeding up queries using warmers
PART 3
ELK
PART 4
Q & A
Mladen Maravić & Kristijan Duvnjak