2. About me
● Over a decade of experience in building software
● Lead developer/Architect at PowerToFly
3. Our journey with ElasticSearch
2014: Launched with Postgres full-text search
2015: Faceted search with ES v1.4
2016: Log monitoring system with ELK 2.3
2017: Analytics pipeline with ELK 5.5
4. Search for a search engine
Feature                                 Postgres v9.3   Sphinx v2.1   Solr v4.x   ElasticSearch v1.4
Full text search                        ✓               ✓             ✓           ✓
Support for facets                      ❌              ✓             ✓           ✓
Cluster ready                           ❌              ❌            Limited     ✓
Search in PDFs                          ❌              ❌            ✓           ✓
REST APIs                               ❌              ❌            ❌          ✓
Nested docs, parent-child relations     ❌              NA            Limited     ✓
Powerful and flexible Query DSL         ❌              NA            ❌          ✓
5. The goodness of ElasticSearch
A distributed, multitenant-capable, full-text search engine.
● Built upon battle-tested Lucene
● Powerful and flexible Query DSL
● Powerful aggregations
● REST APIs for everything
● Easy handling of nested documents and parent-child relationships
● Suitable ecosystem for data pipelines
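As one illustration of the Query DSL and aggregations listed above, a faceted search can be expressed as a single request body. This is a minimal sketch, not the talk's actual query: the field names `title`, `location`, and `remote` are hypothetical, and the body is assumed to be POSTed to `/<index>/_search`.

```python
import json


def build_faceted_query(text, facet_field, filters=None, size=10):
    """Build a search body: full-text match, optional exact filters, one facet."""
    # Each filter becomes a non-scoring `term` clause inside bool.filter.
    filter_clauses = [
        {"term": {field: value}} for field, value in (filters or {}).items()
    ]
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [{"match": {"title": text}}],  # `title` is a hypothetical field
                "filter": filter_clauses,
            }
        },
        # A `terms` aggregation returns per-value counts, i.e. the facet.
        "aggs": {"by_" + facet_field: {"terms": {"field": facet_field}}},
    }


body = build_faceted_query("python developer", "location", {"remote": True})
print(json.dumps(body, indent=2))
```

The hits and the facet counts come back in the same response, which is what makes a single round trip enough for a faceted results page.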
7. What sits where?
● Internet → Search Service → ES cluster
● Postgres DB: primary datastore for core data
● Periodic indexing job: jobs and candidates data, Postgres DB → ES cluster
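The slide does not say how the periodic indexing job decides what to copy. One common approach, shown here purely as an assumption, is to keep a timestamp watermark and re-index only rows changed since the last pass. In this sketch `rows` stands in for a Postgres query result, the `updated_at` column is hypothetical, and `index_doc` is a hypothetical callable that writes one document to the search cluster.

```python
from datetime import datetime


def run_indexing_pass(rows, last_run, index_doc):
    """Index rows changed since `last_run` and return the new watermark.

    rows: iterable of dicts with 'id' and 'updated_at' (datetime) keys.
    index_doc: callable (doc_id, doc) that stores one document (assumed).
    """
    watermark = last_run
    for row in rows:
        if row["updated_at"] > last_run:
            # Send everything except the bookkeeping column to the search index.
            doc = {key: value for key, value in row.items() if key != "updated_at"}
            index_doc(row["id"], doc)
            watermark = max(watermark, row["updated_at"])
    return watermark
```

The returned watermark would be persisted and passed in as `last_run` on the next scheduled pass, so Postgres stays the source of truth and the search index is always rebuildable from it.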
9. Log monitoring: from a third-party solution to an ELK-based one
● Logs: web & worker nodes with filebeat → logstash → ElasticSearch cluster (daily indices)
● Dashboards on Kibana
● AWS S3
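Daily indices make retention simple, because an old day can be closed or deleted as a whole index. The sketch below assumes a Logstash-style date suffix (`prefix-YYYY.MM.dd`); the `logs` prefix and the retention numbers are hypothetical, not taken from the talk.

```python
from datetime import date, timedelta


def daily_index_name(day, prefix="logs"):
    """Return the index name for one day, e.g. 'logs-2017.08.15'."""
    return "{}-{}".format(prefix, day.strftime("%Y.%m.%d"))


def indices_older_than(today, retention_days, lookback_days, prefix="logs"):
    """List daily index names past the retention window.

    Scans from (retention_days + 1) up to lookback_days days before `today`.
    """
    return [
        daily_index_name(today - timedelta(days=age), prefix)
        for age in range(retention_days + 1, lookback_days + 1)
    ]
```

A scheduled job could pass the resulting names to the close or delete index APIs.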
17. ● Deep pagination is costly with the search API: every shard has to collect and sort from + size hits
Use the scroll API where applicable
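A scroll loop can be sketched as follows. The request paths and bodies follow the scroll API, but `send(method, path, body)` is a hypothetical transport callable (assumed to perform the HTTP call and return the decoded JSON), so the example stays independent of any particular client library.

```python
def scroll_all(send, index, query, page_size=500, keep_alive="1m"):
    """Yield every hit matching `query` via the scroll API instead of from/size."""
    # First request opens the scroll context; sorting by _doc is the cheapest order.
    response = send(
        "POST",
        "/{}/_search?scroll={}".format(index, keep_alive),
        {"size": page_size, "query": query, "sort": ["_doc"]},
    )
    scroll_id = response.get("_scroll_id")
    try:
        while response["hits"]["hits"]:
            for hit in response["hits"]["hits"]:
                yield hit
            # Each follow-up request returns the next page and renews the keep-alive.
            response = send(
                "POST",
                "/_search/scroll",
                {"scroll": keep_alive, "scroll_id": scroll_id},
            )
            scroll_id = response.get("_scroll_id", scroll_id)
    finally:
        # Free the server-side scroll context once done (or if the caller stops early).
        if scroll_id:
            send("DELETE", "/_search/scroll", {"scroll_id": scroll_id})
```

Scrolling suits exports and batch jobs; it keeps a snapshot of the index open, so it is generally not a fit for user-facing "page 2, page 3" navigation.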
18. Manage your indexes
● POST /unused_index/_close
● POST /index_with_more_segments/_forcemerge
● Use the _rollover API so that hot/recent indexes can use the best servers
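The three calls above can be written as small request builders. This is a sketch under assumptions: the index and alias names are placeholders, the rollover thresholds are arbitrary, and each function only returns a `(method, path, body)` tuple rather than sending anything. Note that `_rollover` targets a write alias and only creates the next index; placing the hot index on the best servers additionally depends on shard allocation settings, which are not shown here.

```python
def close_index(index):
    """Request that closes an index that is no longer queried."""
    return ("POST", "/{}/_close".format(index), None)


def force_merge(index, max_num_segments=1):
    """Request that merges an index that is no longer written to into fewer segments."""
    path = "/{}/_forcemerge?max_num_segments={}".format(index, max_num_segments)
    return ("POST", path, None)


def rollover(write_alias, max_age="7d", max_docs=None):
    """Request that rolls the alias over to a new index once a condition is met."""
    conditions = {"max_age": max_age}
    if max_docs is not None:
        conditions["max_docs"] = max_docs
    return ("POST", "/{}/_rollover".format(write_alias), {"conditions": conditions})
```

Force-merging is best reserved for indexes that have stopped receiving writes, such as yesterday's daily index.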
19. Index performance tuning
● Disable indexing, storing, norms, and _source when you don't need them
● Use the smallest numeric type that fits the data, or map the field as keyword
● Optimize the number of primary shards
● Use bulk requests and tune their size
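The mapping and bulk tips above might look like this in practice. Everything named here is hypothetical (the `request_logs` index, its fields, the batch size), and the mapping uses the typeless form of recent ElasticSearch versions; 5.x and 6.x nest `properties` under a type name and also expect `_type` in bulk actions.

```python
import json

# Hypothetical mapping for an index that is only filtered and aggregated.
REQUEST_LOG_MAPPING = {
    "mappings": {
        # Disabling _source saves disk but rules out reindexing and updates.
        "_source": {"enabled": False},
        "properties": {
            "message": {"type": "text", "norms": False},      # no length-based scoring
            "host": {"type": "keyword"},                      # exact match only
            "status_code": {"type": "short"},                 # smallest type that fits
            "request_id": {"type": "keyword", "index": False} # not searchable
        },
    }
}


def bulk_body(index, docs):
    """Serialize (doc_id, doc) pairs into the newline-delimited body for _bulk."""
    lines = []
    for doc_id, doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


def chunked(items, size):
    """Split a list into batches so each bulk request stays a manageable size."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

A loader would call `bulk_body` once per batch from `chunked(...)` and POST each result to `/_bulk`; the right batch size is found by measuring, not by a fixed rule.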
20. Summary
● The Elastic Stack is growing and improving; see if it fits your needs
● Defaults are good only as a starting point; know what they are and tune them
● Use different indexes for different kinds of data
● Understand your needs and model your documents well