Growing with ElasticSearch
Devi A S L @ RootConf
11th May, 2018
About me
● Over a decade of experience in building software
● Lead developer/Architect at PowerToFly
Our journey with ElasticSearch
2014: launched with Postgres Full text search
2015: Faceted Search with ES v1.4
2016: Log monitoring system with ELK 2.3
2017: Analytics pipeline with ELK 5.5
Search for a search engine
Feature                              | Postgres v9.3 | Sphinx v2.1 | Solr v4.x | ElasticSearch v1.4
Full text search                     | ✓             | ✓           | ✓         | ✓
Support for facets                   | ❌            | ✓           | ✓         | ✓
Cluster ready                        | ❌            | ❌          | Limited   | ✓
Search in PDFs                       | ❌            | ❌          | ✓         | ✓
REST APIs                            | ❌            | ❌          | ❌        | ✓
Nested docs, Parent-Child relations  | ❌            | NA          | Limited   | ✓
Powerful and Flexible Query DSL      | ❌            | NA          | ❌        | ✓
The goodness of ElasticSearch
A distributed, multitenant-capable, full-text search engine.
● Built upon battle-tested Lucene
● Powerful and flexible Query DSL
● Powerful aggregations
● REST APIs for everything
● Ease with nested documents and parent-child relationships
● Suitable ecosystem for data pipelines
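As a rough illustration of the Query DSL and aggregations points, a faceted job search could be expressed as a request body like the one built below. The field names (`title`, `location`, `skills`) are hypothetical and not taken from the talk; `bool`, `match`, `term` and the `terms` aggregation are standard Query DSL constructs.

```python
import json


def faceted_search_body(text, location=None, facet_fields=("location", "skills"), size=10):
    """Build a search body: full-text query, optional filter, one terms aggregation per facet."""
    query = {"bool": {"must": [{"match": {"title": text}}]}}
    if location is not None:
        # Filter clauses restrict the result set without contributing to the score.
        query["bool"]["filter"] = [{"term": {"location": location}}]
    # Each terms aggregation returns value counts that a UI can render as a facet.
    aggs = {f"by_{field}": {"terms": {"field": field}} for field in facet_fields}
    return {"query": query, "aggs": aggs, "size": size}


body = faceted_search_body("python developer", location="remote")
print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent as the JSON body of a `_search` request; the facet counts come back under `aggregations` in the response.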
What sits where?
[Architecture diagram: Internet, Search Service, ES cluster, periodic indexing job, Postgres DB (primary datastore for core data: jobs, candidates data)]
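One way a periodic indexing job like the one in the diagram might feed the ES cluster is through the `_bulk` API, which takes newline-delimited JSON. The sketch below is an assumption about how such a job could be written, not the speaker's implementation: the row shape, the index name `jobs` and the byte limit are all hypothetical. Older releases (1.x to 6.x, as used in the talk) also expect a `_type` in the action line or URL, which is omitted here.

```python
import json


def bulk_lines(rows, index_name):
    """Yield (action, source) JSON strings for the _bulk API, one pair per row."""
    for row in rows:
        action = {"index": {"_index": index_name, "_id": row["id"]}}
        yield json.dumps(action), json.dumps(row)


def bulk_payloads(rows, index_name, max_bytes=5 * 1024 * 1024):
    """Group action/source pairs into request bodies of at most max_bytes each.

    A single pair larger than max_bytes is still emitted on its own.
    """
    chunk, size = [], 0
    for action, source in bulk_lines(rows, index_name):
        pair = action + "\n" + source + "\n"
        pair_size = len(pair.encode("utf-8"))
        if chunk and size + pair_size > max_bytes:
            yield "".join(chunk)
            chunk, size = [], 0
        chunk.append(pair)
        size += pair_size
    if chunk:
        yield "".join(chunk)


# Hypothetical rows, standing in for what a Postgres query would return.
rows = [{"id": i, "title": f"job {i}"} for i in range(3)]
payloads = list(bulk_payloads(rows, "jobs", max_bytes=80))
```

Each payload would be POSTed to `/_bulk` with the `application/x-ndjson` content type; the right chunk size is workload-dependent and worth measuring.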
Log monitoring with ELK
Log monitoring: From a third-party solution to ELK based
[Diagram: web & worker nodes with filebeat, logstash, ElasticSearch cluster (daily indices), dashboards on Kibana, logs, AWS S3]
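Daily indices make retention simple, because old data can be closed or deleted one whole index at a time. The helper below is a minimal sketch of that idea; the `logstash-YYYY.MM.dd` naming mirrors the classic default of Logstash's Elasticsearch output, while the retention window and function names are hypothetical.

```python
from datetime import date, timedelta


def daily_index_name(day, prefix="logstash"):
    """Return the index name for one day, in the form prefix-YYYY.MM.dd."""
    return f"{prefix}-{day:%Y.%m.%d}"


def indices_past_retention(today, keep_days, oldest, prefix="logstash"):
    """List daily index names from `oldest` up to, but excluding, the retention cutoff."""
    cutoff = today - timedelta(days=keep_days)
    names = []
    day = oldest
    while day < cutoff:
        names.append(daily_index_name(day, prefix))
        day += timedelta(days=1)
    return names


expired = indices_past_retention(date(2018, 5, 11), keep_days=7, oldest=date(2018, 5, 1))
```

The returned names could then be passed to a close or delete request, for example from a nightly cron job.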
Analytics pipeline with ELK stack
[Diagram: Web Application, web nodes with filebeat, logstash, ElasticSearch cluster (daily indices), Kibana dashboards, user activity, recommendation engine]
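A dashboard or recommendation job reading user activity from such a pipeline would typically rely on aggregations. The sketch below builds a per-day histogram with a unique-user count; `event_type` and `user_id` are hypothetical field names, and `@timestamp` is the field Logstash adds to events. The `interval` parameter matches the 5.x era of the talk; newer Elasticsearch versions use `calendar_interval` instead.

```python
def daily_activity_body(event_type, since="now-30d/d"):
    """Build a search body that counts events and unique users per day."""
    return {
        "size": 0,  # only aggregation results are needed, not the hits themselves
        "query": {
            "bool": {
                "filter": [
                    {"term": {"event_type": event_type}},
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        },
        "aggs": {
            "per_day": {
                "date_histogram": {"field": "@timestamp", "interval": "day"},
                "aggs": {"unique_users": {"cardinality": {"field": "user_id"}}},
            }
        },
    }


activity_body = daily_activity_body("job_view")
```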
Handling growth
Search performance tuning
● Enable the slow query log, customizable per index
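The slow query log is controlled through dynamic index settings, so thresholds can differ per index. A minimal sketch of such a settings update follows; the threshold values and the index name are hypothetical and should be chosen from observed latencies.

```python
def slowlog_settings(query_warn="5s", query_info="2s", fetch_warn="1s"):
    """Return index settings that turn on search slow logging at the given thresholds."""
    return {
        "index.search.slowlog.threshold.query.warn": query_warn,
        "index.search.slowlog.threshold.query.info": query_info,
        "index.search.slowlog.threshold.fetch.warn": fetch_warn,
    }


def slowlog_request(index_name, **thresholds):
    """Describe the HTTP call as (method, path, body); sending it is left to the caller."""
    return ("PUT", f"/{index_name}/_settings", slowlog_settings(**thresholds))


method, path, settings = slowlog_request("jobs", query_warn="1s")
```

Queries that exceed a threshold are written to the slow log on the node holding the shard, which helps identify the expensive queries before tuning them.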
Document modelling
● Avoid nested documents, if you can
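Whether nested documents can be avoided depends on whether queries need to correlate fields within one inner object. The plain-Python model below imitates how an array of objects is indexed with the default `object` type: inner fields are flattened into multi-valued fields, so the link between values of the same object is lost. The `skills` data is hypothetical, and this is a simulation of the behaviour rather than Elasticsearch code.

```python
from collections import defaultdict


def flatten_object_array(field, objects):
    """Flatten a list of objects into multi-valued fields, as the `object` type does."""
    flat = defaultdict(list)
    for obj in objects:
        for key, value in obj.items():
            flat[f"{field}.{key}"].append(value)
    return dict(flat)


def matches_all(flat_doc, conditions):
    """True if every (field, value) condition is met somewhere in the flattened document."""
    return all(value in flat_doc.get(field, []) for field, value in conditions.items())


skills = [{"name": "python", "level": "expert"}, {"name": "go", "level": "beginner"}]
flat = flatten_object_array("skills", skills)
# No single skill is both "go" and "expert", yet the flattened document matches.
cross_match = matches_all(flat, {"skills.name": "go", "skills.level": "expert"})
```

If such cross-object matches are acceptable, or the data can be denormalized into flat fields, the cheaper `object` mapping is enough. The `nested` type avoids the false match by indexing each inner object as a separate hidden document, which is where the extra cost comes from.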
Use scroll API where applicable
● Deep pagination is costly with the search API
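The scroll pattern can be sketched as a loop that keeps requesting pages until one comes back empty. In the sketch below the three callables are hypothetical stand-ins for the HTTP calls: `start_scroll` for `POST /<index>/_search?scroll=1m`, `next_page` for `POST /_search/scroll` with the scroll id, and `clear_scroll` for `DELETE /_search/scroll`. The in-memory fake exists only to make the example runnable without a cluster.

```python
def scroll_all(start_scroll, next_page, clear_scroll=None):
    """Yield every hit by following scroll ids until an empty page is returned."""
    page = start_scroll()
    scroll_id = page.get("_scroll_id")
    try:
        while page["hits"]["hits"]:
            yield from page["hits"]["hits"]
            page = next_page(scroll_id)
            # The scroll id may change between pages, so always use the latest one.
            scroll_id = page.get("_scroll_id", scroll_id)
    finally:
        # Free the server-side search context once iteration ends.
        if clear_scroll is not None and scroll_id is not None:
            clear_scroll(scroll_id)


def make_fake_cluster(docs, page_size):
    """Return (start_scroll, next_page) callables that page through an in-memory list."""
    state = {"pos": 0}

    def page():
        hits = docs[state["pos"]:state["pos"] + page_size]
        state["pos"] += len(hits)
        return {"_scroll_id": "fake-scroll-id", "hits": {"hits": hits}}

    return page, lambda scroll_id: page()


docs = [{"_id": str(i)} for i in range(5)]
start, nxt = make_fake_cluster(docs, page_size=2)
cleared = []
fetched = list(scroll_all(start, nxt, cleared.append))
```

Scroll suits exports and batch jobs rather than user-facing paging, since it holds a snapshot of the index open for the duration of the scroll.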
Manage your indexes
● Close unused indexes: POST /unused_index/_close
● Force-merge indexes with many segments: POST /index_with_more_segments/_forcemerge
● Use the _rollover API to let hot/recent indexes use the best servers
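These housekeeping calls are simple enough to script. The sketch below only describes each request as a `(method, path, body)` tuple and leaves the HTTP transport to the caller; the alias name and condition values are hypothetical, while `max_num_segments`, `max_age` and `max_docs` are real parameters of the force-merge and rollover APIs.

```python
def close_index(index_name):
    """Close an index so it stops consuming heap while staying on disk."""
    return ("POST", f"/{index_name}/_close", None)


def force_merge(index_name, max_num_segments=1):
    """Merge an index down to a few segments; best done once it is no longer written to."""
    return ("POST", f"/{index_name}/_forcemerge?max_num_segments={max_num_segments}", None)


def rollover(alias, max_age="1d", max_docs=None):
    """Roll the write alias over to a new index when any condition is met."""
    conditions = {"max_age": max_age}
    if max_docs is not None:
        conditions["max_docs"] = max_docs
    return ("POST", f"/{alias}/_rollover", {"conditions": conditions})


requests_to_send = [
    close_index("unused_index"),
    force_merge("index_with_more_segments"),
    rollover("logs_write", max_age="1d", max_docs=5000000),
]
```

Rollover operates on an alias that points at the current write index, so writers keep using one stable name while new indexes are created behind it.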
Index performance tuning
● Disable indexing, storing, norms and _source when you don’t need them
● Use the smallest numeric type that fits the data, or map the field as keyword
● Optimize the number of primary shards
● Use bulk requests and optimize their size
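Several of these tips come down to choices in the index settings and mapping. The body below is a hedged sketch with hypothetical field names, written in the typeless mapping format of Elasticsearch 7 and later; in the 5.x and 6.x versions mentioned in the talk, `properties` would sit under a mapping type name.

```python
def job_index_body(primary_shards=1, replicas=1):
    """Build a create-index body that applies the mapping-level tuning tips."""
    return {
        "settings": {
            # The primary shard count is fixed at creation time, so choose it deliberately.
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
        },
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                # Norms are only needed for length-based scoring on this field.
                "description": {"type": "text", "norms": False},
                # Kept in _source but never searched, so skip building an index for it.
                "internal_notes": {"type": "text", "index": False},
                # A small integer range fits in the smallest numeric type.
                "years_experience": {"type": "byte"},
                # Identifiers used for exact matching rather than range queries.
                "company_id": {"type": "keyword"},
            }
        },
    }


index_body = job_index_body(primary_shards=2)
```

Disabling `_source` is also possible but removes the ability to reindex or update documents from Elasticsearch itself, so it is left enabled in this sketch.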
Summary
● Elastic stack is growing and improving - see if it fits your needs
● Defaults are good only to start - know what they are and tune them
● Different indexes for different data
● Understand your needs and model your documents well
Thank You!
@asldevi
