Performance Testing and Scaling Elasticsearch
Jo Draeger 27/11/2018
● Signal’s Use Case & Challenges
● Performance & Scaling Journey
● Live Experiments
Agenda
Signal: signalmedia.co @SignalHQ
Text Analytics Start-Up, founded in 2013
Media Monitoring & more
100 people, about 20 in tech/data science/product
We’re hiring!
Joachim Draeger: linkedin.com/in/joachimdraeger/ @joachimdraeger
Lead Software Engineer, joined two years ago
Terraformed Infrastructure, Tamed Elasticsearch, Built up Monitoring
currently developing full-stack on Signal’s User Management and Login security
Before: 10 years of Java
Signal & Me
Signal’s Use Case & Challenges
[Diagram: Content Provider (Print, Online, Broadcast) → Signal AI Text-Analytics Pipeline (Transformation, Deduplication, Story collation, Entity Recognition, Topic Classification, Summarisation) → User (Alerts, API)]
● PR
○ Monitor own reputation, campaigns and spokespeople
○ Monitor competition
○ Target media
○ Target topics
● Business Potentials & Risks
○ Mergers & Acquisitions
○ Corporate crisis
○ Product Launches
○ Patents
○ Tax & regulation
Use Cases
Private and Confidential
● Latest 15 months of the world’s news
● AI powered annotations
○ Entities (Apple vs apples)
○ Topics
○ Quotes
○ Sentiment
● Full text for keyword searches
● Source
● … and more
Data in Elasticsearch
● Thousands of Users with heterogeneous demands
○ Some only interested in their coverage (1 Entity)
○ Some are interested in a lot of different and specific things
○ => spiky load, sometimes caused by single user
● AI cat & mouse
○ Information needs not (yet) covered by AI annotations get modelled with keywords
○ E.g. “according to”, “said”, “declared” => Quote detection
○ E.g. positive/negative words => Sentiment
○ More and better Entities & Topics
● Queries with lots of terms are expensive!
Challenges & Usage Characteristics
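A minimal sketch of how such a keyword model might look as an Elasticsearch query. The function name and the full-text field `content` are assumptions for illustration, not taken from Signal's code. Each phrase becomes its own clause, which is why long keyword lists make a search expensive.

```python
def build_keyword_query(phrases, field="content"):
    """Build a bool query matching documents that contain any of the phrases.

    `field` is a hypothetical full-text field name used for illustration.
    """
    clauses = [{"match_phrase": {field: phrase}} for phrase in phrases]
    return {
        "query": {
            "bool": {
                "should": clauses,
                # At least one of the phrases has to occur in the document.
                "minimum_should_match": 1,
            }
        }
    }


# Keyword stand-in for quote detection, using the phrases from the slide.
quote_query = build_keyword_query(["according to", "said", "declared"])
print(len(quote_query["query"]["bool"]["should"]), "clauses")
```

A sentiment model built from positive/negative word lists would follow the same shape, only with many more clauses.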
Signal’s Performance & Scaling Journey
● Be pragmatic
● Add more nodes!
● Monitoring, identify resource bottlenecks *
● Upgrade to latest ES version
● Identify and improve expensive searches *
● Find the right machine type
● Find the right number of indices and shards *
● Build a (mental) model for query cost
Signal’s Performance & Scaling Journey
● End-user latency
● Search queue & rejected searches
● CPU
● Memory
● Garbage collection: Old Gen (new JDKs are coming!)
● IO: Ops & Bytes/s
● Field Data
Monitoring
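Several of these metrics can be read from the node stats API. The sketch below picks a few of them out of one node's entry in a `GET /_nodes/stats` response. The helper name is hypothetical, and the field paths reflect the node stats layout as commonly documented, so they should be checked against the Elasticsearch version in use. End-user latency and IO are better measured outside Elasticsearch and are not covered here.

```python
def monitoring_snapshot(node_stats):
    """Pick a few monitoring metrics out of one node's stats.

    `node_stats` is one entry of the "nodes" object returned by
    GET /_nodes/stats. Sections that are missing are reported as None.
    """
    def dig(*path):
        value = node_stats
        for key in path:
            if not isinstance(value, dict) or key not in value:
                return None
            value = value[key]
        return value

    return {
        "search_queue": dig("thread_pool", "search", "queue"),
        "search_rejected": dig("thread_pool", "search", "rejected"),
        "heap_used_percent": dig("jvm", "mem", "heap_used_percent"),
        "old_gc_millis": dig("jvm", "gc", "collectors", "old",
                             "collection_time_in_millis"),
        "fielddata_bytes": dig("indices", "fielddata", "memory_size_in_bytes"),
    }


# Hand-made sample with invented values, shaped like a node stats entry.
sample_node = {
    "thread_pool": {"search": {"queue": 3, "rejected": 17}},
    "jvm": {"mem": {"heap_used_percent": 71}},
}
print(monitoring_snapshot(sample_node))
```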
● Log all queries at source
● Miniature production
○ Proportionally fewer/smaller servers and less data
● Consider warming up caches
● Goal A: Experiment with optimisations
○ Replay in real-time
○ Watch impact with monitoring
○ Tune one thing and repeat
● Goal B: Identify expensive searches
○ Replay one search at a time
○ Filter by latency or metrics for single searches - how?
Replay Live Traffic
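The replay idea can be sketched as follows, assuming the query log has already been parsed into `(timestamp_seconds, query)` tuples. The function names and the log format are assumptions, not the tooling used at Signal. Queries are sent one after another, so this is closer to Goal B; a real-time replay for Goal A would need to issue searches concurrently.

```python
import time


def replay(entries, send, sleep=time.sleep, speed=1.0):
    """Replay logged searches, keeping the original gaps between them.

    entries: list of (timestamp_seconds, query) tuples, sorted by timestamp.
    send:    callable that executes one query.
    speed:   1.0 replays in real time, 2.0 twice as fast.
    Returns a list of (query, latency_seconds) pairs.
    """
    results = []
    previous = None
    for timestamp, query in entries:
        if previous is not None:
            sleep((timestamp - previous) / speed)
        previous = timestamp
        started = time.perf_counter()
        send(query)
        results.append((query, time.perf_counter() - started))
    return results


def slowest(results, n=10):
    """Return the n (query, latency) pairs with the highest latency."""
    return sorted(results, key=lambda pair: pair[1], reverse=True)[:n]
```

Client-side latency only answers part of the "how?" on the slide, because it does not show how much work the cluster did for a single search.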
Live Experiments
● Docker Compose Stack + Python/Shell Scripts
https://github.com/joachimdraeger/elasticsearch-performance-experiments
● The Signal Media One-Million News Articles Dataset
https://research.signalmedia.co/newsir16/signal-dataset.html
One month of articles, September 2015
● Indexed in 3 different ways:
○ Daily indices with 5 shards each, e.g. articles-daily-20150901
○ One index with 5 shards (articles-5)
○ One index with 1 shard (articles-1)
● One search with 4, one search with 16 terms
● Repeat each search 1000x
Live Experiment
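To make the difference between the three layouts concrete, the sketch below generates the daily index names and counts the primary shards a search over the whole month has to visit in each layout. It assumes one daily index for each of the 30 days of September 2015; the wildcard pattern `articles-daily-*` is an assumption about how the daily indices would be addressed.

```python
from datetime import date, timedelta


def daily_index_names(first_day, days, prefix="articles-daily-"):
    """Return index names such as articles-daily-20150901, one per day."""
    return [
        prefix + (first_day + timedelta(days=i)).strftime("%Y%m%d")
        for i in range(days)
    ]


september = daily_index_names(date(2015, 9, 1), 30)

# Primary shards visited by a search over the whole month, per layout.
layouts = {
    "articles-daily-*": len(september) * 5,  # 30 indices with 5 shards each
    "articles-5": 5,
    "articles-1": 1,
}

for target, shards in layouts.items():
    print(f"{target}: {shards} shard(s)")
```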
What does this mean??
Monitoring for Performance Test?
curl 'localhost:9200/_nodes/stats?pretty'
{
  "cluster_name" : "docker-cluster",
  "nodes" : {
    "napxVuf_QnO8T7Z41HBKTg" : {
      "ip" : "192.168.80.2:9300",
      ...
      "indices" : {
        "search" : {
          "query_total" : 3900440,
          "query_time_in_millis" : 1311173
        },
        "query_cache" : {
          "hit_count" : 2394107,
          "miss_count" : 212573,
          "evictions" : 0
        }
      },
      "process" : {
        "cpu" : {
          "total_in_millis" : 4726640
        },
Metric counters for experiments
1. Get metric counter(s)
2. Execute search (n-times)
3. Get metric counter(s)
4. Calculate difference
=> metrics.py
Repeat searches n-times for more precise readings.
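The four steps can be sketched as below. This is not the `metrics.py` from the repository, only an illustration of the same idea, and the function names and the base URL `http://localhost:9200` are assumptions. The counter paths are the ones shown in the `_nodes/stats` excerpt.

```python
import json
import urllib.request

# Counters from the _nodes/stats excerpt, addressed by their JSON path.
COUNTERS = {
    "query_total": ("indices", "search", "query_total"),
    "query_time_in_millis": ("indices", "search", "query_time_in_millis"),
    "query_cache_hit_count": ("indices", "query_cache", "hit_count"),
    "query_cache_miss_count": ("indices", "query_cache", "miss_count"),
    "cpu_total_in_millis": ("process", "cpu", "total_in_millis"),
}


def read_counters(stats):
    """Sum the selected counters over all nodes of a _nodes/stats response."""
    totals = dict.fromkeys(COUNTERS, 0)
    for node in stats["nodes"].values():
        for name, path in COUNTERS.items():
            value = node
            for key in path:
                value = value[key]
            totals[name] += value
    return totals


def fetch_stats(base_url="http://localhost:9200"):
    """Fetch node stats from a cluster (the URL is an assumed local default)."""
    with urllib.request.urlopen(base_url + "/_nodes/stats") as response:
        return json.load(response)


def measure(run_searches, fetch=fetch_stats):
    """Steps 1-4: read counters, run the searches, read again, subtract."""
    before = read_counters(fetch())
    run_searches()
    after = read_counters(fetch())
    return {name: after[name] - before[name] for name in COUNTERS}
```

Dividing each difference by the number of repetitions gives a per-search reading. The search counters are tracked at shard level, so one request that touches several shards may raise `query_total` by more than one.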
● Docker Compose Stack
● Signal’s 1M articles data set
● Scripts for indexing
● 2 searches around VW diesel
● Script to run 1000 searches
● metrics.py to collect stats
● On GitHub:
tinyurl.com/esperf-2018
Live Experiment
Results Performance Experiment
Summary
● The default number of shards will change from [5] to [1] in 7.0.0
● Huge shards are more efficient to search (50GB!)
● One shard per server!?
● Huge shards can be difficult to move/recover
● Multiple shards => parallel indexing/searching
● Replicas for failover and balancing load
● Consider monthly/bi-weekly/quarterly/yearly indices
Last words on shards...
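A rough way to compare index granularities is to count how many indices a retention period produces. The sketch below does this for the 15 months of news mentioned earlier. The period counts per year are approximations (no leap years, 26 bi-weekly periods), and the function name is hypothetical.

```python
PERIODS_PER_YEAR = {
    "daily": 365,
    "bi-weekly": 26,
    "monthly": 12,
    "quarterly": 4,
    "yearly": 1,
}


def index_count(retention_months, granularity):
    """Approximate number of indices needed to cover the retention period."""
    per_year = PERIODS_PER_YEAR[granularity]
    # Ceiling division without floats: a partly filled period still needs an index.
    return -(-retention_months * per_year // 12)


for granularity in PERIODS_PER_YEAR:
    print(f"{granularity}: {index_count(15, granularity)} indices")
```

Multiplying each count by the shards per index gives the number of shards a search across the full retention period has to visit, which is the cost the bullets above weigh against parallelism and ease of recovery.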
● Metric counters are great to measure experiments
● Shards are expensive
● Terms too!
● Elasticsearch use cases are diverse - it depends!
Summary
https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster
https://www.elastic.co/blog/signal-media-optimizing-for-more-elasticsearch-power-with-less-elasticsearch-cluster
Further Sources
Any Questions?
tinyurl.com/esperf-2018
@joachimdraeger
Thank you!
We are hiring!
tinyurl.com/signal-engineering-video
linkedin.com/company/signalmedia/
signalmedia.co/solve-big-challenges/
tinyurl.com/esperf-2018
@joachimdraeger
