- Signal is a text analytics startup that uses Elasticsearch to analyze large volumes of news articles and provide search and analytics services to customers.
- Signal faced the challenge of providing low-latency search to thousands of heterogeneous users generating large, spiky query loads, while continuing to improve its AI models.
- Joachim Draeger led experiments with Elasticsearch configurations and monitoring to optimize performance and scaling. He found that fewer, larger shards and fewer search terms improved query latency, and that proper monitoring was essential for identifying bottlenecks and expensive searches.
2. ● Signal’s Use Case & Challenges
● Performance & Scaling Journey
● Live Experiments
Agenda
3. Signal: signalmedia.co @SignalHQ
Text Analytics Start-Up, founded in 2013
Media Monitoring & more
100 people, about 20 in tech/data science/product
We’re hiring!
Joachim Draeger: linkedin.com/in/joachimdraeger/ @joachimdraeger
Lead Software Engineer, joined two years ago
Terraformed Infrastructure, Tamed Elasticsearch, Built up Monitoring
Currently doing full-stack development on Signal's user management and login security
Before: 10 years of Java
Signal & Me
8. ● Latest 15 months of the world’s news
● AI powered annotations
○ Entities (Apple vs apples)
○ Topics
○ Quotes
○ Sentiment
● Full text for keyword searches
● Source
● … and more
Data in Elasticsearch
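An annotated article might look roughly like this. This is a hypothetical sketch: the field names and values are illustrative assumptions, not Signal's actual mapping, but they cover the annotations listed above (entities, topics, quotes, sentiment, full text, source).

```python
import json

# Hypothetical shape of an AI-annotated news article document.
# Field names are illustrative assumptions, not Signal's real mapping.
article = {
    "title": "VW emissions scandal widens",
    "content": "Volkswagen said on Tuesday ...",  # full text for keyword searches
    "source": "Example Newswire",
    "published": "2015-09-22T09:15:00Z",
    "entities": [
        # disambiguated: "Apple" the company vs "apples" the fruit
        {"name": "Volkswagen", "type": "organisation"}
    ],
    "topics": ["automotive", "regulation"],
    "quotes": ["We have totally screwed up."],
    "sentiment": "negative",
}

print(json.dumps(article, indent=2))
```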
9. ● Thousands of Users with heterogeneous demands
○ Some only interested in their coverage (1 Entity)
○ Some are interested in a lot of different and specific things
○ => spiky load, sometimes caused by a single user
● AI cat & mouse
○ Information needs not (yet) covered by AI annotations get modelled with keywords
○ E.g. “according to”, “said”, “declared” => Quote detection
○ E.g. positive/negative words => Sentiment
○ More and better Entities & Topics
● Queries with lots of terms are expensive!
Challenges & Usage Characteristics
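The "AI cat & mouse" point can be made concrete with a sketch of a keyword-modelled quote-detection query (a hypothetical query, not Signal's production DSL): each keyword becomes its own clause, so the query's cost grows with the number of terms.

```python
# Sketch: modelling an information need (quote detection) with keywords
# until an AI annotation covers it. Each marker becomes a query clause,
# which is why many-term queries are expensive. Hypothetical query body.
quote_markers = ["according to", "said", "declared"]

query = {
    "query": {
        "bool": {
            "should": [
                {"match_phrase": {"content": marker}} for marker in quote_markers
            ],
            "minimum_should_match": 1,
        }
    }
}

# One clause per keyword - cost scales with the term count.
print(len(query["query"]["bool"]["should"]))
```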
11. ● Be pragmatic
● Add more nodes!
● Monitoring, identify resource bottlenecks *
● Upgrade to latest ES version
● Identify and improve expensive searches *
● Find the right machine type
● Find the right number of indices and shards *
● Build a (mental) model for query cost
Signal’s Performance & Scaling Journey
12. ● End-user latency
● Search queue & rejected searches
● CPU
● Memory
● Garbage collection: Old Gen (new JDKs are coming!)
● IO: Ops & Bytes/s
● Field Data
Monitoring
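Most of the metrics above can be pulled from the Elasticsearch nodes stats API. A minimal sketch of extracting search-queue pressure, assuming a hand-made sample payload (the field paths follow the `_nodes/stats` response structure):

```python
# Hand-made sample of an Elasticsearch _nodes/stats response; only the
# fields relevant to the monitored signals above are included.
sample_stats = {
    "nodes": {
        "node1": {
            "thread_pool": {"search": {"queue": 12, "rejected": 3}},
            "jvm": {"mem": {"heap_used_percent": 71}},
            "indices": {"fielddata": {"memory_size_in_bytes": 104857600}},
        }
    }
}

def search_pressure(stats):
    """Sum search queue length and rejected searches across all nodes."""
    queue = rejected = 0
    for node in stats["nodes"].values():
        pool = node["thread_pool"]["search"]
        queue += pool["queue"]
        rejected += pool["rejected"]
    return queue, rejected

print(search_pressure(sample_stats))  # -> (12, 3)
```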
13. ● Log all queries at source
● Miniature production
○ Proportionally smaller: fewer/smaller servers, less data
● Consider warming up caches
● Goal A: Experiment with optimisations
○ Replay in real-time
○ Watch impact with monitoring
○ Tune one thing and repeat
● Goal B: Identify expensive searches
○ Replay one search at a time
○ Filter by latency or metrics for single searches - how?
Replay Live Traffic
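Replaying in real time means preserving the original inter-arrival gaps (and thus the spikes). A minimal sketch of the pacing logic, assuming the query log carries ISO-format timestamps (the log format is an assumption):

```python
from datetime import datetime

def replay_delays(timestamps):
    """Seconds to wait before firing each replayed query, so the
    original traffic shape - including spikes - is preserved."""
    times = [datetime.fromisoformat(t) for t in timestamps]
    return [0.0] + [(b - a).total_seconds() for a, b in zip(times, times[1:])]

# Two queries in the same second (a spike), then a 2 s gap.
log = ["2015-09-01T10:00:00", "2015-09-01T10:00:00", "2015-09-01T10:00:02"]
print(replay_delays(log))  # -> [0.0, 0.0, 2.0]
```

A replay script would `time.sleep()` each delay before sending the corresponding query; for Goal B it would instead send one query at a time and record per-search metrics.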
15. ● Docker Compose Stack + Python/Shell Scripts
https://github.com/joachimdraeger/elasticsearch-performance-experiments
● The Signal Media One-Million News Articles Dataset
https://research.signalmedia.co/newsir16/signal-dataset.html
One month of articles, September 2015
● Indexed in 3 different ways:
○ Daily indices with 5 shards each, e.g. articles-daily-20150901
○ One index with 5 shards (articles-5)
○ One index with 1 shard (articles-1)
● One search with 4, one search with 16 terms
● Repeat each search 1000x
Live Experiment
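The three index layouts can be sketched as follows; the daily indices follow the `articles-daily-YYYYMMDD` naming shown above (one month of data, September 2015):

```python
from datetime import date, timedelta

def daily_indices(year, month, days):
    """Generate articles-daily-YYYYMMDD index names for the experiment."""
    start = date(year, month, 1)
    return [
        f"articles-daily-{(start + timedelta(days=i)):%Y%m%d}"
        for i in range(days)
    ]

layouts = {
    "daily, 5 shards each": daily_indices(2015, 9, 30),  # 30 indices = 150 shards
    "one index, 5 shards": ["articles-5"],
    "one index, 1 shard": ["articles-1"],
}

print(layouts["daily, 5 shards each"][0])   # articles-daily-20150901
print(layouts["daily, 5 shards each"][-1])  # articles-daily-20150930
```

Same documents, three layouts: the only variable is how many shards each search has to fan out to.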
18. ● Docker Compose Stack
● Signal’s 1M articles data set
● Scripts for indexing
● 2 searches around VW diesel
● Script to run 1000 searches
● metrics.py to collect stats
● On GitHub:
tinyurl.com/esperf-2018
Live Experiment
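The real `metrics.py` lives in the linked repo; as an illustrative stand-in, here is the kind of summary a stats collector might compute over 1000 repeated searches (percentile method is a simple nearest-rank sketch):

```python
def latency_stats(latencies_ms):
    """Summarise repeated-search latencies: mean, p50, p95, max."""
    xs = sorted(latencies_ms)
    n = len(xs)

    def pct(p):
        # Simple nearest-rank percentile over the sorted sample.
        return xs[min(n - 1, int(p / 100 * n))]

    return {"mean": sum(xs) / n, "p50": pct(50), "p95": pct(95), "max": xs[-1]}

# One slow outlier (120 ms) barely moves p50 but dominates p95/max.
print(latency_stats([12, 15, 11, 14, 120, 13, 12, 16, 13, 14]))
```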
21. ● “the default number of shards will change from [5] to [1] in 7.0.0”
● Huge shards are more efficient to search (50GB!)
● One shard per server!?
● Huge shards can be difficult to move/recover
● Multiple shards => parallel indexing/searching
● Replicas for failover and balancing load
● Consider bi-weekly/monthly/quarterly/yearly indices
Last words on shards...
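A back-of-envelope sizing check under the "huge shards are efficient, but cap them around 50 GB" guideline from the slide (the data-volume numbers below are made up for illustration):

```python
import math

def shards_needed(total_gb, max_shard_gb=50):
    """Minimum primary shard count to keep each shard under max_shard_gb."""
    return max(1, math.ceil(total_gb / max_shard_gb))

# Hypothetical: 15 months of news at ~40 GB/month is ~600 GB.
print(shards_needed(600))  # -> 12 primary shards
print(shards_needed(30))   # -> 1 (matches the 7.0 default)
```

More shards would parallelise indexing and searching further, at the cost of per-shard overhead; replicas then add failover and load balancing on top of the primaries.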
22. ● Metric counters are great to measure experiments
● Shards are expensive
● Terms too!
● Elasticsearch use cases are diverse - it depends!
Summary