Tuning Solr for Logs 
Radu Gheorghe 
@radu0gheorghe @sematext
/me does... 
.com/logsene 
search consulting + Logsene = logging consulting
Tuning. Is it worth it? 
                  baseline    last run
# of logs         10M         310M
EC2 bill/month    700         450
What to optimize for? 
capacity: how many logs the same hardware can keep while still providing decent performance
http://www.seasonslogs.co.uk/images/products/SL_001.png 
https://openclipart.org/image/300px/svg_to_png/169833/Server_1U.png
What's decent performance? “It depends” 
Assumptions 
indexing: enough to keep up with generated logs* 
search concurrency 
search latency: 2s for debug queries, 5s for charts 
*account for spikes!
Enough theory, let's start testing! 
Solr instance 
m3.2xlarge (8CPU, 30GB RAM, 2x80GB SSD) 
Solr 4.10.1 
Feeder instance 
c3.2xlarge (8CPU, 15GB RAM, 2x80GB SSD) 
apache access logs 
python script to parse and feed them
Baseline test 
15GB heap 
debug query 
status:404 in the last hour 
charts query 
all time status counters 
all time top IPs 
user agent word cloud 
http://blog.sematext.com/2013/12/19/getting-started-with-logstash/
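As a rough sketch, the baseline queries above could be expressed as Solr request parameters along these lines. The field names (`status`, `timestamp`, `ip`, `user_agent`) and the `rows` values are assumptions, since the slides do not show the schema or the exact requests; `q`, `fq`, `rows`, `facet` and `facet.field` are standard Solr parameters.

```python
from urllib.parse import urlencode

# Field names below are assumptions; the slides do not show the schema.

def debug_query_params():
    """Debug query: status:404 in the last hour, returning a few documents."""
    return [
        ("q", "status:404"),
        ("fq", "timestamp:[NOW-1HOUR TO NOW]"),
        ("rows", "10"),
    ]

def charts_query_params():
    """Charts query: all-time status counters, top IPs, user agent terms."""
    return [
        ("q", "*:*"),
        ("rows", "0"),  # charts only need facet counts, not documents
        ("facet", "true"),
        ("facet.field", "status"),
        ("facet.field", "ip"),
        ("facet.field", "user_agent"),
    ]

def to_query_string(params):
    """URL-encode a list of (name, value) pairs, keeping repeated names."""
    return urlencode(params)

if __name__ == "__main__":
    print(to_query_string(debug_query_params()))
    print(to_query_string(charts_query_params()))
```

The charts query is the expensive one here: it facets over every document in the index, which fits the later observation that facets eat CPU.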
Baseline result
[Chart: EPS plus debug and charts queries, x-axis from 100K to 10M logs]
capacity: 10M logs
bottleneck: facets eat CPU
on average, CPU is OK
indexing limited because the Python script eats feeder CPU
Indexing throughput: is it enough?
“it depends”
how long do you keep your logs?
1M logs/day * 10 days vs. 0.3M logs/day * 30 days: both need 10M capacity
1M logs/day * 30 days? Needs 3 servers, each getting 0.3M logs/day
how big are your spikes? (assumption: 10x regular load)
Baseline run: 10M index fills up in <1/2h at 7K EPS
7K EPS is enough for 10M capacity if you keep logs >5h
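The arithmetic behind these numbers can be sketched as follows. The function names are made up for illustration, and `min_retention_hours` is one reading of the last bullet: if spikes are 10x the regular load, the regular load can be at most one tenth of the maximum indexing rate.

```python
import math

def fill_time_hours(capacity_logs, eps):
    """Hours needed to fill an index of capacity_logs at a sustained EPS."""
    return capacity_logs / eps / 3600

def required_capacity(logs_per_day, retention_days):
    """Number of logs that must be kept searchable at any time."""
    return logs_per_day * retention_days

def servers_needed(logs_per_day, retention_days, capacity_per_server):
    """Servers needed when each one can hold capacity_per_server logs."""
    total = required_capacity(logs_per_day, retention_days)
    return math.ceil(total / capacity_per_server)

def min_retention_hours(capacity_logs, max_eps, spike_factor):
    """Shortest retention for which max_eps still absorbs the spikes.

    Regular load may be at most max_eps / spike_factor, so the index
    takes at least this long to fill.
    """
    regular_eps = max_eps / spike_factor
    return fill_time_hours(capacity_logs, regular_eps)

if __name__ == "__main__":
    print(fill_time_hours(10_000_000, 7_000))           # under half an hour
    print(servers_needed(1_000_000, 30, 10_000_000))    # 3 servers
    print(min_retention_hours(10_000_000, 7_000, 10))   # roughly 4 hours
```

Under these assumptions the break-even retention comes out at roughly 4 hours, so the slide's ">5h" leaves a little headroom.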
Rare commits
[Chart: EPS plus debug and charts queries, x-axis from 1.5M to 11M logs]
10% above baseline
auto soft commits every 5 seconds
auto hard commits every 30 minutes
RAMBufferSize=200MB; maxBufferedDocs=10M
Same results with 
even rarer commits (auto-soft every 30s, 500MB buffer) 
omitNorms + omitTermFreqAndPositions 
larger caches 
cache autowarming 
THP disabled 
mergeFactor 5 
mergeFactor 20 
but indexing was cheaper
manually ran queries, too
DocValues on IP and status code
[Chart: EPS plus debug and charts queries, x-axis from 1.5M to 12M logs]
20% above baseline
Detour: what if user agent was string?
[Chart: EPS plus debug and charts queries, x-axis from 3M to 36M logs]
3.6x baseline
… and if user agent used DocValues?
[Chart: EPS plus debug and charts queries, x-axis from 8M to 70.5M logs]
6.7x baseline
reducing indexing adds 5% capacity
Time based collections (1 minute)
[Chart: EPS plus debug and charts queries, x-axis from 3M to 28M logs]
OOM (150 collections)
2.7x baseline
Time based collections (10 minutes)
[Chart: EPS plus debug and charts queries, x-axis from 10M to 213M logs]
still OOM (~100 collections)
21x baseline
10min collections: 20GB heap; optimize old
[Chart: EPS plus debug and charts queries, x-axis from 50M to 340M logs]
31x baseline, 5 days projected retention with 10x spikes
no more OOM, just slower queries
34x baseline, 10 days projected retention (10x)
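A minimal sketch of how time-based collections might be addressed, assuming 10-minute buckets as on the slides. The `logs_` prefix and the naming scheme are hypothetical; the slides do not say how the collections were named.

```python
from datetime import datetime, timedelta

BUCKET_MINUTES = 10  # matches the 10-minute collections above

def floor_to_bucket(ts):
    """Round a timestamp down to the start of its bucket."""
    minute = ts.minute - ts.minute % BUCKET_MINUTES
    return ts.replace(minute=minute, second=0, microsecond=0)

def collection_for(ts, prefix="logs"):
    """Name of the collection a log with timestamp ts would be indexed into."""
    return f"{prefix}_{floor_to_bucket(ts):%Y%m%d_%H%M}"

def collections_between(start, end, prefix="logs"):
    """Names of all collections a query over [start, end] has to touch."""
    names = []
    bucket = floor_to_bucket(start)
    while bucket <= end:
        names.append(collection_for(bucket, prefix))
        bucket += timedelta(minutes=BUCKET_MINUTES)
    return names

if __name__ == "__main__":
    now = datetime(2014, 11, 7, 12, 34, 56)
    print(collection_for(now))
    print(collections_between(now - timedelta(hours=1), now))
```

With this layout a "last hour" debug query only touches a handful of small collections, while indexing always goes to the newest one, so older collections stop changing and can be optimized.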
Software optimizations recap
Definitely worth it: time-based collections; DocValues; rare soft commits
Nice to have: noop I/O scheduler; optimize “old” collections
I wouldn't bother: merge policy tuning; omit norms, term frequencies and positions; autowarm; super-rare soft commits; disable THP
r3.2xlarge: +30GB RAM, +$0.14/h, 1x160GB SSD
[Chart: EPS plus debug and charts queries, x-axis from 20M to 372M logs]
less indexing throughput than m3.2xlarge
37x baseline, 9 days projected retention with 10x spikes
c3.2xlarge: -15GB RAM, -$0.14/h
[Chart: EPS plus debug and charts queries, x-axis from 20M to 177M logs]
17x baseline, 5 days projected retention with 10x spikes
Monthly EC2 cost per 1M logs* 
m3.2xlarge: $1.3 
r3.2xlarge: $1.33 
c3.2xlarge: $1.78 
TODO (a.k.a. truth always messes with simplicity): 
more/expensive facets => more CPU => c3 looks better 
less/cheap facets => not enough instance storage 
=> EBS (magnetic/SSD/provisioned IOPS)? 
=> storage-optimized i2? 
=> old-gen instances with magnetic instance storage? 
use different instance types for “hot” and “cold” collections? 
*on-demand pricing at 2014-11-07
How NOT to build an indexing pipeline 
custom script: 
reads apache logs from files 
parses them using regex 
takes 100% CPU and 100% RAM 
from a c3.2xlarge instance 
maxes out at 7K EPS
Enter Apache Flume*
# source
agent.sources = spoolSrc
agent.sources.spoolSrc.type = spooldir
agent.sources.spoolSrc.spoolDir = /var/log
agent.sources.spoolSrc.channels = solrChannel
# channel
agent.channels = solrChannel
agent.channels.solrChannel.type = file
# sink (put Solr and Morphline jars in lib/)
agent.sinks = solrSink
agent.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.solrSink.channel = solrChannel
agent.sinks.solrSink.morphlineFile = conf/morphline.conf
agent.sinks.solrSink.morphlineId = 1
*Or Logstash. Or rsyslog. Or syslog-ng. Or any other specialized event processing tool
morphline.conf (think Unix pipes)
morphlines : [
  {
    # same ID as in the flume.conf sink definition
    id : 1
    commands : [
      # process one line at a time (there's also readMultiLine)
      { readLine { charset : UTF-8 } }
      {
        grok {
          # https://github.com/cloudera/search/blob/master/samples/solr-nrt/grok-dictionaries/grok-patterns
          dictionaryFiles : [conf/grok-patterns]
          # parses each property (eg: IP, status code) in its own field
          expressions : {
            message : """%{COMBINEDAPACHELOG}"""
          }
        }
      }
      # Solr can do it, too*
      { generateUUID { field : id } }
      {
        loadSolr {
          solrLocator : {
            collection : collection1
            # use zkHost for SolrCloud
            solrUrl : "http://10.233.54.118:8983/solr/"
          }
        }
      }
    ]
  }
]
*http://solr.pl/en/2013/07/08/automatically-generate-document-identifiers-solr-4-x/
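To make the "Unix pipes" analogy concrete, here is a small Python sketch that chains the same stages as generators: read a line, parse it, add an ID. It is not how morphlines are implemented. The regex is a simplified stand-in for the real COMBINEDAPACHELOG grok pattern, and the final loadSolr stage is left out.

```python
import re
import uuid

# Simplified stand-in for grok's COMBINEDAPACHELOG (an assumption, not the real pattern).
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

SAMPLE = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)

def read_lines(lines):
    """readLine: wrap each raw line in a record."""
    for line in lines:
        yield {"message": line.rstrip("\n")}

def grok(records):
    """grok: put each property (IP, status code, ...) in its own field."""
    for record in records:
        match = COMBINED.match(record["message"])
        if match:  # lines that do not match are dropped in this sketch
            record.update(match.groupdict())
            yield record

def generate_uuid(records):
    """generateUUID: add a unique id field to each record."""
    for record in records:
        record["id"] = str(uuid.uuid4())
        yield record

def pipeline(lines):
    """Chain the stages, each consuming the previous one's output."""
    return generate_uuid(grok(read_lines(lines)))

if __name__ == "__main__":
    for doc in pipeline([SAMPLE]):
        print(doc["clientip"], doc["response"], doc["id"])
```

Each stage only sees the output of the previous one, which is what makes it easy to swap a command (for example, replacing the parsing step) without touching the rest.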
Result: 2.4K EPS, feeder machine almost idle
2.4K EPS is typically enough for this
[Diagram: three application servers, each with its own Flume agent]
scales nicely with # of servers, but all buffering and processing is done here (on the application servers)
but not for this
[Diagram: three application servers, each with its own Flume agent, feeding two central Flume agents]
centralized buffering and processing
or this
[Diagram: three application servers, each with its own Flume agent, feeding three further Flume agents]
buffer, then process (separately)
Increase throughput: batch sizes; memory channel
flume.conf:
agent.sources = spoolSrc
agent.sources.spoolSrc.type = spooldir
agent.sources.spoolSrc.spoolDir = /var/log
agent.sources.spoolSrc.batchSize = 5000
agent.sources.spoolSrc.channels = solrChannel
# make sure you have enough heap
agent.channels = solrChannel
# was: file
agent.channels.solrChannel.type = memory
agent.channels.solrChannel.capacity = 1000000
agent.channels.solrChannel.transactionCapacity = 5000
agent.sinks = solrSink
agent.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.solrSink.channel = solrChannel
agent.sinks.solrSink.morphlineFile = conf/morphline.conf
agent.sinks.solrSink.morphlineId = 1
agent.sinks.solrSink.batchSize = 5000
morphline.conf (inside loadSolr):
solrLocator : {
  collection : collection1
  solrUrl : "http://10.233.54.118:8983/solr/"
  batchSize : 5000
}
Result: 10K EPS, 6% CPU usage (2x baseline)
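The idea behind the batch size settings can be sketched in a few lines of Python: group events so that each request carries thousands of them instead of one. The `send` callable is hypothetical and stands for whatever ships one batch (for example, a single update request to Solr).

```python
from itertools import islice

def batches(events, batch_size=5000):
    """Yield lists of up to batch_size events from any iterable."""
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

def send_all(events, send, batch_size=5000):
    """Call send(batch) once per batch and return the number of calls made."""
    calls = 0
    for batch in batches(events, batch_size):
        send(batch)  # `send` is a hypothetical callable supplied by the caller
        calls += 1
    return calls

if __name__ == "__main__":
    sent = []
    print(send_all(range(12000), sent.append))  # number of batches sent
    print([len(batch) for batch in sent])
```

Larger batches amortize the per-request overhead, at the cost of holding more events in memory at once, which is why the slide pairs them with a warning about heap.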
More throughput? Parallelize
Depends* on the bottleneck
source: more threads (if applicable); more sources
channel: multiplexing channel selector
sink: more threads (if applicable); load balancing sink processor
[Diagrams: one source vs. two sources feeding channel C1; one source feeding channels C1 and C2; channel C1 feeding one sink vs. two sinks]
*last time I use this word, I promise
Result: default Solr install maxed out at 24K EPS
TODO: log in JSON where you can 
Then, in morphline.conf, replace the grok command with the much lighter:
readJson {}
Easy with apache logs, maybe not for other apps:
LogFormat "{ \
  \"@timestamp\": \"%{%Y-%m-%dT%H:%M:%S%z}t\", \
  \"message\": \"%h %l %u %t \\\"%r\\\" %>s %b\", \
  ...
  \"method\": \"%m\", \
  \"referer\": \"%{Referer}i\", \
  \"useragent\": \"%{User-agent}i\" \
}" ls_apache_json
CustomLog /var/log/apache2/logstash_test.ls_json ls_apache_json
More details at: 
http://untergeek.com/2013/09/11/getting-apache-to-output-json-for-logstash-1-2-x/
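To illustrate why JSON logs are cheaper to process, here is a minimal Python sketch: parsing becomes a single library call instead of a hand-maintained regex. The sample line is made up and only uses field names from the LogFormat above.

```python
import json

# Made-up sample line using field names from the LogFormat above.
SAMPLE = (
    '{"@timestamp": "2014-11-07T12:34:56+0000", "method": "GET", '
    '"referer": "-", "useragent": "curl/7.38.0"}'
)

def parse_json_log(line):
    """Turn one JSON-formatted log line into a dict, or None if it is invalid."""
    try:
        return json.loads(line)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None

if __name__ == "__main__":
    doc = parse_json_log(SAMPLE)
    print(doc["method"], doc["useragent"])
```

Every field arrives already named and separated, so there is no pattern to keep in sync with the log format.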
Conclusions 
Use time-based collections and DocValues 
Rare soft&hard commits are good 
Pushing them too far is probably not worth it 
Hardware: test and see what works for you 
A balanced, SSD-backed machine (like m3) is a good start 
Use specialized event processing tools 
Apache Flume is a fine example 
Processing and buffering on the application server side scales better 
Buffer before [heavy] processing 
Mind your batch sizes, buffer types and parallelization 
Log in JSON where you can
Thank you! 
Feel free to poke me @radu0gheorghe 
Check us out at the booth, sematext.com and @sematext 
We're hiring, too!

Mais conteúdo relacionado

Mais procurados

Elasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep diveElasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep diveSematext Group, Inc.
 
ELK stack at weibo.com
ELK stack at weibo.comELK stack at weibo.com
ELK stack at weibo.com琛琳 饶
 
From zero to hero - Easy log centralization with Logstash and Elasticsearch
From zero to hero - Easy log centralization with Logstash and ElasticsearchFrom zero to hero - Easy log centralization with Logstash and Elasticsearch
From zero to hero - Easy log centralization with Logstash and ElasticsearchRafał Kuć
 
Data Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageData Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageSATOSHI TAGOMORI
 
Centralized + Unified Logging
Centralized + Unified LoggingCentralized + Unified Logging
Centralized + Unified LoggingGabor Kozma
 
Mасштабирование микросервисов на Go, Matt Heath (Hailo)
Mасштабирование микросервисов на Go, Matt Heath (Hailo)Mасштабирование микросервисов на Go, Matt Heath (Hailo)
Mасштабирование микросервисов на Go, Matt Heath (Hailo)Ontico
 
Logstash family introduction
Logstash family introductionLogstash family introduction
Logstash family introductionOwen Wu
 
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종NAVER D2
 
JRuby with Java Code in Data Processing World
JRuby with Java Code in Data Processing WorldJRuby with Java Code in Data Processing World
JRuby with Java Code in Data Processing WorldSATOSHI TAGOMORI
 
Perl Memory Use 201209
Perl Memory Use 201209Perl Memory Use 201209
Perl Memory Use 201209Tim Bunce
 
Application Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.keyApplication Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.keyTim Bunce
 
{{more}} Kibana4
{{more}} Kibana4{{more}} Kibana4
{{more}} Kibana4琛琳 饶
 
Perl Memory Use - LPW2013
Perl Memory Use - LPW2013Perl Memory Use - LPW2013
Perl Memory Use - LPW2013Tim Bunce
 
Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...
Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...
Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...Data Con LA
 
Logstash-Elasticsearch-Kibana
Logstash-Elasticsearch-KibanaLogstash-Elasticsearch-Kibana
Logstash-Elasticsearch-Kibanadknx01
 
Dive into Fluentd plugin v0.12
Dive into Fluentd plugin v0.12Dive into Fluentd plugin v0.12
Dive into Fluentd plugin v0.12N Masahiro
 
Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker
Running High Performance & Fault-tolerant Elasticsearch Clusters on DockerRunning High Performance & Fault-tolerant Elasticsearch Clusters on Docker
Running High Performance & Fault-tolerant Elasticsearch Clusters on DockerSematext Group, Inc.
 
Building a High-Performance Distributed Task Queue on MongoDB
Building a High-Performance Distributed Task Queue on MongoDBBuilding a High-Performance Distributed Task Queue on MongoDB
Building a High-Performance Distributed Task Queue on MongoDBMongoDB
 

Mais procurados (20)

Elasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep diveElasticsearch for Logs & Metrics - a deep dive
Elasticsearch for Logs & Metrics - a deep dive
 
ELK stack at weibo.com
ELK stack at weibo.comELK stack at weibo.com
ELK stack at weibo.com
 
Docker Monitoring Webinar
Docker Monitoring  WebinarDocker Monitoring  Webinar
Docker Monitoring Webinar
 
From zero to hero - Easy log centralization with Logstash and Elasticsearch
From zero to hero - Easy log centralization with Logstash and ElasticsearchFrom zero to hero - Easy log centralization with Logstash and Elasticsearch
From zero to hero - Easy log centralization with Logstash and Elasticsearch
 
Data Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby UsageData Analytics Service Company and Its Ruby Usage
Data Analytics Service Company and Its Ruby Usage
 
Centralized + Unified Logging
Centralized + Unified LoggingCentralized + Unified Logging
Centralized + Unified Logging
 
Mасштабирование микросервисов на Go, Matt Heath (Hailo)
Mасштабирование микросервисов на Go, Matt Heath (Hailo)Mасштабирование микросервисов на Go, Matt Heath (Hailo)
Mасштабирование микросервисов на Go, Matt Heath (Hailo)
 
Logstash family introduction
Logstash family introductionLogstash family introduction
Logstash family introduction
 
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종
[2C1] 아파치 피그를 위한 테즈 연산 엔진 개발하기 최종
 
JRuby with Java Code in Data Processing World
JRuby with Java Code in Data Processing WorldJRuby with Java Code in Data Processing World
JRuby with Java Code in Data Processing World
 
Using Logstash, elasticsearch & kibana
Using Logstash, elasticsearch & kibanaUsing Logstash, elasticsearch & kibana
Using Logstash, elasticsearch & kibana
 
Perl Memory Use 201209
Perl Memory Use 201209Perl Memory Use 201209
Perl Memory Use 201209
 
Application Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.keyApplication Logging in the 21st century - 2014.key
Application Logging in the 21st century - 2014.key
 
{{more}} Kibana4
{{more}} Kibana4{{more}} Kibana4
{{more}} Kibana4
 
Perl Memory Use - LPW2013
Perl Memory Use - LPW2013Perl Memory Use - LPW2013
Perl Memory Use - LPW2013
 
Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...
Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...
Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...
 
Logstash-Elasticsearch-Kibana
Logstash-Elasticsearch-KibanaLogstash-Elasticsearch-Kibana
Logstash-Elasticsearch-Kibana
 
Dive into Fluentd plugin v0.12
Dive into Fluentd plugin v0.12Dive into Fluentd plugin v0.12
Dive into Fluentd plugin v0.12
 
Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker
Running High Performance & Fault-tolerant Elasticsearch Clusters on DockerRunning High Performance & Fault-tolerant Elasticsearch Clusters on Docker
Running High Performance & Fault-tolerant Elasticsearch Clusters on Docker
 
Building a High-Performance Distributed Task Queue on MongoDB
Building a High-Performance Distributed Task Queue on MongoDBBuilding a High-Performance Distributed Task Queue on MongoDB
Building a High-Performance Distributed Task Queue on MongoDB
 

Semelhante a Tuning Solr for Logs Performance

(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New FeaturesAmazon Web Services
 
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...MLconf
 
Adventures in RDS Load Testing
Adventures in RDS Load TestingAdventures in RDS Load Testing
Adventures in RDS Load TestingMike Harnish
 
Golang Performance : microbenchmarks, profilers, and a war story
Golang Performance : microbenchmarks, profilers, and a war storyGolang Performance : microbenchmarks, profilers, and a war story
Golang Performance : microbenchmarks, profilers, and a war storyAerospike
 
Super scaling singleton inserts
Super scaling singleton insertsSuper scaling singleton inserts
Super scaling singleton insertsChris Adkin
 
Deep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instancesDeep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instancesAmazon Web Services
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...Amazon Web Services
 
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...Amazon Web Services
 
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach Shoolman
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach ShoolmanRedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach Shoolman
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach ShoolmanRedis Labs
 
AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...
AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...
AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...Amazon Web Services
 
AWS re:Invent 2016: [JK REPEAT] Deep Dive on Amazon EC2 Instances, Featuring ...
AWS re:Invent 2016: [JK REPEAT] Deep Dive on Amazon EC2 Instances, Featuring ...AWS re:Invent 2016: [JK REPEAT] Deep Dive on Amazon EC2 Instances, Featuring ...
AWS re:Invent 2016: [JK REPEAT] Deep Dive on Amazon EC2 Instances, Featuring ...Amazon Web Services
 
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latency
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and LatencyOptimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latency
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and LatencyHenning Jacobs
 
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Henning Jacobs
 
Scaling an ELK stack at bol.com
Scaling an ELK stack at bol.comScaling an ELK stack at bol.com
Scaling an ELK stack at bol.comRenzo Tomà
 
Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...
Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...
Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...Henning Jacobs
 
Performance tuning jvm
Performance tuning jvmPerformance tuning jvm
Performance tuning jvmPrem Kuppumani
 
Gnocchi v3 brownbag
Gnocchi v3 brownbagGnocchi v3 brownbag
Gnocchi v3 brownbagGordon Chung
 

Semelhante a Tuning Solr for Logs Performance (20)

Deep Dive on Amazon EC2
Deep Dive on Amazon EC2Deep Dive on Amazon EC2
Deep Dive on Amazon EC2
 
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
(DAT402) Amazon RDS PostgreSQL:Lessons Learned & New Features
 
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
Alex Smola, Professor in the Machine Learning Department, Carnegie Mellon Uni...
 
Adventures in RDS Load Testing
Adventures in RDS Load TestingAdventures in RDS Load Testing
Adventures in RDS Load Testing
 
Deep Dive on Amazon EC2
Deep Dive on Amazon EC2Deep Dive on Amazon EC2
Deep Dive on Amazon EC2
 
Golang Performance : microbenchmarks, profilers, and a war story
Golang Performance : microbenchmarks, profilers, and a war storyGolang Performance : microbenchmarks, profilers, and a war story
Golang Performance : microbenchmarks, profilers, and a war story
 
Super scaling singleton inserts
Super scaling singleton insertsSuper scaling singleton inserts
Super scaling singleton inserts
 
Deep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instancesDeep Dive on Amazon EC2 instances
Deep Dive on Amazon EC2 instances
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
 
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
Flink Forward Berlin 2017: Robert Metzger - Keep it going - How to reliably a...
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
 
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach Shoolman
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach ShoolmanRedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach Shoolman
RedisConf17 - Doing More With Redis - Ofer Bengal and Yiftach Shoolman
 
AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...
AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...
AWS re:Invent 2016: Deep Dive on Amazon EC2 Instances, Featuring Performance ...
 
AWS re:Invent 2016: [JK REPEAT] Deep Dive on Amazon EC2 Instances, Featuring ...
AWS re:Invent 2016: [JK REPEAT] Deep Dive on Amazon EC2 Instances, Featuring ...AWS re:Invent 2016: [JK REPEAT] Deep Dive on Amazon EC2 Instances, Featuring ...
AWS re:Invent 2016: [JK REPEAT] Deep Dive on Amazon EC2 Instances, Featuring ...
 
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latency
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and LatencyOptimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latency
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latency
 
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
Optimizing Kubernetes Resource Requests/Limits for Cost-Efficiency and Latenc...
 
Scaling an ELK stack at bol.com
Scaling an ELK stack at bol.comScaling an ELK stack at bol.com
Scaling an ELK stack at bol.com
 
Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...
Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...
Ensuring Kubernetes Cost Efficiency across (many) Clusters - DevOps Gathering...
 
Performance tuning jvm
Performance tuning jvmPerformance tuning jvm
Performance tuning jvm
 
Gnocchi v3 brownbag
Gnocchi v3 brownbagGnocchi v3 brownbag
Gnocchi v3 brownbag
 

Mais de Lucidworks

Search is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce StrategySearch is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce StrategyLucidworks
 
Drive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in SalesforceDrive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in SalesforceLucidworks
 
How Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant ProductsHow Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant ProductsLucidworks
 
Lucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product DiscoveryLucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product DiscoveryLucidworks
 
Connected Experiences Are Personalized Experiences
Connected Experiences Are Personalized ExperiencesConnected Experiences Are Personalized Experiences
Connected Experiences Are Personalized ExperiencesLucidworks
 
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...Lucidworks
 
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...Lucidworks
 
Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020Lucidworks
 
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Lucidworks
 
AI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and RosetteAI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and RosetteLucidworks
 
The Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentThe Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentLucidworks
 
Webinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - EuropeWebinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - EuropeLucidworks
 
Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19Lucidworks
 
Applying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 ResearchApplying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 ResearchLucidworks
 
Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1Lucidworks
 
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyWebinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyLucidworks
 
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Lucidworks
 
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceApply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceLucidworks
 
Webinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise SearchWebinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise SearchLucidworks
 
Why Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondWhy Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondLucidworks
 

Tuning Solr for Logs Performance

  • 1.
  • 2. Tuning Solr for Logs Radu Gheorghe @radu0gheorghe @sematext
  • 3. /me does... .com/logsene search consulting + Logsene = logging consulting
  • 4. Tuning. Is it worth it? # of logs: 10M (baseline), 310M (last run). EC2 bill/month: 700 (baseline), 450 (last run).
  • 5. What to optimize for? capacity: how many logs the same hardware can keep while still providing decent performance http://www.seasonslogs.co.uk/images/products/SL_001.png https://openclipart.org/image/300px/svg_to_png/169833/Server_1U.png
  • 6. What's decent performance? “It depends” Assumptions indexing: enough to keep up with generated logs* search concurrency search latency: 2s for debug queries, 5s for charts *account for spikes!
  • 7. Enough theory, let's start testing! Solr instance m3.2xlarge (8CPU, 30GB RAM, 2x80GB SSD) Solr 4.10.1 Feeder instance c3.2xlarge (8CPU, 15GB RAM, 2x80GB SSD) apache access logs python script to parse and feed them
  • 8. Baseline test 15GB heap debug query status:404 in the last hour charts query all time status counters all time top IPs user agent word cloud http://blog.sematext.com/2013/12/19/getting-started-with-logstash/
  • 9. Baseline result [chart: debug, charts and EPS series; x-axis 100K to 10M logs; y-axis 0 to 12000]
  • 10. Baseline result [same chart] capacity
  • 11. Baseline result [same chart] capacity; bottleneck: facets eat CPU
  • 12. Baseline result [same chart] capacity; bottleneck: facets eat CPU; on average, CPU is OK
  • 13. Baseline result [same chart] capacity; bottleneck: facets eat CPU; on average, CPU is OK; indexing limited because the Python script eats feeder CPU
  • 14. Indexing throughput: is it enough? “it depends” how long do you keep your logs? 1M logs/day * 10 days <> 0.3M logs/day * 30 days. Both need 10M capacity 1M logs/day * 30 days? Needs 3 servers, each getting 0.3M logs/day Baseline run: 10M index fills up in <1/2h at 7K EPS
  • 15. Indexing throughput: is it enough? “it depends” how long do you keep your logs? 1M logs/day * 10 days <> 0.3M logs/day * 30 days. Both need 10M capacity 1M logs/day * 30 days? Needs 3 servers, each getting 0.3M logs/day how big are your spikes? (assumption: 10x regular load) 7K EPS is enough for 10M capacity if you keep logs >5h
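The capacity arithmetic on slides 14 and 15 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the inputs are the figures quoted on the slides (10M capacity per server, 7K EPS).

```python
import math


def fill_time_hours(capacity_logs: int, eps: float) -> float:
    """Hours needed to fill an index of `capacity_logs` at a steady `eps`."""
    return capacity_logs / eps / 3600


def servers_needed(logs_per_day: int, retention_days: int, capacity_logs: int) -> int:
    """Servers needed to hold `retention_days` worth of logs, rounded up."""
    total_logs = logs_per_day * retention_days
    return math.ceil(total_logs / capacity_logs)


if __name__ == "__main__":
    # Baseline run: a 10M index at 7K EPS fills in under half an hour.
    print(f"fill time: {fill_time_hours(10_000_000, 7_000):.2f} h")
    # 1M logs/day kept for 10 days and 0.3M logs/day kept for 30 days
    # both fit in one 10M server; 1M logs/day kept for 30 days needs three.
    print(servers_needed(1_000_000, 10, 10_000_000))
    print(servers_needed(300_000, 30, 10_000_000))
    print(servers_needed(1_000_000, 30, 10_000_000))
```

The same two functions can be reused with the larger capacities reported later in the deck to see how retention changes.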
  • 16. Rare commits: 10% above baseline. Auto soft commits every 5 seconds; auto hard commits every 30 minutes; RAMBufferSize=200MB; maxBufferedDocs=10M. [chart: charts, debug and EPS series; x-axis 1.5M to 11M logs; y-axis 0 to 8000]
  • 17. Same results with: even rarer commits (auto-soft every 30s, 500MB buffer); omitNorms + omitTermFreqAndPositions; larger caches; cache autowarming; THP disabled; mergeFactor 5; mergeFactor 20. Slide notes: "but indexing was cheaper"; "manually ran queries, too".
  • 18. DocValues on IP and status code: 20% above baseline. [chart: charts, debug and EPS series; x-axis 1.5M to 12M logs; y-axis 0 to 8000]
  • 19. Detour: what if user agent was string? 3.6x baseline. [chart: charts, debug and EPS series; x-axis 3M to 36M logs; y-axis 0 to 8000]
  • 20. … and if user agent used DocValues? 6.7x baseline; reducing indexing adds 5% capacity. [chart: charts, debug and EPS series; x-axis 8M to 70.5M logs; y-axis 0 to 8000]
  • 21. Time based collections (1 minute): 2.7x baseline; OOM (150 collections). [chart: charts, debug and EPS series; x-axis 3M to 28M logs; y-axis 0 to 35000]
  • 22. Time based collections (10 minutes): 21x baseline; still OOM (~100 collections). [chart: charts, debug and EPS series; x-axis 10M to 213M logs; y-axis 0 to 8000]
  • 23. 10min collections: 20GB heap; optimize old. 31x baseline, 5 days projected retention with 10x spikes; 34x baseline, 10 days projected retention (10x); no more OOM, just slower queries. [chart: charts, debug and EPS series; x-axis 50M to 340M logs; y-axis 0 to 8000]
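As a rough sketch of how time-based collections could be addressed from a client, the Python below maps a timestamp to a 10-minute collection and lists the collections a time-range query would need to touch. The `logs_YYYYMMDD_HHMM` naming scheme and the function names are assumptions made for illustration; the slides do not say how the collections were named.

```python
from datetime import datetime, timedelta
from typing import List

BUCKET = timedelta(minutes=10)  # collection granularity used on slides 22 and 23


def bucket_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 10-minute bucket."""
    return ts - timedelta(
        minutes=ts.minute % 10, seconds=ts.second, microseconds=ts.microsecond
    )


def collection_name(ts: datetime) -> str:
    """Hypothetical naming scheme: one collection per 10-minute bucket."""
    return bucket_start(ts).strftime("logs_%Y%m%d_%H%M")


def collections_for_range(start: datetime, end: datetime) -> List[str]:
    """Names of every collection overlapping the range [start, end]."""
    names = []
    current = bucket_start(start)
    while current <= end:
        names.append(collection_name(current))
        current += BUCKET
    return names


if __name__ == "__main__":
    now = datetime(2014, 11, 7, 12, 34, 56)
    print(collection_name(now))
    # A "last hour" debug query only has to touch a handful of collections.
    print(collections_for_range(now - timedelta(hours=1), now))
```

With a scheme like this, a query over the last hour touches a few small collections instead of the whole index, and old collections can be optimized or dropped as a unit.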
  • 24. Software optimizations recap Definitely worth it Nice to have I wouldn't bother time-based collections noop I/O scheduler merge policy tuning DocValues omit norms, term frequencies and positions autowarm rare soft commits optimize “old” collections super-rare soft commits disable THP
  • 25. r3.2xlarge: +30GB RAM, +$0.14/h, 1x160GB SSD. 37x baseline, 9 days projected retention with 10x spikes; less indexing throughput than m3.2xlarge. [chart: charts, debug and EPS series; x-axis 20M to 372M logs; y-axis 0 to 7000]
  • 26. c3.2xlarge: -15GB RAM, -$0.14/h. 17x baseline, 5 days projected retention with 10x spikes. [chart: charts, debug and EPS series; x-axis 20M to 177M logs; y-axis 0 to 9000]
  • 27. Monthly EC2 cost per 1M logs*: m3.2xlarge: $1.3; r3.2xlarge: $1.33; c3.2xlarge: $1.78. TODO (a.k.a. truth always messes with simplicity): more/expensive facets => more CPU => c3 looks better; less/cheap facets => not enough instance storage => EBS (magnetic/SSD/provisioned IOPS)? => storage-optimized i2? => old-gen instances with magnetic instance storage?; use different instance types for “hot” and “cold” collections? *on-demand pricing at 2014-11-07
  • 28. How NOT to build an indexing pipeline custom script: reads apache logs from files parses them using regex takes 100% CPU and 100% RAM from a c3.2xlarge instance maxes out at 7K EPS
  • 29. Enter Apache Flume*
        # source
        agent.sources = spoolSrc
        agent.sources.spoolSrc.type = spooldir
        agent.sources.spoolSrc.spoolDir = /var/log
        agent.sources.spoolSrc.channels = solrChannel
        # channel
        agent.channels = solrChannel
        agent.channels.solrChannel.type = file
        agent.sinks.solrSink.channel = solrChannel
        # sink (put Solr and Morphline jars in lib/)
        agent.sinks = solrSink
        agent.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
        agent.sinks.solrSink.morphlineFile = conf/morphline.conf
        agent.sinks.solrSink.morphlineId = 1
    *Or Logstash. Or rsyslog. Or syslog-ng. Or any other specialized event processing tool
  • 30. morphline.conf (think Unix pipes)
        morphlines : [
          {
            id : 1    # same ID as in the flume.conf sink definition
            commands : [
              # process one line at a time (there's also readMultiLine)
              { readLine { charset : UTF-8 } }
              # parses each property (eg: IP, status code) in its own field
              {
                grok {
                  # https://github.com/cloudera/search/blob/master/samples/solr-nrt/grok-dictionaries/grok-patterns
                  dictionaryFiles : [conf/grok-patterns]
                  expressions : { message : """%{COMBINEDAPACHELOG}""" }
                }
              }
              # Solr can do it, too*
              { generateUUID { field : id } }
              {
                loadSolr {
                  solrLocator : {
                    collection : collection1
                    solrUrl : "http://10.233.54.118:8983/solr/"    # use zkHost for SolrCloud
                  }
                }
              }
            ]
          }
        ]
    *http://solr.pl/en/2013/07/08/automatically-generate-document-identifiers-solr-4-x/
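To make the "think Unix pipes" analogy concrete, here is a minimal Python sketch that chains the same stages as generators: read a line, split it into fields, add an ID, hand it to a loader. It is not how Morphlines is implemented. The regular expression is a simplified stand-in for `%{COMBINEDAPACHELOG}`, the function names are hypothetical, and a plain list stands in for `loadSolr`.

```python
import re
import uuid
from typing import Dict, Iterable, Iterator, List

Record = Dict[str, str]

# Simplified stand-in for %{COMBINEDAPACHELOG}; an assumption, not the real grok pattern.
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def read_line(lines: Iterable[str]) -> Iterator[Record]:
    """Emit one record per input line (the readLine stage)."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def grok(records: Iterable[Record]) -> Iterator[Record]:
    """Split each message into named fields; this sketch drops non-matching lines."""
    for record in records:
        match = COMBINED.match(record["message"])
        if match:
            yield {**record, **match.groupdict()}


def generate_uuid(records: Iterable[Record]) -> Iterator[Record]:
    """Attach a random unique ID to every record (the generateUUID stage)."""
    for record in records:
        yield {**record, "id": str(uuid.uuid4())}


def run_pipeline(lines: Iterable[str]) -> List[Record]:
    """Chain the stages; the returned list stands in for loadSolr."""
    return list(generate_uuid(grok(read_line(lines))))


if __name__ == "__main__":
    sample = (
        '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
        '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
    )
    for doc in run_pipeline([sample]):
        print(doc["clientip"], doc["response"], doc["id"])
```

Each stage only sees the output of the previous one, which is what makes the commands in morphline.conf easy to reorder or swap.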
  • 31. Result: 2.4K EPS, feeder machine almost idle
  • 32. 2.4K EPS is typically enough for this: [diagram: three application servers, each running a Flume agent] scales nicely with # of servers, but all buffering and processing is done on the application servers
  • 33. but not for this: [diagram: three application servers, each running a Flume agent, plus two separate Flume agents] centralized buffering and processing
  • 34. or this: [diagram: three application servers, each running a Flume agent, plus three separate Flume agents] buffer, then process (separately)
  • 35. Increase throughput: batch sizes; memory channel
        agent.sources = spoolSrc
        agent.sources.spoolSrc.type = spooldir
        agent.sources.spoolSrc.spoolDir = /var/log
        agent.sources.spoolSrc.batchSize = 5000
        # make sure you have enough heap
        agent.sources.spoolSrc.channels = solrChannel
        agent.channels = solrChannel
        # was: file
        agent.channels.solrChannel.type = memory
        agent.channels.solrChannel.capacity = 1000000
        agent.channels.solrChannel.transactionCapacity = 5000
        agent.sinks.solrSink.channel = solrChannel
        agent.sinks = solrSink
        agent.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
        agent.sinks.solrSink.morphlineFile = conf/morphline.conf
        agent.sinks.solrSink.morphlineId = 1
        agent.sinks.solrSink.batchSize = 5000
        # in morphline.conf:
        solrLocator : {
          collection : collection1
          solrUrl : "http://10.233.54.118:8983/solr/"
          batchSize : 5000
        }
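The effect of a batch size is easy to see in a small Python sketch: events are grouped into fixed-size batches so that each request carries many documents instead of one. This is a generic illustration, not Flume's code. The `send` callable is a hypothetical stand-in for whatever ships a batch to Solr.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(events: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `batch_size` events."""
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def send_all(
    events: Iterable[str], batch_size: int, send: Callable[[List[str]], None]
) -> int:
    """Send every event in batches and return the number of requests made."""
    requests = 0
    for batch in batched(events, batch_size):
        send(batch)  # hypothetical sender, e.g. one HTTP update request per batch
        requests += 1
    return requests


if __name__ == "__main__":
    events = (f"event {i}" for i in range(12_000))
    sizes: List[int] = []
    count = send_all(events, 5_000, lambda batch: sizes.append(len(batch)))
    print(count, sizes)
```

Larger batches mean fewer round trips, at the cost of holding more events in memory at once, which is why the slide pairs the bigger batch sizes with a note about heap.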
  • 36. Result: 10K EPS, 6%CPU usage (2x baseline)
  • 37. More throughput? Parallelize. Depends* on the bottleneck. Source: more threads (if applicable); more sources. Channel: multiplexing channel selector. Sink: more threads (if applicable); load balancing sink processor. [diagrams: sources, channels (C1, C2) and sinks wired in different combinations] *last time I use this word, I promise
  • 38. Result: default Solr install maxed out at 24K EPS
  • 39. TODO: log in JSON where you can. Then, in morphline.conf, replace the grok command with the much lighter: readJson {}
    Easy with apache logs, maybe not for other apps:
        LogFormat "{ \"@timestamp\": \"%{%Y-%m-%dT%H:%M:%S%z}t\", \"message\": \"%h %l %u %t \\\"%r\\\" %>s %b\", ... \"method\": \"%m\", \"referer\": \"%{Referer}i\", \"useragent\": \"%{User-agent}i\" }" ls_apache_json
        CustomLog /var/log/apache2/logstash_test.ls_json ls_apache_json
    More details at: http://untergeek.com/2013/09/11/getting-apache-to-output-json-for-logstash-1-2-x/
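A short Python sketch shows why JSON logs are cheaper to process: each line is already structured, so a standard JSON parser replaces the regular expression. The field names follow the LogFormat on the slide; the sample values and the `parse_json_log` helper are made up for illustration.

```python
import json
from typing import Optional


def parse_json_log(line: str) -> Optional[dict]:
    """Parse one JSON log line; return None if the line is not a JSON object."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return None
    return event if isinstance(event, dict) else None


if __name__ == "__main__":
    # Hypothetical line in the shape produced by the ls_apache_json format.
    sample = (
        '{"@timestamp": "2014-11-07T12:34:56+0000", "method": "GET", '
        '"referer": "-", "useragent": "curl/7.35.0"}'
    )
    event = parse_json_log(sample)
    print(event["method"], event["useragent"])
```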
  • 40. Conclusions: Use time-based collections and DocValues. Rare soft & hard commits are good; pushing them too far is probably not worth it. Hardware: test and see what works for you; a balanced, SSD-backed machine (like m3) is a good start. Use specialized event processing tools; Apache Flume is a fine example. Processing and buffering on the application server side scales better. Buffer before [heavy] processing. Mind your batch sizes, buffer types and parallelization. Log in JSON where you can.
  • 41. Thank you! Feel free to poke me @radu0gheorghe Check us out at the booth, sematext.com and @sematext We're hiring, too!