Stop Exhausting Yourself in Operating Multiple Elasticsearch Clusters

Hokuto Kagaya
Development Center 2
Game Platform Service Development Office
PION C Team

WHAT'S PION?
In-game Community / Marketing Platform

WHAT’S Elasticsearch?
• As a time series DB
• As a search engine
• As a log store

WE HAVE MULTIPLE CLUSTERS FOR..
Purpose: Logging, Event Processing, Service
Environment: Development, Real, Sandbox

MULTIPLE CLUSTERS WILL CAUSE..
For example..
“Which plugins did I install on which clusters?”
Basically, our clusters are provisioned by Ansible.
BUT…
Someone: “Hey, let’s try the XXX plugin on the node YYY of the cluster ZZZ in the DEV environment!”
They forget to record XXX, YYY, ZZZ…
Things easily descend into chaos!
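One way to answer the "which plugins on which nodes" question without relying on manual records is to ask each cluster directly. The following is a minimal sketch, assuming the plain-text output of `GET /_cat/plugins?h=name,component,version` (one whitespace-separated row per node and plugin) and node names without spaces. The class `PluginInventory` and its method are hypothetical names for illustration, not part of Rubber Band.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of Rubber Band.
final class PluginInventory {

    /**
     * Parses the plain-text body of GET /_cat/plugins?h=name,component,version
     * and groups node names by plugin name.
     * Assumes one "node plugin version" row per line and no spaces in node names.
     */
    static Map<String, Set<String>> nodesByPlugin(String catPluginsBody) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (String line : catPluginsBody.split("\\R")) {
            String[] cols = line.trim().split("\\s+");
            if (cols.length < 2) {
                continue; // skip blank or malformed rows
            }
            // cols[0] is the node name, cols[1] is the plugin (component) name
            result.computeIfAbsent(cols[1], k -> new TreeSet<>()).add(cols[0]);
        }
        return result;
    }
}
```

Running this against every cluster and showing the results side by side would make a forgotten experimental plugin visible as a plugin that appears on only one node.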

EXISTING TOOLS
• ElasticHQ (OSS)
• Kibana (by Elastic)
• cerebro (OSS)
These tools are built for a single cluster.
One of ElasticHQ's strengths is that it can support multiple clusters.
OK, let’s use this!
However, its main purpose is also deep management of a single cluster, not browsing a cluster list.

ANOTHER PROBLEM WITH Elasticsearch
It is not easy to:
• monitor an Elasticsearch cluster
• alert us properly to an abnormal status based on the monitoring results
Kibana and many OSS tools are very nice, but:
• Some detailed metrics (like 95th-percentile latency) cannot be retrieved directly
• We cannot see them when Elasticsearch is under too heavy a load

COMPARISON
Kibana
• Multiple clusters? partial support (cross cluster search, dedicated separate cluster for monitoring)
• Monitoring? partial support (server side metrics)
• Alerting? ✔ (with Watcher)
ElasticHQ
• Multiple clusters? ✔ (not for browsing)
• Monitoring? partial support (server side metrics)
• Alerting? ✘
What we need
• Multiple clusters? ✔ (w/ high browsability)
• Monitoring? ✔
• Alerting? ✔
OK, let’s make it by ourselves!

RUBBER BAND - TOOLKIT FOR ES MANAGEMENT
(Screenshots)

Rubber Band UI, Health Watcher, Client - architecture

TWO OPTIONS FOR MONITORING
1. Monitor clusters’ states directly
• /_cat/***
• /_cluster/health
• /_nodes/***
2. Monitor client-side metrics
• can compute detailed metrics
• can access them even when a cluster is highly loaded (via our tool)
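As an illustration of the "detailed metrics" that the client side can compute, here is a minimal sketch of a 95th-percentile latency calculation over latencies recorded by the client. It uses the nearest-rank definition of a percentile, which is an assumption; the class `LatencyPercentile` is a hypothetical name and is not taken from Rubber Band.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Rubber Band.
final class LatencyPercentile {

    /**
     * Nearest-rank percentile: the smallest recorded latency such that
     * at least p percent of all samples are less than or equal to it.
     */
    static long percentile(long[] latenciesMillis, double p) {
        if (latenciesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = latenciesMillis.clone(); // do not reorder the caller's array
        Arrays.sort(sorted);
        // 1-based rank; multiplying before dividing avoids rounding errors such as 0.95 * 100
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }
}
```

For example, `LatencyPercentile.percentile(samples, 95)` returns the 95th-percentile latency of `samples`. In practice a metrics library can do this for you: Micrometer timers, for instance, can publish percentiles when built with `Timer.builder(...).publishPercentiles(0.95)`.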

HOW TO ALERT ON A CLUSTER STATUS?
The X-Pack Gold license supports Watcher, which can also be used to check the cluster health out of the box!
{
  "trigger" : {
    "schedule" : { "interval" : "10s" }
  },
  "input" : {
    "http" : {
      "request" : {
        "host" : "localhost",
        "port" : 9200,
        "path" : "/_cluster/health"
      }
    }
  }
}
It uses the cluster health API!
We can also utilize it by ourselves :)
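Calling the cluster health API by ourselves can be sketched as follows. This is a minimal illustration, not the actual Rubber Band Health Watcher: the class `ClusterHealthCheck`, the timeouts, and the policy of alerting on any status other than green are all assumptions. It needs Java 11 or later for `java.net.http`, and it extracts the `status` field with a regular expression only to stay dependency-free; a JSON library would be more robust.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the actual Rubber Band Health Watcher.
final class ClusterHealthCheck {

    // The cluster health response carries a "status" of green, yellow, or red.
    private static final Pattern STATUS =
            Pattern.compile("\"status\"\\s*:\\s*\"(green|yellow|red)\"");

    /** Fetches the raw JSON body of GET {baseUrl}/_cluster/health. */
    static String fetchHealth(String baseUrl) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(3)) // assumed timeout
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/_cluster/health"))
                .timeout(Duration.ofSeconds(5)) // assumed timeout
                .GET()
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    /** Returns green, yellow, or red, or "unknown" if no status field is found. */
    static String parseStatus(String healthJson) {
        Matcher m = STATUS.matcher(healthJson);
        return m.find() ? m.group(1) : "unknown";
    }

    /** Assumed policy: anything other than green is worth an alert. */
    static boolean shouldAlert(String status) {
        return !"green".equals(status);
    }
}
```

A scheduler could call `fetchHealth("http://localhost:9200")` every 10 seconds, mirroring the watch's interval, and notify the team when `shouldAlert(parseStatus(body))` is true. A failed or timed-out request is itself a useful signal that the cluster is unreachable or overloaded.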

EXAMPLES OF ALERT FROM HEALTH WATCHER

MILESTONE
PHASE 1
• Rubber Band UI
• Rubber Band Health Watcher
• Rubber Band Client (Simple REST client wrapper)
PHASE 2
• Rubber Band Curator (Centralized wrapper of curator)
• Open to the other internal teams
PHASE 3
• Publish it as OSS

KEY TAKEAWAYS
How can we manage multiple clusters without any chaos?
• Our toolkit: Rubber Band
• A simple UI with information aggregation and appropriate delegation
How can we do proper monitoring and alerting?
• Use both direct server states and client-side metrics
• Implement a simple health-check server by ourselves
And..

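import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.stereotype.Component;

@Component
public class ElasticsearchClientWrapper {
    private final RestHighLevelClient elasticsearchClient;
    private final MeterRegistry meterRegistry;

    public ElasticsearchClientWrapper(RestHighLevelClient elasticsearchClient,
                                      MeterRegistry meterRegistry) {
        this.elasticsearchClient = elasticsearchClient;
        this.meterRegistry = meterRegistry;
    }

    public void searchAndGetAggregationAsync(SearchRequest searchRequest) {
        // Start measuring the client-side latency of this request
        Timer.Sample sample = Timer.start(meterRegistry);
        // Older client signature; newer client versions take RequestOptions as the second argument
        elasticsearchClient.searchAsync(searchRequest, new ActionListener<SearchResponse>() {
            @Override
            public void onResponse(SearchResponse searchResponse) {
                // Micrometer tags are key/value pairs; the "result" key is an assumed name
                sample.stop(meterRegistry.timer("metrics.timer", "result", "success"));
                // do stuff..
            }

            @Override
            public void onFailure(Exception e) {
                sample.stop(meterRegistry.timer("metrics.timer", "result", "failure"));
                // do fallback..
            }
        });
    }
}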
Wrap the official HighLevelRESTClient
See also: Elasticsearch を検索エンジンとして利用する際のポイント (Key points when using Elasticsearch as a search engine)
https://engineering.linecorp.com/ja/blog/detail/99
