Sematext engineer Rafal Kuc (@kucrafal) walks through the details of running high-performance, fault-tolerant Elasticsearch clusters on Docker. Topics include containers vs. virtual machines, running the official Elasticsearch container, container constraints, good network practices, dealing with storage, data-only Docker volumes, scaling, time-based data, multiple tiers and tenants, indexing with and without routing, querying with and without routing, a routing vs. no-routing comparison, and monitoring. The talk was delivered at DevOps Days Warsaw 2015.
22. Containers vs Virtual Machines
Traditional Virtual Machine (stack, bottom to top)
Hardware
Host Operating System
Hypervisor
Guest OS | Guest OS
Libraries | Libraries
Application 1 | Application 2
Container (stack, bottom to top)
Hardware
Host Operating System
Docker Engine
Libraries | Libraries
Application 1 | Application 2
39. Dealing With Network
$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch
$ docker run -d elasticsearch \
    -Dnetwork.publish_host=192.168.1.1
$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch \
    -Dnetwork.publish_host=192.168.1.1
$ docker run -d -p 9200:9200 -p 9300:9300 elasticsearch \
    -Dnetwork.publish_host=0.0.0.0
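The commands above differ in two things: whether ports 9200 and 9300 are published, and which address the node advertises. As a rough sketch, assuming the goal is a node that both publishes its ports and advertises an address other nodes can reach, the helper below prints (it does not run) such a command for a given host IP. The function name es_run_cmd is hypothetical; the flags are the ones shown on the slide.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the docker command for one node.
# $1 is assumed to be an IP address of the Docker host that other
# Elasticsearch nodes can reach.
es_run_cmd() {
    host_ip="$1"
    printf 'docker run -d -p 9200:9200 -p 9300:9300 elasticsearch -Dnetwork.publish_host=%s\n' "$host_ip"
}

# Example with the address used on the slide.
es_run_cmd 192.168.1.1
```

Printing the command first makes it easy to review the advertised address before starting a container.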
43. Network - Good Practices
Separate network for Elasticsearch cluster
Common host names for containers
$ docker run -d -h es_node_1 elasticsearch
Expose 9200 & 9300 ports only for client nodes
Elasticsearch data & client nodes point to masters only
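One way to point data and client nodes at the masters only is unicast discovery. The sketch below assumes Elasticsearch 1.x/2.x, where the setting is discovery.zen.ping.unicast.hosts, and passes it in the same -D style as the other commands in this deck. The master host names and the function name es_node_cmd are made up for illustration. The command is printed, not run.

```shell
#!/bin/sh
# Hypothetical master host names; replace with your own.
MASTERS="es_master_1,es_master_2,es_master_3"

# Print (do not run) the command for a data or client node that only
# lists the master nodes as discovery targets.
# $1 is the container host name, set with -h as on the slide.
es_node_cmd() {
    node_name="$1"
    printf 'docker run -d -h %s elasticsearch -Ddiscovery.zen.ping.unicast.hosts=%s\n' \
        "$node_name" "$MASTERS"
}

es_node_cmd es_data_1
```

Keeping the master list in one variable means every data and client node gets the same discovery targets.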
47. Dealing With Storage
$ docker run -d \
    -v /opt/elasticsearch/data:/usr/share/elasticsearch/data \
    elasticsearch
By default in /usr/share/elasticsearch/data
By default not persisted
Use data-only containers
Permissions
52. Data-Only Docker Volumes
Bypasses Union File System
Can be shared between containers
Data volumes persist if the container itself is deleted
Permissions
$ docker create -v /mnt/es/data:/usr/share/elasticsearch/data \
    --name esdata elasticsearch
$ docker run --volumes-from esdata elasticsearch
55. Highly Available Cluster
Master only (3 nodes)
Data only (6 nodes)
Client only (2 nodes)
discovery.zen.minimum_master_nodes = N/2 + 1 (N = number of master-eligible nodes, integer division)
gateway.recover_after_nodes
gateway.expected_nodes
cluster.routing.allocation.node_concurrent_recoveries
index.unassigned.node_left.delayed_timeout
index.priority
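The N/2 + 1 quorum rule can be checked with a few lines of shell. This is a minimal sketch; the function name min_master_nodes is hypothetical, and N is assumed to count master-eligible nodes only.

```shell
#!/bin/sh
# Quorum of master-eligible nodes: N/2 + 1, using integer division.
# $1 is the number of master-eligible nodes.
min_master_nodes() {
    echo $(( $1 / 2 + 1 ))
}

# Three dedicated masters, as in the cluster layout above: prints 2.
min_master_nodes 3
```

With three masters the quorum is 2, so the cluster can lose one master and still elect a new one, while a network partition cannot produce two independent masters.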
91. Routing vs No Routing
Queries without routing (200 shards, 1 replica)
#threads  Avg response time  Throughput  90% line  Median   CPU utilization
1         3169 ms            19.0/min    5214 ms   2692 ms  95 – 99%
Queries with routing (200 shards, 1 replica)
#threads  Avg response time  Throughput  90% line  Median   CPU utilization
10        196 ms             50.6/sec    642 ms    29 ms    25 – 40%
20        218 ms             91.2/sec    718 ms    11 ms    10 – 15%
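The gap between the two tables follows from how routing works: Elasticsearch picks the target shard as hash(routing value) modulo the number of primary shards, so a routed query touches one shard, while an unrouted query fans out to all 200. The sketch below illustrates only the modulo step. It uses cksum as a stand-in hash, so the shard numbers it prints will not match what Elasticsearch computes, and the function name shard_for and the routing value user_42 are made up for illustration.

```shell
#!/bin/sh
# Illustration of: shard = hash(routing_value) % number_of_primary_shards
# cksum (a CRC) stands in for the real hash function, so results
# differ from the shard Elasticsearch would actually choose.
# $1 is the routing value, $2 the number of primary shards.
shard_for() {
    routing_value="$1"
    num_shards="$2"
    hash=$(printf '%s' "$routing_value" | cksum | cut -d ' ' -f 1)
    echo $(( hash % num_shards ))
}

# The same routing value always maps to the same shard out of 200.
shard_for user_42 200
```

Because the mapping is deterministic, indexing and querying with the same routing value land on the same shard, which is what lets the routed queries skip the other 199.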
Problems with standard deployments include:
Resources not utilized
Need to provision machines before deployment
Differences between development, test, QA and production environments
Hard to scale automatically
Amazon EC2 Container Service
Spoonium
Kubernetes
rkt
Shards and Replicas
No time-based indices
No tiers
Possible solution: routing