ELK ARCHITECTURE
Uzzal Basak
Database Engineer
"ELK" is the acronym for three open source projects: Elasticsearch,
Logstash, and Kibana. Elasticsearch is a search and analytics engine.
Logstash is a server-side data processing pipeline that ingests data from
multiple sources simultaneously, transforms it, and then sends it to a
"stash" like Elasticsearch. Kibana lets users visualize the data in
Elasticsearch with charts and graphs.
Beats is installed on the source side and works as a data shipper: it sends
data from the source to Logstash or Elasticsearch.
The Elastic Stack is the next evolution of the ELK Stack.
Deep Dive on Elasticsearch
• Comparing Elasticsearch terms with a relational database
system
• Elasticsearch terms (Index, Type, Document, ID)
• Elasticsearch Cluster, Node, Shard and Replica concepts
• Elasticsearch scalability and availability
• Elasticsearch installation
• Elasticsearch GET API, PUT API, POST API, DELETE API and
bulk upload
• Elasticsearch repository snapshot backup and restoration
• Elasticsearch Head plugin for graphical visualization
• Why is Elasticsearch very fast?
• Strengths and limitations of Elasticsearch
• Two successful use cases for Elasticsearch
Comparing Elasticsearch Terms with a Relational
Database System
MySQL => Databases => Tables => Columns/Rows
Elasticsearch => Indexes => Types => Documents with Properties
Elasticsearch Terms (Index, Type, Document, ID)
Index:
•Collection of documents having similar characteristics.
•Equivalent to a database instance in a relational database.
•Has a mapping, which defines multiple types.
•Logical namespace that maps to one or more primary shards.
•Can have zero or more replicas.
Types:
•Equivalent to a table in a relational database.
•Each type has a list of fields.
•The mapping defines how each field is analyzed.
•Indices created in Elasticsearch 6.0 or later can contain only one type; several types per index apply to 5.x and earlier.
Example of Index and Types:
•In a car manufacturing scenario, you have a Factory index. Within this
index, you have three different types (possible in Elasticsearch 5.x and earlier, where an index can hold several types):
•People
•Cars
•Spare Parts
Document:
•JSON document stored in Elasticsearch
•Equivalent to a row in a relational database
ID:
•Unique identifier for a document
•The combination of index, type and id must be unique so that a
document can be identified deterministically.
•The user can specify the id, but Elasticsearch can also generate it
automatically if it is not provided.
Elasticsearch Cluster, Nodes, Shards and Replicas Concepts
Cluster:
•Collection of nodes
•Identified by a unique name (cluster.name)
Nodes
•Each server in a cluster is called a node. In Elasticsearch, nodes
fall into three kinds according to their function.
Master Node
•Lightweight node
•Responsible for cluster management
•Keeps the cluster stable
•Sending index or search requests to this node is not recommended
Data Node
•Responsible for storing the actual data
•Participates in the indexing process
Client Node
•Acts as a load balancer for processing requests
•Used to perform scatter/gather operations such as search
•Neither stores data nor participates in cluster management
•Frees the data nodes to do the heavy work of searching
Shards
•Logical unit for storing data
•Each document is stored in a single primary shard
•By default each index has five primary shards (in Elasticsearch 6.x, the version used here; 7.0 and later default to one)
Replicas
•Each index can have zero or more replicas
•Help with failover and performance
•A replica is never stored on the same node as its primary shard
•The number of replicas can be changed after index creation.
Elasticsearch Scalability and Availability
* Elasticsearch Scalability
Partitioning data across multiple machines allows Elasticsearch to
scale beyond what a single machine can do and to support high-throughput
operations. Data is divided into small parts called shards. Because the
index is distributed across multiple shards, a query against an index
is executed in parallel across all the shards. The results from each
shard are then gathered and sent back to the client. Executing the
query in parallel greatly improves search performance.
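The scatter/gather flow described above can be sketched in a few lines of Python. This is only an illustration under assumed names: `shards`, `search_shard` and `search_index` are hypothetical, each shard is a plain in-memory list, and nothing here reflects how Elasticsearch is implemented internally.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "shards": each one holds part of the index.
shards = [
    [{"id": 1, "year": 2009}, {"id": 2, "year": 2012}],
    [{"id": 3, "year": 2009}],
    [{"id": 4, "year": 2015}, {"id": 5, "year": 2009}],
]

def search_shard(shard, year):
    """Run the query against a single shard."""
    return [doc for doc in shard if doc["year"] == year]

def search_index(all_shards, year):
    """Scatter the query to every shard in parallel, then gather the hits."""
    with ThreadPoolExecutor(max_workers=len(all_shards)) as pool:
        partial_results = list(pool.map(lambda s: search_shard(s, year), all_shards))
    hits = [doc for part in partial_results for doc in part]
    return sorted(hits, key=lambda doc: doc["id"])

result_ids = [doc["id"] for doc in search_index(shards, 2009)]
print(result_ids)
```

Each shard only scans its own slice of the data, so adding shards (and machines to host them) shortens the per-shard work while the gather step stays small.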
In the diagram on the previous page, the index is distributed across 3 nodes;
that is why node 1 and node 2 hold 2 shards each and
node 3 holds one shard. If, a few days later, the user finds
that 3 nodes are not enough, the user can add two more nodes.
The total is then 5 nodes, and the 5 shards are distributed one per
node.
Note: Once the user has defined the number of shards, it cannot be changed
after the index is created.
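One way to see why the shard count is fixed: a document is routed to a shard with a formula of the form hash(routing value) modulo number of primary shards, where the routing value defaults to the document id. The sketch below is a minimal illustration of that idea; `pick_shard` is a hypothetical helper and `zlib.crc32` is only a stand-in for the hash Elasticsearch actually uses.

```python
import zlib

def pick_shard(doc_id, number_of_primary_shards):
    """Route a document id to a shard: hash(id) modulo the shard count.

    zlib.crc32 is only a stand-in hash for this sketch.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards

doc_ids = ["doc-%d" % i for i in range(1000)]
with_5_shards = {d: pick_shard(d, 5) for d in doc_ids}
with_6_shards = {d: pick_shard(d, 6) for d in doc_ids}

# Documents whose shard would change if the shard count went from 5 to 6.
moved = [d for d in doc_ids if with_5_shards[d] != with_6_shards[d]]
print(len(moved), "of", len(doc_ids), "documents would map to a different shard")
```

If the shard count could change on a live index, most existing documents would no longer be found on the shard the formula points to, which is why the number is set at creation time.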
Elasticsearch Availability
• Here there are three data nodes, and the number of shards of this index is
3, so the three shards are divided across the three nodes, one shard each.
The number of replicas is set to 1; that is enough
configuration, and Elasticsearch places the replica of each node's shard on
another node.
• After that, for any reason, one of the data nodes (node 4) goes down. The situation now is
shown below.
But the system is still accessible at that moment, because shard 2 on node 4 has a replica,
R2, on node 3, which becomes the primary, and replica R3 on node 4 has its primary
data on node 5, that is, shard 3.
• Note: Configuring availability is easy: just
set index.number_of_replicas: 1 for the index (since Elasticsearch 5.0 this is an index-level setting, applied through the index settings or an index template rather than elasticsearch.yml).
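The failover described above can be mimicked with a small simulation. This is a sketch under assumptions: the layout (node 3 holding shard 1 and R2, node 4 holding shard 2 and R3, node 5 holding shard 3 and R1) is inferred from the text because the diagram is not reproduced here, and `fail_node` is a hypothetical helper, not an Elasticsearch API.

```python
# Shard copies per node: ("P", n) is primary shard n, ("R", n) is its replica.
cluster = {
    "node3": [("P", 1), ("R", 2)],
    "node4": [("P", 2), ("R", 3)],
    "node5": [("P", 3), ("R", 1)],
}

def fail_node(nodes, failed):
    """Remove a node and promote replicas of the primaries it was holding."""
    lost = nodes[failed]
    survivors = {name: list(copies) for name, copies in nodes.items() if name != failed}
    for kind, shard in lost:
        if kind != "P":
            continue  # a lost replica needs no promotion
        for copies in survivors.values():
            if ("R", shard) in copies:
                copies[copies.index(("R", shard))] = ("P", shard)
                break
    return survivors

after = fail_node(cluster, "node4")
primaries = sorted(shard for copies in after.values() for kind, shard in copies if kind == "P")
print(after)
print(primaries)
```

With one replica per shard, every primary lost with node 4 still has a copy elsewhere, so all three shards remain available after the promotion.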
Elasticsearch Installation
Installation has two parts: one is the system configuration, the other is
the Elasticsearch config file configuration.
System Configuration:
edit /etc/sysctl.conf
fs.file-max = 70000
vm.max_map_count=300000
/sbin/sysctl -p
Add the following lines to the "/etc/security/limits.conf" file
oracle soft nproc 16384
oracle hard nproc 16384
oracle soft nofile 4096
oracle hard nofile 65536
oracle soft stack 10240
After changing the above configuration, you must reboot the machine
before starting the Elasticsearch server.
• X-Pack plugin installation
• From enterprise-grade security and developer-friendly APIs to machine learning and
graph analytics, the Elastic Stack ships with features (formerly packaged as X-Pack).
• Note: Make sure the Elasticsearch version and the X-Pack version are the same; otherwise
you will get an error.
• X-Pack plugin installation step
• [12:48:22 oracle@test2 bin]$ ./elasticsearch-plugin install file:/home/oracle/ELK/x_Pack/x-pack-6.2.2.zip
• Plugin list check command
• ./elasticsearch-plugin list
• Config file configuration:
• Every product of ELK (Elasticsearch, Logstash and Kibana) takes all of its configuration from its
config file.
• cluster.name: my-application → Use a descriptive name for your cluster
index.number_of_shards: 5 → Default value is 5, means all the types are ...
index.number_of_replicas: 0
(Since Elasticsearch 5.0 the two index.* values are index-level settings, applied per index or through an index template; a 5.x or later node does not accept them in elasticsearch.yml.)
node.name: node-2 → Use a descriptive name for the node
path.data: /home/oracle/ELK/data → Path to the directory where the data is stored
(separate multiple locations by comma)
path.logs: /home/oracle/ELK/log → Path to log files
path.repo: /home/oracle/ELK/backup → Path for backup snapshots
bootstrap.memory_lock: true → Lock the memory on startup
network.host: 192.168.56.103 → Set the bind address to a specific IP (IPv4 or
IPv6)
discovery.zen.ping.unicast.hosts: ["192.168.56.103", "192.168.56.101"] →
Node IP addresses
discovery.zen.minimum_master_nodes: 1
transport.host: localhost → Should be localhost or 127.0.0.1
transport.tcp.port: 9300
http.port: 9200
node.master: true
node.data: true
xpack.graph.enabled: true
xpack.logstash.enabled: true
xpack.ml.enabled: true
xpack.monitoring.enabled: true
xpack.watcher.enabled: true
Elasticsearch GET API, PUT API, POST API , DELETE API and bulk
upload
• Example of GET API
• [16:41:12 oracle@ansible elasticsearch-6.2.2]$ curl -XGET 'http://192.168.56.103:9200/car/car/AXkFe2kBPzPPA2iKNVqD?pretty'
{
"_index" : "car",
"_type" : "car",
"_id" : "AXkFe2kBPzPPA2iKNVqD",
"_version" : 1,
"found" : true,
"_source" : {
"powerPS" : 100,
"@version" : "1",
"minPrice" : 1500,
"sdPrice" : 2261.647559,
"path" : "/home/oracle/ELK/logstash/car.csv",
"@timestamp" : "2019-03-14T07:05:29.692Z",
"host" : "test2",
"message" : "599,150000,2009,100,1500,28500,6066.1219,2261.647559r",
"km" : 150000,
"avgPrice" : 6066.1219,
"year" : 2009,
"count" : 599,
"maxPrice" : 28500
}
}
• Key Note:
• 1st "car" is the index → database
• 2nd "car" is the type → table
• 3rd, AXkFe2kBPzPPA2iKNVqD, is the ID
Example of PUT and POST API
curl -XPUT 'http://192.168.56.103:9200/twitter/_doc/1?pretty'
curl -XPOST 'http://192.168.56.103:9200/twitter/_doc?pretty'
•Both insert a document into the type (table). With PUT
you must specify the unique id; if you do not,
the insert fails. To avoid that, use
POST, which lets Elasticsearch generate the id.
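The PUT versus POST rule can be expressed as a small request builder. This is a sketch, not a client library: `index_request` is a hypothetical helper, the document body is invented for illustration, and the request objects are only built, never sent.

```python
import json
from urllib.request import Request

BASE_URL = "http://192.168.56.103:9200"  # host taken from the examples above

def index_request(index, body, doc_id=None):
    """Build (but do not send) an indexing request.

    With an id: PUT /<index>/_doc/<id>. Without an id: POST /<index>/_doc,
    and Elasticsearch generates the id.
    """
    if doc_id is not None:
        method, url = "PUT", "%s/%s/_doc/%s" % (BASE_URL, index, doc_id)
    else:
        method, url = "POST", "%s/%s/_doc" % (BASE_URL, index)
    data = json.dumps(body).encode("utf-8")
    return Request(url, data=data, method=method,
                   headers={"Content-Type": "application/json"})

# Hypothetical document body, for illustration only.
doc = {"user": "example", "message": "trying out Elasticsearch"}
put_req = index_request("twitter", doc, doc_id=1)
post_req = index_request("twitter", doc)
print(put_req.get_method(), put_req.full_url)
print(post_req.get_method(), post_req.full_url)
```

Passing either request to `urllib.request.urlopen` would send it to a running node; that step is left out so the sketch runs without a cluster.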
Example of DELETE API
curl -XDELETE 'http://192.168.56.103:9200/car/car/hY6ncWkB43xMmVfK_oJV'
{"_index":"car","_type":"car","_id":"hY6ncWkB43xMmVfK_oJV","_version":2,"result":"deleted","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":348,"_primary_term":5}
Example of bulk upload
•For importing large amounts of data, Elasticsearch provides the Bulk API. The failure of a single
record doesn't affect the rest of the insert operation.
curl -s -H'Content-Type: application/x-ndjson' -XPOST 192.168.56.103:9200/test/_bulk?pretty --data-binary @actresses.json
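The file passed to the Bulk API is newline-delimited JSON: an action line followed by a source line for each document, ending with a newline. The sketch below shows one way such a body could be generated; `build_bulk_body` and the sample records are hypothetical (the contents of actresses.json are not shown in the slides), and the `_type` value assumes a 6.x cluster as used here.

```python
import json

def build_bulk_body(docs, index, doc_type="_doc"):
    """Return an NDJSON bulk body: one action line plus one source line
    per document, with a trailing newline at the end."""
    lines = []
    for doc in docs:
        # Action line: index the next line as a document (id auto-generated).
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

# Hypothetical records, for illustration only.
records = [{"name": "A", "year": 2001}, {"name": "B", "year": 2005}]
body = build_bulk_body(records, "test")
print(body, end="")
```

Because every document carries its own action line, the response reports success or failure per item, which matches the point above that one bad record does not fail the others.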
Elasticsearch  Repository snapshot backup and restoration
• The main prerequisite for taking a snapshot is a defined repository. To set up a repository,
configure path.repo in the elasticsearch.yml file.
• path.repo: ["/home/oracle/ELK/backup"]
Advantages of snapshots
• Snapshots are incremental
• This reduces both the time and the disk usage
• Manage cluster events
Creating Repository command
curl -H'Content-Type: application/json' -XPUT
'http://192.168.56.101:9200/_snapshot/my_repository' -d '{
  "type": "fs",
  "settings": {
    "location": "my_repository",
    "compress": true
  }
}'
The output should be
{"acknowledged":true}
Description:
type: fs (shared file system). It can also be S3, HDFS (Hadoop), Azure cloud or Google
cloud.
compress: true means only the metadata is compressed, not the data
•Verify repository command
• curl -XGET 'http://192.168.56.101:9200/_snapshot/my_repository'
Creating Snapshot
curl -H'Content-Type: application/json' -XPUT "http://192.168.56.101:9200/_snapshot/my_repository/snap_1?wait_for_completion=true&pretty" -d '{
"indices": ".watches",
"ignore_unavailable": "true",
"include_global_state": false
}'
snap_1: the snapshot name
indices: the index (table) name; for multiple indices use a comma-separated list such as "test1,test2"
ignore_unavailable: true means indices that do not exist are skipped; with false (the default) the snapshot fails if a listed index is missing
include_global_state: true or false; false keeps the cluster global state out of the snapshot
partial: by default the snapshot fails if a primary shard of an included index is not available; setting partial to true lets it proceed
Verify Snapshot
curl -XGET "http://192.168.56.101:9200/_snapshot/my_repository/snap_1?pretty"
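When snapshots are taken from a script, the URL and JSON body used above can be assembled programmatically. The following is a minimal sketch: `snapshot_request` is a hypothetical helper that only builds the request pieces and does not contact the cluster.

```python
import json

HOST = "http://192.168.56.101:9200"  # host taken from the commands above

def snapshot_request(repository, snapshot, indices):
    """Return the URL and JSON body for a create-snapshot request."""
    url = "%s/_snapshot/%s/%s?wait_for_completion=true" % (HOST, repository, snapshot)
    body = {
        "indices": ",".join(indices),    # comma-separated index names
        "ignore_unavailable": True,      # skip indices that do not exist
        "include_global_state": False,   # leave cluster-wide state out
    }
    return url, json.dumps(body)

url, body = snapshot_request("my_repository", "snap_1", ["test1", "test2"])
print(url)
print(body)
```

The returned pair maps directly onto the curl call: the URL is the PUT target and the body is the value passed with -d.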
• Snapshot Restore
• curl -H'Content-Type: application/json' -XPOST "http://192.168.56.101:9200/_snapshot/my_repository/snap_2/_restore?pretty" -d '{
"indices": ".monitoring-es-6-2019.03.11",
"ignore_unavailable": "true",
"include_global_state": false
}'
• Delete Snapshot
• curl -XDELETE 'http://192.168.56.101:9200/_snapshot/my_repository/snap_1?pretty'
Elasticsearch  Head plugin
The Elasticsearch Head plugin for Google Chrome is a very useful
tool: you can run all types of select queries, delete an index,
and analyse an index.
Why Elasticsearch is very fast ?
Percolation
•Suppose a user registers a query in Elasticsearch; each new record inserted into
Elasticsearch is then matched against the registered queries.
It is like the reverse of indexing and then searching. A real-world example: a huge
network log is generated and must be stored in an Elasticsearch index, and
real-time alarms must be raised from it. These logs contain a severity (1-5) and an event
type. With the Elasticsearch percolation mechanism, each log record is checked against the queries the user has
pre-registered as it is inserted.
Any matching query fires a trigger at insert time, and
the data is routed according to the pre-registered queries.
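The idea of matching each incoming record against stored queries can be sketched as follows. This is a toy model under assumptions: the query names, the `severity` and `event_type` fields, and the `percolate` helper are hypothetical, and plain Python predicates stand in for real Elasticsearch queries.

```python
# Hypothetical registered queries: name -> predicate over a log record.
registered_queries = {
    "critical": lambda log: log["severity"] >= 4,
    "link_down": lambda log: log["event_type"] == "link_down",
}

def percolate(log):
    """Return the names of all registered queries that match the new record."""
    return sorted(name for name, matches in registered_queries.items() if matches(log))

incoming = {"severity": 5, "event_type": "link_down"}
alarms = percolate(incoming)
print(alarms)
```

The queries are stored first and the documents arrive later, which is the reverse of the usual flow where documents are stored first and queries arrive later.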
Inverted index
An inverted index is a database index storing a mapping from content, such as words or
numbers, to its locations in a table, or in a document or a set of documents. The purpose
of an inverted index is to allow fast full-text searches, at a cost of increased processing
when a document is added to the database. The inverted file may be the database file
itself, rather than its index. It is the most popular data structure used in document
retrieval systems, used on a large scale, for example, in search engines and in Elasticsearch.
In the example below, if the user searches for the word "going", the lookup leads directly to document
1.
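A minimal inverted index can be built in a few lines of Python. The documents here are hypothetical and the tokenizer is a bare lower-case split; real analyzers also handle stemming, stop words and punctuation.

```python
from collections import defaultdict

# Hypothetical documents, for illustration only.
documents = {
    1: "I am going home",
    2: "She is at home",
    3: "They are playing",
}

def build_inverted_index(docs):
    """Map each lower-cased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

inverted = build_inverted_index(documents)
print(sorted(inverted["going"]))
print(sorted(inverted["home"]))
```

A search for a word becomes a single dictionary lookup instead of a scan over every document, and the cost is paid once, when each document is added.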
Strengths and limitations of Elasticsearch
• The strengths of Elasticsearch are as follows:
• Very flexible Query API:
• It supports a JSON-based REST API.
• Clients are available for all major languages, such as Java, Python, PHP, and so on.
• It supports filtering, sorting, pagination, and aggregations in the same query.
• Highly scalable:
• Clustering, replication of data, and automatic failover are supported out of the box and are completely
transparent to the user. For more details, refer to the Elasticsearch Scalability and Availability section.
• Multi-language support:
• Stemming removes the difference between the
different forms of a root word, and the process is completely different for different languages.
Elasticsearch supports many languages out of the box.
• Aggregations:
• Aggregations are one of the reasons why Elasticsearch is like nothing else out there.
• It comes with a very powerful analytics engine, which can help you slice and dice your data.
• It supports nested aggregations. For example, you can group users first by the city they live in and
then by their gender, and then calculate the average age of each bucket.
• Performance:
• Due to the inverted index and its distributed nature, it performs extremely well. Queries
you would traditionally run using a batch processing engine, such as Hadoop, can now be executed in real
time.
• Intelligent filter caching:
• The most recently used queries are cached. When the data is modified, the cache is invalidated
automatically.
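The nested aggregation mentioned in the list (group by city, then by gender, then average the age) can be illustrated in plain Python. The user records and the function name are hypothetical; Elasticsearch would express the same grouping as nested bucket aggregations with a metric inside.

```python
from collections import defaultdict

# Hypothetical user records, for illustration only.
users = [
    {"city": "Dhaka", "gender": "F", "age": 30},
    {"city": "Dhaka", "gender": "F", "age": 40},
    {"city": "Dhaka", "gender": "M", "age": 25},
    {"city": "Paris", "gender": "M", "age": 50},
]

def average_age_by_city_and_gender(records):
    """Bucket by city, then by gender, and average the age in each bucket."""
    buckets = defaultdict(lambda: defaultdict(list))
    for r in records:
        buckets[r["city"]][r["gender"]].append(r["age"])
    return {
        city: {gender: sum(ages) / len(ages) for gender, ages in by_gender.items()}
        for city, by_gender in buckets.items()
    }

result = average_age_by_city_and_gender(users)
print(result)
```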
The limitations of Elasticsearch are as follows:
•Not real time, but near real time (eventual consistency):
• The data you index only becomes available for search after about 1 second. A process
known as refresh wakes up every 1 second by default and makes the
data searchable.
•Doesn't support SQL-like joins, but provides parent-child and
nested types to handle relations.
•Doesn't support transactions and rollbacks:
Transactions in a distributed system are expensive. Elasticsearch offers
version-based control to make sure an update is applied to
the latest version of the document.
•Updates are expensive:
An update to an existing document deletes the document and
re-inserts it as a new document.
•Elasticsearch might lose data for the following reasons:
• Network partitions.
• Multiple nodes going down at the same time.
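The version-based control mentioned in the list can be pictured with a tiny in-memory model. This is a sketch only: `TinyStore` and `VersionConflict` are hypothetical names, and the model covers just the idea that a write based on a stale version is rejected and that an update replaces the whole document.

```python
class VersionConflict(Exception):
    """Raised when an update is based on a stale version."""

class TinyStore:
    """Hypothetical in-memory store that mimics version checks on update."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def index(self, doc_id, source, expected_version=None):
        current_version = self._docs.get(doc_id, (0, None))[0]
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                "expected %s, found %s" % (expected_version, current_version))
        # An update replaces the whole document and bumps the version.
        self._docs[doc_id] = (current_version + 1, source)
        return current_version + 1

store = TinyStore()
v1 = store.index("1", {"km": 150000})
v2 = store.index("1", {"km": 151000}, expected_version=v1)
try:
    store.index("1", {"km": 999}, expected_version=v1)  # stale: version is now 2
    conflict = False
except VersionConflict:
    conflict = True
print(v1, v2, conflict)
```

The caller that loses the race sees the conflict and can re-read the document and retry, which gives a safety net without distributed transactions.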
Two successful use cases for Elasticsearch
NASA uses Elasticsearch to find geothermal parameters on
Mars.
•In its exploration of Mars, NASA sent a rover to the Red Planet, and its sensors return billions of data points
and images. Handling this telemetry data and making decisions from it is a very tough task for NASA. To operate the Mars rover
from Earth, NASA needs to know the geothermal parameters. Any mistake in finding the geothermal
data could cost a 2-billion-dollar machine. To handle this heavy data
analytics workload, NASA uses the Elasticsearch engine.
For more details, please follow the link below:
https://www.elastic.co/elasticon/2015/sf/unlocking-interplanetary-datasets-with-real-time-search
Uber car finding.
•The main point is to take the user's geo-location data,
share the closest passenger's location with a driver, and book the
car within seconds. Elasticsearch helps this type of
business a lot. Basically, Elasticsearch stores the real-time
passenger geo-location (longitude and latitude), matches it with
the closest available driver's location, and sends the user's location to that driver.
For more details, please check the link below:
•https://www.infoq.com/presentations/uber-elasticsearch-clusters
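The "closest available driver" matching can be illustrated with a great-circle distance calculation. This is a sketch under assumptions: the driver names and coordinates are invented, `haversine_km` and `closest_driver` are hypothetical helpers, and nothing here describes how Uber or Elasticsearch actually implement geo search.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical driver positions as (latitude, longitude).
drivers = {
    "driver_a": (23.78, 90.40),
    "driver_b": (23.81, 90.41),
    "driver_c": (23.70, 90.35),
}

def closest_driver(passenger, available):
    """Return the name of the driver nearest to the passenger."""
    lat, lon = passenger
    return min(available, key=lambda name: haversine_km(lat, lon, *available[name]))

nearest = closest_driver((23.80, 90.41), drivers)
print(nearest)
```

A linear scan like this is fine for a handful of drivers; a search engine earns its place when the same question has to be answered over a very large, constantly changing set of positions.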
Upcoming …
The next session is a deep dive into
Beats, Logstash and Kibana.
After that, in the final session, I will walk through a few demos:
* A raw file uploaded to Elasticsearch, with some filtering in
Logstash and a report created using Kibana.
* A log file uploaded to Elasticsearch using Filebeat and
Logstash.
* Data uploaded from a relational database to Elasticsearch
using Logstash.
The main goal of this work is to
visualize data in Kibana no matter what the
source is.
End

Elk presentation1#3

  • 2. "ELK" is the formed forthree open source projects: Elasticsearch, Logstash, and Kibana. Elasticsearch is a search and analytics engine. Logstash is a server side data processing pipeline that ingests data from‑ multiple sources simultaneously, transforms it, and then sends it to a "stash" like Elasticsearch. Kibana lets users visualize data with charts and graphs in Elasticsearch. Beats is installed in source side working as data shipper, basically send data from source to logstash orelasticsearch. The Elastic Stack is the next evolution of the ELKStack
  • 3. Deep Dive on Elasticsearch • Compare terms between Elasticsearch and a Relational Database System • Elasticsearch Terms (Index, Type, Documents, ID) • Elasticsearch Cluster, Nodes, Shards and Replicas Concepts • Elasticsearch Scalability and Availability • Elasticsearch Installation • Elasticsearch GET API, PUT API, POST API, DELETE API and bulk upload • Elasticsearch Repository snapshot backup and restoration • Elasticsearch Head Plugin for Graphical Visualization • Why is Elasticsearch very fast? • Strengths and limitations of Elasticsearch • Two successful use cases for Elasticsearch
  • 4. Compare terms between Elasticsearch and a Relational Database System
      MySQL => Databases => Tables => Columns/Rows
      Elasticsearch => Indexes => Types => Documents with Properties
  • 5. Elasticsearch Terms (Index, Type, Documents, ID) Index: • Collection of documents having similar characteristics. • Equivalent to a database instance in a relational database. • Has a mapping which defines multiple types. • Logical namespace that maps to one or more primary shards. • Can have zero or more replicas. Types: • Equivalent to a table in a relational database. • Each type has a list of fields. • The mapping defines how each field is analyzed.
  • 6. Example of Index and Types: • In a car manufacturing scenario, you have a Factory index. Within this index, you have three different types: • People • Cars • Spare Parts Document: • JSON document stored in Elasticsearch • Equivalent to a row in a relational database ID: • Unique identifier of a document • The combination of index, type and ID must be unique so that a document can be identified deterministically. • The user can specify the ID, but Elasticsearch auto-generates one if it is not provided.
  • 7. Elasticsearch Cluster, Nodes, Shards and Replicas Concepts Cluster: • Collection of nodes • Identified by a unique name Nodes: • Each server in a cluster is called a node. In Elasticsearch, nodes are divided into three kinds according to their functionality. Master Node: • Lightweight node • Responsible for cluster management • Keeps the cluster stable • It is not recommended to send index or search requests to this node Data Node: • Responsible for storing the actual data • Participates in the indexing process
  • 8. Client Node: • Acts as a load balancer for processing requests • Used to perform scatter/gather based operations like search • Neither stores data nor participates in cluster management • Takes load off the data nodes, which do the heavy duty of searching Shards: • Logical unit to store data • Each document is stored in a single primary shard • By default each index has five shards Replicas: • Each index can have 0 or more replicas • Help with failover and performance • A replica is never stored on the same node as its primary shard • The number of replicas can be changed after index creation.
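The rule that each document lives in exactly one primary shard can be sketched as a hash of the document ID taken modulo the number of primary shards. This is only an illustration under stated assumptions: Elasticsearch hashes a routing value with its own Murmur3 implementation, `zlib.crc32` is a standard-library stand-in, and `pick_shard` is a hypothetical name.

```python
import zlib

NUM_PRIMARY_SHARDS = 5  # the default mentioned on the slide


def pick_shard(doc_id: str, num_shards: int = NUM_PRIMARY_SHARDS) -> int:
    """Map a document ID to exactly one primary shard (illustrative hash only)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


if __name__ == "__main__":
    for doc_id in ("AXkFe2kBPzPPA2iKNVqD", "hY6ncWkB43xMmVfK_oJV", "1"):
        print(doc_id, "-> shard", pick_shard(doc_id))
```

Because the shard number depends on the modulo of the shard count, the same ID would generally land on a different shard if the count changed, which is one way to see why the number of primary shards is fixed when the index is created.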
  • 9. Elasticsearch Scalability and Availability • Elasticsearch Scalability: Partitioning data across multiple machines allows Elasticsearch to scale beyond what a single machine can do and to support high-throughput operations. Data is divided into small parts called shards. As the index is distributed across multiple shards, a query against an index is executed in parallel across all the shards. The results from each shard are then gathered and sent back to the client. Executing the query in parallel greatly improves search performance.
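The scatter/gather flow described on this slide can be sketched in a few lines. This is a toy model, not Elasticsearch code: the in-memory `SHARDS` list, the sample documents and the function names are all hypothetical, and threads stand in for separate machines.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "shards": each holds a slice of the index's documents.
SHARDS = [
    [{"id": 1, "text": "red car"}, {"id": 2, "text": "blue truck"}],
    [{"id": 3, "text": "red truck"}],
    [{"id": 4, "text": "green car"}, {"id": 5, "text": "red bike"}],
]


def search_shard(shard, term):
    """Run the query against one shard only."""
    return [doc for doc in shard if term in doc["text"].split()]


def search_index(term, shards=SHARDS):
    """Scatter the query to every shard in parallel, then gather the hits."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        per_shard_hits = list(pool.map(lambda s: search_shard(s, term), shards))
    hits = [doc for shard_hits in per_shard_hits for doc in shard_hits]
    return sorted(hits, key=lambda doc: doc["id"])


if __name__ == "__main__":
    print(search_index("red"))
```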
  • 10. Elasticsearch Scalability: In the diagram on the previous page, the index is distributed over 3 nodes, which is why node 1 and node 2 hold 2 shards each and node 3 holds one shard. If, a few days later, the user finds that 3 nodes are not enough, the user can add two more nodes; the total is then 5 nodes and the 5 shards are distributed one per node. Note: once the user defines the number of shards, it cannot be changed after the index is created.
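The shard counts in this example (2, 2 and 1 on three nodes, then one per node on five) can be reproduced with a simple round-robin placement. Round-robin is an assumption made for illustration; the real allocator weighs more factors, and `distribute` is a hypothetical helper.

```python
def distribute(num_shards, nodes):
    """Assign shard numbers to nodes round-robin (illustrative only)."""
    layout = {node: [] for node in nodes}
    for shard in range(num_shards):
        layout[nodes[shard % len(nodes)]].append(shard)
    return layout


if __name__ == "__main__":
    print(distribute(5, ["node1", "node2", "node3"]))
    print(distribute(5, ["node1", "node2", "node3", "node4", "node5"]))
```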
  • 11. Elasticsearch Availability • Here there are three data nodes and the index is configured with 3 shards, so the three shards are spread over the three nodes, one shard each. The number of replicas is set to 1; that is all the configuration needed, and Elasticsearch places the replica of each node's shard on another node.
  • 12. Elasticsearch Availability • Suppose that, for any reason, one of the data nodes (node 4) goes down; the resulting situation is shown below. The system is still accessible at that moment, because node 4's shard 2 has a replica, R2, on node 3, which becomes the primary, and the replica R3 that was on node 4 still has its primary data, shard 3, on node 5. • Note: configuring availability is easy; just set index.number_of_replicas: 1 in the Elasticsearch configuration.
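The failover behaviour on these two slides can be modelled as follows. This is a simplified sketch that assumes one replica per shard placed on the next node in the list; the exact placement differs from the slide's diagram, and `allocate` and `fail_node` are hypothetical names.

```python
def allocate(nodes, num_shards):
    """Place each primary on one node and its replica on the next node."""
    layout = {}
    for shard in range(num_shards):
        primary = nodes[shard % len(nodes)]
        replica = nodes[(shard + 1) % len(nodes)]
        layout[shard] = {"primary": primary, "replica": replica}
    return layout


def fail_node(layout, dead):
    """Promote the replica of every shard whose primary was on the dead node."""
    after = {}
    for shard, copy in layout.items():
        primary, replica = copy["primary"], copy["replica"]
        if primary == dead:
            primary, replica = replica, None  # the replica becomes the primary
        elif replica == dead:
            replica = None  # one copy lost, the primary still serves requests
        after[shard] = {"primary": primary, "replica": replica}
    return after


if __name__ == "__main__":
    layout = allocate(["node3", "node4", "node5"], 3)
    print(fail_node(layout, "node4"))
```

Every shard still has a live primary after the failure, which is why the index stays accessible.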
  • 13. Elasticsearch Installation
    Installation has two parts: system configuration and Elasticsearch config file configuration.
    System Configuration: edit /etc/sysctl.conf
        fs.file-max = 70000
        vm.max_map_count = 300000
    Apply the settings:
        /sbin/sysctl -p
    Add the following lines to the "/etc/security/limits.conf" file:
        oracle soft nproc 16384
        oracle hard nproc 16384
        oracle soft nofile 4096
        oracle hard nofile 65536
        oracle soft stack 10240
    After changing the above configuration, you must reboot the machine before starting the Elasticsearch server.
  • 14. Elasticsearch Installation
    • X-Pack plugin installation
    • From enterprise-grade security and developer-friendly APIs to machine learning and graph analytics, the Elastic Stack ships with features formerly packaged as X-Pack.
    • Note: make sure the Elasticsearch version and the X-Pack version are the same; otherwise the installation fails with an error.
    • X-Pack plugin installation step:
        [12:48:22 oracle@test2 bin]$ ./elasticsearch-plugin install file:/home/oracle/ELK/x_Pack/x-pack-6.2.2.zip
    • Plugin list check command:
        ./elasticsearch-plugin list
  • 15. Elasticsearch Installation
    • Config File Configuration:
    • Every ELK product (Elasticsearch, Logstash and Kibana) is configured through its config file.
        cluster.name: my-application → Use a descriptive name for your cluster
        index.number_of_shards: 5 → Default value is 5 means all the type are
        index.number_of_replicas: 0
        node.name: node-2 → Use a descriptive name for the node
        path.data: /home/oracle/ELK/data → Path to directory where to store the data (separate multiple locations by comma)
        path.logs: /home/oracle/ELK/log → Path to log files
        path.repo: /home/oracle/ELK/backup → Path for backup snapshot
        bootstrap.memory_lock: true → Lock the memory on startup
        network.host: 192.168.56.103 → Set the bind address to a specific IP (IPv4 or IPv6)
        discovery.zen.ping.unicast.hosts: ["192.168.56.103", "192.168.56.101"] → Node IP addresses
        discovery.zen.minimum_master_nodes: 1
        transport.host: localhost → Should be localhost or 127.0.0.1
        transport.tcp.port: 9300
        http.port: 9200
        node.master: true
        node.data: true
        xpack.graph.enabled: true
        xpack.logstash.enabled: true
        xpack.ml.enabled: true
        xpack.monitoring.enabled: true
        xpack.watcher.enabled: true
  • 16. Elasticsearch GET API, PUT API, POST API, DELETE API and bulk upload
    • Example of GET API:
        [16:41:12 oracle@ansible elasticsearch-6.2.2]$ curl -XGET 'http://192.168.56.103:9200/car/car/AXkFe2kBPzPPA2iKNVqD?pretty'
        {
          "_index" : "car",
          "_type" : "car",
          "_id" : "AXkFe2kBPzPPA2iKNVqD",
          "_version" : 1,
          "found" : true,
          "_source" : {
            "powerPS" : 100,
            "@version" : "1",
            "minPrice" : 1500,
            "sdPrice" : 2261.647559,
            "path" : "/home/oracle/ELK/logstash/car.csv",
            "@timestamp" : "2019-03-14T07:05:29.692Z",
            "host" : "test2",
            "message" : "599,150000,2009,100,1500,28500,6066.1219,2261.647559r",
            "km" : 150000,
            "avgPrice" : 6066.1219,
            "year" : 2009,
            "count" : 599,
            "maxPrice" : 28500
          }
        }
    • Key Note (parts of the URL path):
    • 1st "car" is the index → database
    • 2nd "car" is the type → table
    • 3rd, AXkFe2kBPzPPA2iKNVqD, is the ID
  • 17. Elasticsearch GET API, PUT API, POST API, DELETE API and bulk upload
    Example of PUT and POST API:
        curl -XPUT 'http://192.168.56.103:9200/twitter/_doc/1?pretty'
        curl -XPOST 'http://192.168.56.103:9200/twitter/_doc?pretty'
    • Both insert documents into the table (type). With -XPUT you must supply the unique ID; if you do not, the insert fails. To avoid this, use the -XPOST method, which lets Elasticsearch generate the ID.
    Example of DELETE API:
        curl -XDELETE 'http://192.168.56.103:9200/car/car/hY6ncWkB43xMmVfK_oJV'
        {"_index":"car","_type":"car","_id":"hY6ncWkB43xMmVfK_oJV","_version":2,"result":"deleted","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":348,"_primary_term":5}
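The choice between PUT and POST can be sketched as a small helper that builds the request without sending it. This is a minimal sketch using only the standard library; `index_request` and the sample document are hypothetical, and the host is the one used in the slide's curl commands.

```python
import json
import urllib.request

BASE = "http://192.168.56.103:9200"  # host from the slide's curl examples


def index_request(index, doc, doc_id=None, doc_type="_doc"):
    """Build (but do not send) an indexing request.

    With an ID the document goes to PUT /index/type/id; without one,
    POST /index/type lets Elasticsearch generate the ID.
    """
    if doc_id is None:
        method, url = "POST", f"{BASE}/{index}/{doc_type}"
    else:
        method, url = "PUT", f"{BASE}/{index}/{doc_type}/{doc_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


if __name__ == "__main__":
    req = index_request("twitter", {"user": "demo"}, doc_id=1)
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually send it to a running cluster.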
  • 18. Elasticsearch GET API, PUT API, POST API, DELETE API and bulk upload
    Example of bulk upload:
    • For importing large volumes of data, ELK provides the Bulk API. The failure of a single record does not affect the rest of the insert operation.
        curl -s -H 'Content-Type: application/x-ndjson' -XPOST '192.168.56.103:9200/test/_bulk?pretty' --data-binary @actresses.json
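The file passed with `--data-binary` has to be newline-delimited JSON: one action line followed by one source line per document, with a trailing newline. A minimal sketch of producing such a body is shown here; `bulk_body`, the sample documents and the `_doc` type name are illustrative assumptions, not the contents of actresses.json.

```python
import json


def bulk_body(index, docs, doc_type="_doc"):
    """Serialize documents as newline-delimited JSON for the _bulk endpoint.

    Each document is preceded by an "index" action line, and the body
    ends with a newline character.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(bulk_body("test", [{"name": "first"}, {"name": "second"}]), end="")
```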
  • 19. Elasticsearch Repository snapshot backup and restoration • The main prerequisite for taking a snapshot is a defined repository. To set up a repository, configure path.repo in the elasticsearch.yml file: • path.repo: ["/home/oracle/ELK/backup"] Advantages of Snapshots: • Snapshots are incremental • Reduces both the time and disk usage • Manage cluster event
  • 20. Elasticsearch Repository snapshot backup and restoration
    Creating Repository command:
        curl -H 'Content-Type: application/json' -XPUT 'http://192.168.56.101:9200/_snapshot/my_repository' -d '{
          "type": "fs",
          "settings": {
            "location": "my_repository",
            "compress": true
          }
        }'
    Output should be:
        {"acknowledged":true}
    Description:
    Type: fs (shared file system). It can also be S3, HDFS (Hadoop), Azure cloud or Google cloud.
    Compress: true means only the metadata is compressed, not the data.
    • Verify Repository command:
        curl -XGET 'http://192.168.56.101:9200/_snapshot/my_repository'
  • 21. Elasticsearch Repository snapshot backup and restoration
    Creating Snapshot:
        curl -H 'Content-Type: application/json' -XPUT "http://192.168.56.101:9200/_snapshot/my_repository/snap_1?wait_for_completion=true&pretty" -d '{
          "indices": ".watches",
          "ignore_unavailable": "true",
          "include_global_state": false
        }'
    snap_1: snapshot name
    indices: index (table) name; for multiple indices use a comma-separated list such as "test1,test2"
    ignore_unavailable: true means indices that do not exist are skipped; if it is not set, the snapshot fails when any listed index is missing
    include_global_state: true or false; false keeps the cluster global state out of the snapshot
    partial: by default the snapshot fails if any index does not have all of its primary shards available; setting partial to true changes this
    Verify Snapshot:
        curl -XGET "http://192.168.56.101:9200/_snapshot/my_repository/snap_1?pretty"
  • 22. Elasticsearch Repository snapshot backup and restoration
    • Snapshot Restore:
        curl -H 'Content-Type: application/json' -XPOST "http://192.168.56.101:9200/_snapshot/my_repository/snap_2/_restore?pretty" -d '{
          "indices": ".monitoring-es-6-2019.03.11",
          "ignore_unavailable": "true",
          "include_global_state": false
        }'
    • Delete Snapshot:
        curl -XDELETE "http://192.168.56.101:9200/_snapshot/my_repository/snap_1?pretty"
  • 23. Elasticsearch Head plugin: The Elasticsearch Head plugin for Google Chrome is a very useful tool: you can run all types of select queries, delete indices and analyse indices.
  • 24. Why is Elasticsearch very fast? Percolation • Suppose a user registers a query in Elasticsearch; each new record inserted into Elasticsearch is then matched against the registered queries. This is the reverse of indexing first and searching afterwards. A real-time example: a huge volume of network logs is generated, and these logs must be stored in an Elasticsearch index and raise alarms in real time. The logs contain a severity type (1-5) and an event type. With the percolation mechanism, data is inserted according to the users' pre-registered queries, so any matching query fires a trigger at insert time and the data is distributed according to the pre-registered queries.
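The "reverse search" idea can be sketched with queries stored up front and each incoming log record tested against them. This is a toy model of the concept, not the Elasticsearch percolator API; the query names, field names and thresholds are hypothetical.

```python
# Hypothetical registered queries: name -> predicate over a log document.
REGISTERED_QUERIES = {
    "critical_alarm": lambda doc: doc["severity"] >= 4,
    "link_down": lambda doc: doc["event_type"] == "link_down",
}


def percolate(doc, queries=REGISTERED_QUERIES):
    """Return the names of all registered queries that match the new document."""
    return sorted(name for name, matches in queries.items() if matches(doc))


if __name__ == "__main__":
    incoming = {"severity": 5, "event_type": "link_down"}
    print(percolate(incoming))
```

Each name returned is a query that would fire its alarm at insert time.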
  • 25. Why is Elasticsearch very fast? Inverted index: An inverted index is a database index storing a mapping from content, such as words or numbers, to its locations in a table, or in a document or a set of documents. The purpose of an inverted index is to allow fast full-text searches, at a cost of increased processing when a document is added to the database. The inverted file may be the database file itself, rather than its index. It is the most popular data structure used in document retrieval systems, used on a large scale, for example in search engines and in Elasticsearch. In the example below, if a user searches for the word "going", it is looked up directly in document 1.
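A minimal sketch of an inverted index, assuming plain whitespace tokenization and lowercase matching; the sample documents are hypothetical and chosen so that "going" occurs only in document 1.

```python
from collections import defaultdict

# Hypothetical documents: document ID -> text.
DOCS = {
    1: "I am going home",
    2: "She stays home",
}


def build_inverted_index(docs):
    """Map every word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, word):
    """Look the word up directly instead of scanning every document."""
    return sorted(index.get(word.lower(), set()))


if __name__ == "__main__":
    idx = build_inverted_index(DOCS)
    print(search(idx, "going"))
    print(search(idx, "home"))
```

The extra work happens in `build_inverted_index`, at insert time, which is the trade-off the slide describes.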
  • 26. Strengths and limitations of Elasticsearch
    The strengths of Elasticsearch are as follows:
    • Very flexible Query API:
        • It supports a JSON-based REST API.
        • Clients are available for all major languages, such as Java, Python, PHP, and so on.
        • It supports filtering, sorting, pagination, and aggregations in the same query.
    • Highly scalable:
        • Clustering, replication of data and automatic failover are supported out of the box and are completely transparent to the user. For more details, refer to the Availability and Horizontal Scalability section.
    • Multi-language support:
        • We discussed how stemming works and why it is important to remove the difference between the different forms of root words. This process is completely different for different languages. Elasticsearch supports many languages out of the box.
    • Aggregations:
        • Aggregations are one of the reasons why Elasticsearch is like nothing else out there.
        • It comes with a very powerful analytics engine, which can help you slice and dice your data.
        • It supports nested aggregations. For example, you can group users first by the city they live in and then by their gender, and then calculate the average age of each bucket.
    • Performance:
        • Due to the inverted index and its distributed nature, it is extremely high performing. The queries you traditionally run using a batch processing engine, such as Hadoop, can now be executed in real time.
    • Intelligent filter caching:
        • The most recently used queries are cached. When the data is modified, the cache is invalidated automatically.
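The nested aggregation example (group by city, then by gender, then average the age) can be worked through on a handful of records. This sketch computes the buckets in plain Python to show the shape of the result; it is not an Elasticsearch query, and the sample users are invented.

```python
from collections import defaultdict

USERS = [  # hypothetical sample data
    {"city": "Dhaka", "gender": "F", "age": 30},
    {"city": "Dhaka", "gender": "F", "age": 40},
    {"city": "Dhaka", "gender": "M", "age": 25},
    {"city": "Berlin", "gender": "M", "age": 50},
]


def average_age_by_city_and_gender(users):
    """Bucket users by city, then by gender, and average the age per bucket."""
    buckets = defaultdict(lambda: defaultdict(list))
    for user in users:
        buckets[user["city"]][user["gender"]].append(user["age"])
    return {
        city: {gender: sum(ages) / len(ages) for gender, ages in genders.items()}
        for city, genders in buckets.items()
    }


if __name__ == "__main__":
    print(average_age_by_city_and_gender(USERS))
```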
  • 27. Strengths and limitations of Elasticsearch
    The limitations of Elasticsearch are as follows:
    • Not real time, eventual consistency (near real time):
        • The data you index is only available for search after 1 sec. A process known as refresh wakes up every 1 sec by default and makes the data searchable.
    • Doesn't support SQL-like joins, but provides parent-child and nested types to handle relations.
    • Doesn't support transactions and rollbacks:
        • Transactions in a distributed system are expensive. It offers version-based control to make sure the update is happening on the latest version of the document.
    • Updates are expensive:
        • An update on an existing document deletes the document and re-inserts it as a new document.
    • Elasticsearch might lose data due to the following reasons:
        • Network partitions.
        • Multiple nodes going down at the same time.
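Version-based control, offered in place of transactions, can be illustrated with a tiny in-memory store that rejects a write when the caller's expected version is stale. This is a conceptual sketch only; `VersionedStore`, `VersionConflict` and `expected_version` are hypothetical names and not the Elasticsearch API.

```python
class VersionConflict(Exception):
    """Raised when a write is based on an outdated version of the document."""


class VersionedStore:
    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def index(self, doc_id, source, expected_version=None):
        """Store the document and return its new version number.

        If expected_version is given, it must equal the current version,
        otherwise the write is refused instead of overwriting newer data.
        """
        current_version = self._docs.get(doc_id, (0, None))[0]
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {current_version}"
            )
        self._docs[doc_id] = (current_version + 1, source)
        return current_version + 1


if __name__ == "__main__":
    store = VersionedStore()
    print(store.index("1", {"km": 150000}))
    print(store.index("1", {"km": 151000}, expected_version=1))
```

A second writer that still holds version 1 gets a `VersionConflict` and has to re-read the document before retrying.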
  • 28. Two successful use cases for Elasticsearch: NASA uses Elasticsearch to find geothermal parameters on Mars. • In its exploration of Mars, NASA sent a rover to the Red Planet, and it receives billions of data points and images from the rover's sensors. Handling this telemetry data and making decisions from it is a very tough task for NASA. To operate the Mars rover from Earth, NASA needs to know the geothermal parameters. Any mistake in the geothermal data could cost a 2 billion dollar machine. To handle this heavy data analytics workload, NASA uses the Elasticsearch engine. For more details, please follow the link below: https://www.elastic.co/elasticon/2015/sf/unlocking-interplanetary-datasets-with-real-time-search
  • 29. Two successful use cases for Elasticsearch: Uber car finding. • The main point is to take the user's geo-location data, share the location of nearby passengers with the driver, and book the car within seconds. Elasticsearch helps this type of business a lot. Basically, Elasticsearch stores the real-time passenger geo-location (longitude and latitude), matches it with the closest available driver location, and sends the driver the user's location. For more details, please check the link below: • https://www.infoq.com/presentations/uber-elasticsearch-clusters
  • 30. Upcoming … The next session is a deep dive into Beats, Logstash and Kibana. After that, in the final session, I will walk through a few demos: * A raw file uploaded to Elasticsearch, with some filtering in Logstash and a report created in Kibana. * A log file uploaded to Elasticsearch using Filebeat and Logstash. * Data uploaded from a relational database to Elasticsearch using Logstash. The main goal of all this is to visualize data in Kibana, no matter what the source is. End