SlideShare uma empresa Scribd logo
1 de 61
Baixar para ler offline
Building	Resilient	
Log	Aggregation	Pipeline	
Using	Elasticsearch and	Kafka
Rafał Kuć @	Sematext Group,	Inc.
Sematext &	I
Logsene
SPM
logs
metrics
Next	30	minutes…
Log	shipping	
- buffers
- protocols
- parsing
Central	buffering
- Kafka
- Redis
Storage	&	Analysis
- Elasticsearch
- Kibana
- Grafana
Log	shipping	architecture
File Shipper
File Shipper
File Shipper
Centralized
Buffer
ES ES ES
ES ES ES
ES ES ES
data
Focus:	Elasticsearch
File Shipper
File Shipper
File Shipper
Centralized
Buffer
ES ES ES
ES ES ES
ES ES ES
data
Elasticsearch	cluster	architecture
client
client
client
data
data
data
data
data
data
master
master
master
ingest
ingest
ingest
Dedicated	masters	please
client
client
client
data
data
data
data
data
data
master
master
master
discovery.zen.minimum_master_nodes ->	N/2	+	1	master	eligible	nodes
ingest
ingest
ingest
One	big	index	is	a	no-go
Not	scalable	enough	for	time	based	data
One	big	index	is	a	no-go
Indexing	slows	down	with	time
One	big	index	is	a	no-go
Expensive	merges
One	big	index	is	a	no-go
Delete by	query needed	for	data	retention
One	big	index	is	a	no-go
Not	scalable	enough	for	time	based	data
Indexing	slows	down	with	time
Expensive	merges
Delete by	query needed	for	data	retention
Daily	indices	are	a	good	start
2016.11.18 2016.11.19 2016.11.22 2016.11.23.	.	.
Indexing is	faster for	smaller	indices
Deletes are	cheap	
Search can	be	performed	on	indices	that	are	needed
Static indices	are	cache	friendly
indexing
most	searches
Daily	indices	are	a	good	start
2016.11.18 2016.11.19 2016.11.22 2016.11.23.	.	.
Indexing is	faster for	smaller	indices
Deletes are	cheap	
Search can	be	performed	on	indices	that	are	needed
Static indices	are	cache	friendly
indexing
most	searches
We	delete whole	indices
Daily	indices	are	sub-optimal
black	
friday
saturday
sunday
load
is	not
even
Size	based	indices	are	optimal
size	limit	for	indices
logs_01
indexing
around	5	– 10GB	per	shard	on	AWS
Size	based	indices	are	optimal
size	limit	for	indices
logs_01
indexing
around	5	– 10GB	per	shard	on	AWS
Size	based	indices	are	optimal
size	limit	for	indices
logs_01
indexing
logs_02
around	5	– 10GB	per	shard	on	AWS
Size	based	indices	are	optimal
size	limit	for	indices
logs_01
indexing
logs_02
around	5	– 10GB	per	shard	on	AWS
Size	based	indices	are	optimal
size	limit	for	indices
logs_01 logs_02
indexing
logs_N.	.	.
around	5	– 10GB	per	shard	on	AWS
Slice	using	size
Predictable searching	and	indexing	performance
Better indices	balancing
Fewer	shards
Easier handling of	spiky	loads
Less	costs	because	of	better hardware	utilization
Proper	Elasticsearch	configuration
Keep	index.refresh_interval at	maximum	possible	value
1	sec	->	100%,	5	sec	->	125%,	30	sec	-> 175%	
You	can	loosen up	merges
- possible	because	of	heavy	aggregation	use
- segments_per_tier ->	higher
- max_merge_at_once->	higher
- max_merged_segment ->	lower
All	prefixed	with	index.merge.policy
} higher	indexing	
throughput
Proper	Elasticsearch	configuration
Index only	needed	fields
Use	doc	values
Do	not	index	_source
Do	not	store	_all
Optimization	time
We	can	optimize data	nodes	for	time	based	data
client
client
client
data
data
data
data
data
data
master
master
master
ingest
ingest
ingest
Hot	– cold	architecture
ES	hot ES	cold ES	cold
-Dnode.attr.tag=hot -Dnode.attr.tag=cold -Dnode.attr.tag=cold
Hot	– cold	architecture
logs_2016.11.22
ES	hot ES	cold ES	cold
-Dnode.attr.tag=hot -Dnode.attr.tag=cold -Dnode.attr.tag=cold
curl	-XPUT	localhost:9200/logs_2016.11.22 -d	'{	
"settings"	:	{		
"index.routing.allocation.exclude.tag"	:	"cold",	
"index.routing.allocation.include.tag"	:	"hot"	
}
}'
Hot	– cold	architecture
logs_2016.11.22
ES	hot ES	cold ES	cold
indexing
Hot	– cold	architecture
logs_2016.11.22
logs_2016.11.23
ES	hot ES	cold ES	cold
indexing
Hot	– cold	architecture
logs_2016.11.22
logs_2016.11.23
ES	hot ES	cold ES	cold
indexing
move	index	after	day	ends
curl	-XPUT	localhost:9200/logs_2016.11.22/_settings	-d	'{
"index.routing.allocation.exclude.tag"	:	"hot",
"index.routing.allocation.include.tag”	:	"cold"
}'
Hot	– cold	architecture
logs_2016.11.23 logs_2016.11.22
ES	hot ES	cold ES	cold
indexing
Hot	– cold	architecture
logs_2016.11.23
logs_2016.11.24
logs_2016.11.22
ES	hot ES	cold ES	cold
indexing
Hot	– cold	architecture
logs_2016.11.23
logs_2016.11.24
logs_2016.11.22
ES	hot ES	cold ES	cold
indexing
move	index	after	day	ends
Hot	– cold	architecture
logs_2016.11.24 logs_2016.11.22 logs_2016.11.23
ES	hot ES	cold ES	cold
indexing
Hot	– cold	architecture
Hot	ES	Tier
Good	CPU
Lots	of	I/O
Cold	ES	Tier
Memory	bound
Decent	I/O
ES	cold
Cold	ES	Tier
Memory	bound
Decent	I/O
Hot	– cold	architecture	summary
ES	cold
Optimize	costs – different	hardware	for	different	tier
Performance – use	case	optimized	hardware
Isolation – long	running	searches	don’t	affect	indexing
Elasticsearch	client node	needs
client
client
client
data
data
data
data
data
data
master
master
master
ingest
ingest
ingest
Elasticsearch	client node	needs
No	data	=	no	IOPS
Large	query	throughput	=	high	CPU	usage
Lots	of	results	=	high	memory usage
Lots	of	concurrent	queries	=	higher	resources utilization
Elasticsearch	ingest node	needs
client
client
client
data
data
data
data
data
data
master
master
master
ingest
ingest
ingest
Elasticsearch	ingest	node	needs
No	data	=	no	IOPS
Large	index	throughput	=	high	CPU	&	memory	usage
Complicated	rules	=	high	CPU	usage
Larger	documents	=	more	resources utilization
Elasticsearch	master node	needs
client
client
client
data
data
data
data
data
data
master
master
master
ingest
ingest
ingest
Elasticsearch	ingest	node	needs
No	data	=	no	IOPS
Large	number	of	indices	=	high	CPU	&	memory	usage
Complicated	mappings	=	high	memory	usage
Daily	indices	=	spikes	in	resources utilization
Focus:	Centralized	Buffer
File Shipper
File Shipper
File Shipper
Centralized
Buffer
ES ES ES
ES ES ES
ES ES ES
data
Why	Apache	Kafka?
Fast &	easy	to	use
Easy	to	scale
Fault	tolerant	and	highly	available
Supports	streaming
Works	in	publish/subscribe mode
Kafka	architecture
ZooKeeper
ZooKeeper
ZooKeeper
Kafka
Kafka
KafkaKafka
Kafka	&	topics
security_logs access_logs
app1_logs app2_logs
Kafka	stores	data
in topics	
written	on	disk
Kafka	&	topics	&	partitions	&	replicas
logs
partition	2
logs
partition	1
logs
partition	3
logs
partition	4
logs		replica
partition	2
logs		replica
partition	1
logs		replica
partition	3
logs		replica
partition	4
Scaling	Kafka
logs
partition	1
Scaling	Kafka
logs
partition	1
logs
partition	2
logs
partition	3
logs
partition		4
Scaling	Kafka
logs
partition	1
logs
partition	2
logs
partition	3
logs
partition		4
logs
partition	5
logs
partition	6
logs
partition	7
logs
partition	8
logs
partition	9
logs
partition	10
logs
partition	11
logs
partition	12
logs
partition	13
logs
partition	14
logs
partition	15
logs
partition	16
Things	to	remember	when	using	Kafka
Scales by	adding more	partitions not	threads
The	more	IOPS the	better
Keep	the	#	of	consumers	equal	to	#	of	partitions
Replicas used	for	HA and	FT only
Offsets stored	per	consumer	– multiple	destinations
easily	possible
Focus:	Shipper
File Shipper
File Shipper
File Shipper
Centralized
Buffer
ES ES ES
ES ES ES
ES ES ES
data
What	about	the	shipper?
logs
Centralized
Buffer
Which	shipper	to	use?
Which	protocol should	be	used
What	about	the	buffering
Log	to	JSON or	parse and	how
Buffers
performance & availability
batches	&	threads when	central	buffer	is	gone
Buffer	types
Disk ||	memory ||	combined	hybrid approach
On	source	||	centralized
App
Buffer
App
Buffer
file	or	local	log	shipper
easy	scaling	– fewer	moving	parts
often	with	the	use	of	lightweight	shipper
App
App
Kafka /	Redis /	Logstash /	etc…
one	place	for	all	changes
extra	features	made	easy	(like	TTL)
ES
ES
Buffers	Summary
Simple Reliable
App
Buffer
App
Buffer
ES
App
App
ES
Protocols
UDP	– fast,	cool	for	the	application,	not	reliable
TCP – reliable	(almost) application	gets	ACK when	written to	buffer
Application level	ACKs	may	be	needed
HTTP
RELP
Beats
Kafka
Logstash,	rsyslog,	Fluentd
Logstash,	rsyslog
Logstash,	Filebeat
Logstash,	rsyslog,	Filebeat,	Fluentd
Choosing	the	shipper
application
rsyslog Elasticsearch
http
socket
memory	&	disk	
assisted	queues
Choosing	the	shipper
application
rsyslog Elasticsearch
http
socket
memory	&	disk	
assisted	queues
application
file
rsyslog
filebeat
consumer
What	about	OS?
Say	NO to	swap
Set	the	right	disk	scheduler
CFQ for	spinning	disks
deadline for	SSD
Use	proper	mount options	for	ext4
noatime
nodirtime
data=writeback,	nobarier
For	bare	metal
check	CPU	governor
disable	transparent	huge	pages
/proc/sys/vm/nr_hugepages=0
We	are	engineers!
We	develop DevOps	tools!
We	are	DevOps people!
We	do	fun	stuff	;)
http://sematext.com/jobs
Thank	you	for	listening!	Get	in	touch!
Rafał
rafal.kuc@sematext.com
@kucrafal
http://sematext.com
@sematext http://sematext.com/jobs
Come	talk	to	us
at	the	booth

Mais conteúdo relacionado

Mais procurados

Querying Data Pipeline with AWS Athena
Querying Data Pipeline with AWS AthenaQuerying Data Pipeline with AWS Athena
Querying Data Pipeline with AWS AthenaYaroslav Tkachenko
 
Deploy data analysis pipeline with mesos and docker
Deploy data analysis pipeline with mesos and dockerDeploy data analysis pipeline with mesos and docker
Deploy data analysis pipeline with mesos and dockerVu Nguyen Duy
 
Using Spark, Kafka, Cassandra and Akka on Mesos for Real-Time Personalization
Using Spark, Kafka, Cassandra and Akka on Mesos for Real-Time PersonalizationUsing Spark, Kafka, Cassandra and Akka on Mesos for Real-Time Personalization
Using Spark, Kafka, Cassandra and Akka on Mesos for Real-Time PersonalizationPatrick Di Loreto
 
A New Chapter of Data Processing with CDK
A New Chapter of Data Processing with CDKA New Chapter of Data Processing with CDK
A New Chapter of Data Processing with CDKShu-Jeng Hsieh
 
Scalable and Reliable Logging at Pinterest
Scalable and Reliable Logging at PinterestScalable and Reliable Logging at Pinterest
Scalable and Reliable Logging at PinterestKrishna Gade
 
Lambda Architecture Using SQL
Lambda Architecture Using SQLLambda Architecture Using SQL
Lambda Architecture Using SQLSATOSHI TAGOMORI
 
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...Databricks
 
Airstream: Spark Streaming At Airbnb
Airstream: Spark Streaming At AirbnbAirstream: Spark Streaming At Airbnb
Airstream: Spark Streaming At AirbnbJen Aman
 
The Data Mullet: From all SQL to No SQL back to Some SQL
The Data Mullet: From all SQL to No SQL back to Some SQLThe Data Mullet: From all SQL to No SQL back to Some SQL
The Data Mullet: From all SQL to No SQL back to Some SQLDatadog
 
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye ZhouMetrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye ZhouDatabricks
 
Top 5 mistakes when writing Streaming applications
Top 5 mistakes when writing Streaming applicationsTop 5 mistakes when writing Streaming applications
Top 5 mistakes when writing Streaming applicationshadooparchbook
 
DataEngConf SF16 - High cardinality time series search
DataEngConf SF16 - High cardinality time series searchDataEngConf SF16 - High cardinality time series search
DataEngConf SF16 - High cardinality time series searchHakka Labs
 
Spark Internals Training | Apache Spark | Spark | Anika Technologies
Spark Internals Training | Apache Spark | Spark | Anika TechnologiesSpark Internals Training | Apache Spark | Spark | Anika Technologies
Spark Internals Training | Apache Spark | Spark | Anika TechnologiesAnand Narayanan
 
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...Spark Summit
 
Using Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
Using Apache Spark in the Cloud—A Devops Perspective with Telmo OliveiraUsing Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
Using Apache Spark in the Cloud—A Devops Perspective with Telmo OliveiraSpark Summit
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakSpark Summit
 
Hadoop summit - Scaling Uber’s Real-Time Infra for Trillion Events per Day
Hadoop summit - Scaling Uber’s Real-Time Infra for  Trillion Events per DayHadoop summit - Scaling Uber’s Real-Time Infra for  Trillion Events per Day
Hadoop summit - Scaling Uber’s Real-Time Infra for Trillion Events per DayAnkur Bansal
 

Mais procurados (20)

Querying Data Pipeline with AWS Athena
Querying Data Pipeline with AWS AthenaQuerying Data Pipeline with AWS Athena
Querying Data Pipeline with AWS Athena
 
Big Data Tools in AWS
Big Data Tools in AWSBig Data Tools in AWS
Big Data Tools in AWS
 
Deploy data analysis pipeline with mesos and docker
Deploy data analysis pipeline with mesos and dockerDeploy data analysis pipeline with mesos and docker
Deploy data analysis pipeline with mesos and docker
 
Using Spark, Kafka, Cassandra and Akka on Mesos for Real-Time Personalization
Using Spark, Kafka, Cassandra and Akka on Mesos for Real-Time PersonalizationUsing Spark, Kafka, Cassandra and Akka on Mesos for Real-Time Personalization
Using Spark, Kafka, Cassandra and Akka on Mesos for Real-Time Personalization
 
A New Chapter of Data Processing with CDK
A New Chapter of Data Processing with CDKA New Chapter of Data Processing with CDK
A New Chapter of Data Processing with CDK
 
Scalable and Reliable Logging at Pinterest
Scalable and Reliable Logging at PinterestScalable and Reliable Logging at Pinterest
Scalable and Reliable Logging at Pinterest
 
Lambda Architecture Using SQL
Lambda Architecture Using SQLLambda Architecture Using SQL
Lambda Architecture Using SQL
 
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...
Accelerating Real Time Analytics with Spark Streaming and FPGAaaS with Prabha...
 
Airstream: Spark Streaming At Airbnb
Airstream: Spark Streaming At AirbnbAirstream: Spark Streaming At Airbnb
Airstream: Spark Streaming At Airbnb
 
Spark Working Environment in Windows OS
Spark Working Environment in Windows OSSpark Working Environment in Windows OS
Spark Working Environment in Windows OS
 
The Data Mullet: From all SQL to No SQL back to Some SQL
The Data Mullet: From all SQL to No SQL back to Some SQLThe Data Mullet: From all SQL to No SQL back to Some SQL
The Data Mullet: From all SQL to No SQL back to Some SQL
 
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye ZhouMetrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
Metrics-Driven Tuning of Apache Spark at Scale with Edwina Lu and Ye Zhou
 
Top 5 mistakes when writing Streaming applications
Top 5 mistakes when writing Streaming applicationsTop 5 mistakes when writing Streaming applications
Top 5 mistakes when writing Streaming applications
 
DataEngConf SF16 - High cardinality time series search
DataEngConf SF16 - High cardinality time series searchDataEngConf SF16 - High cardinality time series search
DataEngConf SF16 - High cardinality time series search
 
Spark Internals Training | Apache Spark | Spark | Anika Technologies
Spark Internals Training | Apache Spark | Spark | Anika TechnologiesSpark Internals Training | Apache Spark | Spark | Anika Technologies
Spark Internals Training | Apache Spark | Spark | Anika Technologies
 
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...
Spark-Streaming-as-a-Service with Kafka and YARN: Spark Summit East talk by J...
 
Using Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
Using Apache Spark in the Cloud—A Devops Perspective with Telmo OliveiraUsing Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
Using Apache Spark in the Cloud—A Devops Perspective with Telmo Oliveira
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
 
Hadoop summit - Scaling Uber’s Real-Time Infra for Trillion Events per Day
Hadoop summit - Scaling Uber’s Real-Time Infra for  Trillion Events per DayHadoop summit - Scaling Uber’s Real-Time Infra for  Trillion Events per Day
Hadoop summit - Scaling Uber’s Real-Time Infra for Trillion Events per Day
 
Lambda architecture
Lambda architectureLambda architecture
Lambda architecture
 

Semelhante a DOD 2016 - Rafał Kuć - Building a Resilient Log Aggregation Pipeline Using Elasticsearch and Kafka

Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudRick Bilodeau
 
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco IntercloudStreamsets Inc.
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackRich Lee
 
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Amazon Web Services
 
IBM Cloud Day January 2021 Data Lake Deep Dive
IBM Cloud Day January 2021 Data Lake Deep DiveIBM Cloud Day January 2021 Data Lake Deep Dive
IBM Cloud Day January 2021 Data Lake Deep DiveTorsten Steinbach
 
Big Telco Real-Time Network Analytics
Big Telco Real-Time Network AnalyticsBig Telco Real-Time Network Analytics
Big Telco Real-Time Network AnalyticsYousun Jeong
 
Big Telco - Yousun Jeong
Big Telco - Yousun JeongBig Telco - Yousun Jeong
Big Telco - Yousun JeongSpark Summit
 
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...HostedbyConfluent
 
Powering Interactive Data Analysis at Pinterest by Amazon Redshift
Powering Interactive Data Analysis at Pinterest by Amazon RedshiftPowering Interactive Data Analysis at Pinterest by Amazon Redshift
Powering Interactive Data Analysis at Pinterest by Amazon RedshiftJie Li
 
AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...
AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...
AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...Sungmin Kim
 
Spark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with SparkSpark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with SparkSpark Summit
 
Architecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemArchitecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemSingleStore
 
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceAmazon Web Services
 
Análisis del roadmap del Elastic Stack
Análisis del roadmap del Elastic StackAnálisis del roadmap del Elastic Stack
Análisis del roadmap del Elastic StackElasticsearch
 
Otimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
Otimizações de Projetos de Big Data, Dw e AI no Microsoft AzureOtimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
Otimizações de Projetos de Big Data, Dw e AI no Microsoft AzureLuan Moreno Medeiros Maciel
 
High performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyDataHigh performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyDataVMware Tanzu
 
High performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyDataHigh performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyDataCarlos Andrés García
 

Semelhante a DOD 2016 - Rafał Kuć - Building a Resilient Log Aggregation Pipeline Using Elasticsearch and Kafka (20)

Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets at Cisco Intercloud
 
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco IntercloudCase Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
Case Study: Elasticsearch Ingest Using StreamSets @ Cisco Intercloud
 
Centralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stackCentralized log-management-with-elastic-stack
Centralized log-management-with-elastic-stack
 
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
Building Data Warehouses and Data Lakes in the Cloud - DevDay Austin 2017 Day 2
 
IBM Cloud Day January 2021 Data Lake Deep Dive
IBM Cloud Day January 2021 Data Lake Deep DiveIBM Cloud Day January 2021 Data Lake Deep Dive
IBM Cloud Day January 2021 Data Lake Deep Dive
 
Big Telco Real-Time Network Analytics
Big Telco Real-Time Network AnalyticsBig Telco Real-Time Network Analytics
Big Telco Real-Time Network Analytics
 
Big Telco - Yousun Jeong
Big Telco - Yousun JeongBig Telco - Yousun Jeong
Big Telco - Yousun Jeong
 
Kafka & Hadoop in Rakuten
Kafka & Hadoop in RakutenKafka & Hadoop in Rakuten
Kafka & Hadoop in Rakuten
 
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
 
Powering Interactive Data Analysis at Pinterest by Amazon Redshift
Powering Interactive Data Analysis at Pinterest by Amazon RedshiftPowering Interactive Data Analysis at Pinterest by Amazon Redshift
Powering Interactive Data Analysis at Pinterest by Amazon Redshift
 
AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...
AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...
AWS Analytics Immersion Day - Build BI System from Scratch (Day1, Day2 Full V...
 
Spark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with SparkSpark and Couchbase: Augmenting the Operational Database with Spark
Spark and Couchbase: Augmenting the Operational Database with Spark
 
Architecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemArchitecting Data in the AWS Ecosystem
Architecting Data in the AWS Ecosystem
 
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA402 Deep Dive: Log Analytics with Amazon Elasticsearch Service
 
Amazon Kinesis
Amazon KinesisAmazon Kinesis
Amazon Kinesis
 
Análisis del roadmap del Elastic Stack
Análisis del roadmap del Elastic StackAnálisis del roadmap del Elastic Stack
Análisis del roadmap del Elastic Stack
 
Using Data Lakes
Using Data Lakes Using Data Lakes
Using Data Lakes
 
Otimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
Otimizações de Projetos de Big Data, Dw e AI no Microsoft AzureOtimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
Otimizações de Projetos de Big Data, Dw e AI no Microsoft Azure
 
High performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyDataHigh performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyData
 
High performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyDataHigh performance Spark distribution on PKS by SnappyData
High performance Spark distribution on PKS by SnappyData
 

Último

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 

Último (20)

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 

DOD 2016 - Rafał Kuć - Building a Resilient Log Aggregation Pipeline Using Elasticsearch and Kafka