© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. SUMMIT
Scalable, secure log analytics with Amazon ES
Carl Meadows
Principal Product Manager
AWS – Search Services
ADB302
What is Elasticsearch?
• Distributed search and analytics engine built on Apache Lucene
• Often deployed as part of the “ELK Stack”: Elasticsearch, Logstash & Kibana
• Easy ingestion and visualization
Source: TechCrunch survey of popular open source software from April ’17
Machine data driving Elasticsearch growth
Machine-generated data is growing 10x faster than business data… Logs, logs, and more logs
• IT & DevOps: databases, servers, storage, networking
• Increase in IoT and mobile devices: gaming, sensors, web content
• Cloud-based architectures
Source: insideBigData, “The Exponential Growth of Data,” February 16, 2017
How it works
1. Send data as JSON via REST APIs
2. Data is indexed, and all fields are searchable, including nested JSON
3. Queries, via REST APIs, allow fielded matching, Boolean expressions, sorting, and analysis
Diagram: application data and server, application, network, AWS, and other logs (1) flow into an Amazon Elasticsearch Service domain with index(es) (2), which is queried by application users, analysts, DevOps, and security (3)
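The three steps above can be sketched as request bodies. This is a minimal illustration, not the service's only calling convention: it assumes a typeless, 7.x-style `_bulk` action line (older Elasticsearch versions also expect a `_type`), and the field names (`status`, `@timestamp`, `request.path`) are hypothetical. Nothing is sent over the network; the functions only build the JSON you would POST to the domain endpoint.

```python
import json


def bulk_body(index, docs):
    """Build a newline-delimited _bulk payload: an action line, then the document itself."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The bulk API expects the payload to end with a newline.
    return "\n".join(lines) + "\n"


def search_body(field, value, sort_field):
    """Build a query that matches one field exactly and sorts newest first."""
    return {
        "query": {"bool": {"filter": [{"term": {field: value}}]}},
        "sort": [{sort_field: {"order": "desc"}}],
    }


# Hypothetical log record with nested JSON; nested fields are indexed too.
record = {
    "@timestamp": "2018-11-26T12:00:00Z",
    "status": 500,
    "request": {"method": "GET", "path": "/cart"},
}

payload = bulk_body("logs_11.26.2018", [record])
query = search_body("status", 500, "@timestamp")
```

The payload would go to the domain's `_bulk` endpoint and the query to `logs_*/_search`; both paths are standard Elasticsearch REST routes.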
Amazon Elasticsearch Service
Amazon Elasticsearch Service is a fully managed service that makes it easy to deploy, manage, and scale Elasticsearch and Kibana in the AWS Cloud.
Elasticsearch runs on a cluster of instances
Diagram: an Amazon ES domain in the AWS Cloud, shown with a VPC, data nodes, master nodes, and Elastic Load Balancing (ELB)
• AWS Management Console
• AWS Command Line Interface (AWS CLI)
• AWS tools and SDKs
• AWS CloudFormation
• AWS Identity and Access Management (IAM)
• AWS CloudTrail
• Amazon CloudWatch
• AWS Database Migration Service
• Amazon Kinesis Data Firehose
• Amazon CloudWatch Logs
Infrastructure monitoring: OS transport
• ELK: Elasticsearch, Logstash, and Kibana
• Beats: lightweight log shippers
• For high-volume writes, use a buffer—Redis in this case
Diagram: Amazon EC2 → Amazon ElastiCache for Redis → Logstash on Amazon EC2 → Amazon Elasticsearch Service
Infrastructure monitoring/SIEM: AWS-centric
• Employing streaming technologies at higher volumes
• Some customers use Amazon CloudWatch Logs
• Many customers use self-managed Kafka with Beats for shipping
Diagram: Amazon EC2/Kinesis Agent/Beats/etc.; Amazon Managed Streaming for Kafka; Amazon Kinesis Data Streams; Amazon EMR; Amazon Elasticsearch Service
Container monitoring
Health & performance monitoring
• App containers, pods, system, nodes
• Kubernetes events, unavailable pods
• Application logs & metrics
• System logs & metrics
• Cluster capacity, performance, network traffic
• Ad hoc analysis & troubleshooting
Diagram: Amazon Elastic Container Service for Kubernetes in a VPC, with Metricbeat (metrics) and Fluentd (logs), connected through elastic network interfaces to Amazon Elasticsearch Service in a VPC
Log analytics flow
• Data producers
• Collection
• Transformation
• Analysis and reporting
• Permanent cold storage
Collection on the hosts under study
Kinesis Agent
• Simple setup
• Tail log files
• Forward to Kinesis Data Streams or Kinesis Data Firehose
Kinesis Producer Library (KPL)
• More complex
• Transform or enrich data before shipping
• Send to Kinesis Data Streams
Beats
• Lightweight log shipper
• Send to Kinesis Data Firehose
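For the Kinesis Agent option, the "tail log files and forward" setup comes down to a small JSON configuration. The sketch below builds one flow of that configuration in Python. The `flows`, `filePattern`, `deliveryStream`, and `kinesisStream` keys are the agent's documented setting names; the file path and stream name are hypothetical placeholders.

```python
import json


def agent_config(file_pattern, stream_name, use_firehose=True):
    """Build a single-flow Kinesis Agent configuration.

    use_firehose=True targets a Kinesis Data Firehose delivery stream;
    False targets a Kinesis Data Streams stream instead.
    """
    target_key = "deliveryStream" if use_firehose else "kinesisStream"
    return {"flows": [{"filePattern": file_pattern, target_key: stream_name}]}


# Hypothetical path and stream name, for illustration only.
config = agent_config("/var/log/app/*.log", "app-logs-delivery")
config_text = json.dumps(config, indent=2)
```

The resulting text is what you would place in the agent's configuration file on each host.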
Create time-based indexes for log analytics
• Use a root string for index names (e.g. logs_).
• Depending on volume, rotate at regular intervals, normally daily.
• Daily indexes simplify index management. Delete the oldest index to create more space on your cluster.
logs_11.26.2018
logs_11.25.2018
logs_11.24.2018
logs_11.23.2018
logs_11.22.2018
logs_11.21.2018
logs_11.20.2018
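A minimal sketch of the daily rotation described above, assuming the `logs_` root and the MM.DD.YYYY suffix shown in the index names. The retention window (`keep_days`) is a hypothetical parameter; the function only decides which index names are old enough to delete and does not call the cluster.

```python
from datetime import date, datetime, timedelta

ROOT = "logs_"
FMT = "%m.%d.%Y"  # matches names such as logs_11.26.2018


def index_name(day, root=ROOT):
    """Name of the index that receives documents for the given day."""
    return root + day.strftime(FMT)


def index_day(name, root=ROOT):
    """Parse the date back out of an index name."""
    return datetime.strptime(name[len(root):], FMT).date()


def expired_indexes(names, today, keep_days, root=ROOT):
    """Return indexes older than the retention window, oldest first."""
    cutoff = today - timedelta(days=keep_days - 1)
    old = [n for n in names if n.startswith(root) and index_day(n, root) < cutoff]
    return sorted(old, key=lambda n: index_day(n, root))


names = [index_name(date(2018, 11, 20) + timedelta(days=i)) for i in range(7)]
to_delete = expired_indexes(names, today=date(2018, 11, 26), keep_days=5)
```

Each name in `to_delete` could then be removed with a DELETE request on that index, which frees the space in one operation rather than deleting documents one by one.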
You use the query APIs to retrieve data from Elasticsearch
Diagram: within the Amazon ES domain, the query engine produces matches, and scoring/sorting turns them into ranked results
The query engine matches requested field values
Diagram: a query for Field1:value1 and Field2:value2 runs against the logs_11.28.2018 index. Each field has its own index (F1 index, F2 index) listing its values (V1, V2, … Vn), and each document is an ID plus a set of Field: value pairs.
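The per-field lookup in this slide can be illustrated with a toy inverted index. This is a simplified model for intuition only, not how Lucene stores its data: each field maps each value to the set of document IDs containing it, and a query for several field values intersects those sets. The documents and field names are hypothetical.

```python
from collections import defaultdict


def build_field_indexes(docs):
    """Map field -> value -> set of document IDs that carry that value."""
    indexes = defaultdict(lambda: defaultdict(set))
    for doc_id, doc in docs.items():
        for field, value in doc.items():
            indexes[field][value].add(doc_id)
    return indexes


def match_all(indexes, **wanted):
    """Return IDs of documents matching every requested field value."""
    postings = [indexes[field].get(value, set()) for field, value in wanted.items()]
    return set.intersection(*postings) if postings else set()


docs = {
    1: {"status": 200, "method": "GET"},
    2: {"status": 500, "method": "GET"},
    3: {"status": 500, "method": "POST"},
}
indexes = build_field_indexes(docs)
hits = match_all(indexes, status=500, method="GET")
```

Because each lookup is a dictionary access followed by a set intersection, matching cost depends on the number of candidate documents rather than on scanning every stored record.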
You use aggregations to analyze log data
Diagram: within the Amazon ES domain, the query engine produces matches, which feed the analysis engine (aggregations)
• Histogram
• Numeric – sum, min., max.
• Terms – bucketing
• Nesting
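The four aggregation styles listed above can be combined in one request body. The sketch below nests a numeric `sum` inside a `terms` bucket inside a `date_histogram`. The field names are hypothetical, and the interval key is an assumption that depends on version: `fixed_interval` applies to Elasticsearch 7.2 and later, while earlier versions use `interval`, so it is left as a parameter.

```python
def hourly_status_aggregation(
    time_field="@timestamp",
    status_field="status",
    bytes_field="bytes",
    interval_key="fixed_interval",
):
    """Build a search body: hourly histogram -> status buckets -> bytes summed."""
    return {
        "size": 0,  # return only aggregation results, no individual hits
        "aggs": {
            "per_hour": {
                "date_histogram": {"field": time_field, interval_key: "1h"},
                "aggs": {
                    "by_status": {
                        "terms": {"field": status_field},
                        "aggs": {"total_bytes": {"sum": {"field": bytes_field}}},
                    }
                },
            }
        },
    }


body = hourly_status_aggregation()
legacy_body = hourly_status_aggregation(interval_key="interval")
```

Setting `size` to 0 keeps the response small, since a dashboard built on this query only needs the buckets.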
Kibana is a real-time visualization tool
Security
You interact with an endpoint – DNS
Amazon Cognito for authentication and IDPs
Amazon Virtual Private Cloud (Amazon VPC) for restricting to your IP address space
AWS Identity and Access Management (IAM) to control Elasticsearch and Amazon ES APIs
AWS Key Management Service (AWS KMS) for encryption at rest
TLS for encryption in flight
Instance storage
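As one illustration of the IAM item above, a resource-based access policy on the domain can limit which principal may call which Elasticsearch HTTP methods. The sketch below builds such a policy document. The account ID, role name, and domain name are hypothetical placeholders; `es:ESHttpGet` and `es:ESHttpPost` are IAM action names for the service's HTTP access.

```python
import json


def domain_access_policy(principal_arn, domain_arn, actions=("es:ESHttpGet", "es:ESHttpPost")):
    """Build a policy allowing one principal to use the listed HTTP methods on a domain."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": list(actions),
                "Resource": f"{domain_arn}/*",  # all indexes and APIs under the domain
            }
        ],
    }


# Hypothetical ARNs, for illustration only.
policy = domain_access_policy(
    "arn:aws:iam::123456789012:role/log-readers",
    "arn:aws:es:us-east-1:123456789012:domain/logs",
)
policy_text = json.dumps(policy, indent=2)
```

Requests from that principal would still need to be signed, and the other controls on the slide (VPC placement, encryption) are configured separately from this policy.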
Data is stored in indexes, distributed across shards
Diagram: all docs in an index are split across five shards, each holding 1/5 of the documents. Each document is an ID plus a set of Field: value pairs.
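A toy sketch of how documents end up spread across shards, assuming five primary shards as in the diagram. Elasticsearch hashes a routing value (the document ID by default) with Murmur3 and takes it modulo the number of primary shards. CRC32 is swapped in here only because it is in the Python standard library, so the shard numbers will not match a real cluster.

```python
import zlib
from collections import Counter


def shard_for(doc_id, primary_shards=5):
    """Pick a shard by hashing the document ID (CRC32 stand-in for Murmur3)."""
    return zlib.crc32(doc_id.encode("utf-8")) % primary_shards


def distribution(doc_ids, primary_shards=5):
    """Count how many documents land on each shard."""
    return Counter(shard_for(doc_id, primary_shards) for doc_id in doc_ids)


counts = distribution(f"doc-{i}" for i in range(1000))
```

Because the shard is a function of the ID and the primary shard count, the same document always routes to the same shard, which is also why the primary shard count is fixed when an index is created.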
Shards are primary or replica
Diagram: an index with primary shards and replica shards. Each document is an ID plus a set of Field: value pairs.
Elasticsearch distributes shards to data nodes
Queries
Updates
You set the basic parameters that control scale
• Instance type: deploy instances based on storage and compute needs
• Instance count: add instances for increased parallelism
• Storage: index data (primary and replica shards) is stored on disk
• Shard count: shards are the units of work and storage
Scaling
Storage needed = Source/day * 1.1 * 2 * 7 * 1.15
• For example: 1 TB daily of source data needs 17.7 TB of storage for 7 days
Number of CPUs ~= 1.6 * active shards
• For example: 4 data streams @ 1 TB daily means 40 total shards (20 primary and 20 replica) active, so make sure to have 64 total CPUs
• This is almost certainly wrong
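The two rules of thumb above, worked through in code. The slide does not label its multipliers, so the parameter names below are an interpretation rather than the author's wording: 1.1 as index overhead, 2 as primary plus one replica, 7 as retention days, and 1.15 as headroom. As the slide itself cautions, treat the output as a starting estimate.

```python
import math


def storage_needed_tb(
    source_tb_per_day,
    index_overhead=1.1,  # assumed meaning of the 1.1 factor
    copies=2,            # assumed: primary plus one replica
    retention_days=7,
    headroom=1.15,       # assumed meaning of the 1.15 factor
):
    """Storage needed = Source/day * 1.1 * 2 * 7 * 1.15."""
    return source_tb_per_day * index_overhead * copies * retention_days * headroom


def cpus_needed(active_shards, cpus_per_shard=1.6):
    """Number of CPUs ~= 1.6 * active shards, rounded up to whole CPUs."""
    # round() guards against floating-point noise before taking the ceiling.
    return math.ceil(round(active_shards * cpus_per_shard, 6))


storage = storage_needed_tb(1)  # 1 TB/day of source data
cpus = cpus_needed(40)          # 20 primary + 20 replica active shards
```

With these inputs the storage works out to about 17.71 TB and the CPU count to 64, matching the slide's examples.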
Collection and buffering limit concurrency
One Kinesis Data Firehose with a 5 MB buffer
• 5000 1 KB records per connection
• Number of connections driven by flush rate
Agents on the hosts connecting directly
• 5000 connections per 5000 1 KB records
• Buffering on the hosts helps, connections driven by host count
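The arithmetic behind the comparison above, as a small sketch. It assumes decimal units (5 MB = 5,000,000 bytes, 1 KB = 1,000 bytes) to match the slide's round figure of 5000 records, and it models an unbuffered agent as one record per connection.

```python
def records_per_flush(buffer_bytes=5_000_000, record_bytes=1_000):
    """How many fixed-size records fit in one buffer flush."""
    return buffer_bytes // record_bytes


def connections_needed(total_records, records_per_connection):
    """Connections required to deliver the records (ceiling division)."""
    return -(-total_records // records_per_connection)


batch = records_per_flush()                   # records carried by one buffered flush
buffered = connections_needed(5000, batch)    # via a 5 MB buffer
unbuffered = connections_needed(5000, 1)      # agents sending each record directly
```

For the same 5000 records, the buffered path needs a single connection while the unbuffered path needs 5000, which is the concurrency difference the slide points at.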
T-shirt sizes
T-shirt size  Data (per day)  Storage needed  Active shards (maximum)  Total shards (maximum)  Instances
XSmall        10 GB           177 GB          4                        300                     2x M5/R5.large data
Small         100 GB          1.7 TB          8                        600                     4x M5/R5.xlarge data
Medium        500 GB          8.5 TB          30                       3,000                   6x I3.2xlarge data
Large         1 TB            17.7 TB         60                       3,000                   6x I3.4xlarge data
XLarge        10 TB           177.1 TB        600                      5,000                   30x I3.8xlarge data
Huge          80 TB           1.288 PB        3,400                    25,000                  85x I3.16xlarge data
Amazon ES additional tasks
Deployment
• Deploy in minutes
• Scale seamlessly
• Node roles: data and master
• High availability
• Kibana included
Security
• Private networking
• Encryption at rest and in flight
• HIPAA and PCI compliance
Operations/governance
• Monitoring metrics via Amazon CloudWatch
• Audit via AWS CloudTrail
• One-click version upgrades
• One-click security patches
• Backups
Build actionable insights
Security information and event management (SIEM)
IoT & mobile
Application monitoring & root-cause analysis
Business and web analytics
Amazon ES empowers you with the data to understand and intelligently react to your business needs
• End-to-end visibility – Better understanding of customers’ behavior to improve user experience and react to demand
• Improve reliability – Increased operational efficiencies by identifying, solving, and preventing system failures in real time
• Faster time to value – Accelerate time to market with application delivery and performance monitoring
• Security – Improved business confidence with end-to-end monitoring of data, infrastructure, and transactions
Case study: Autodesk
Central log management system
https://www.youtube.com/watch?v=fSjAfp-uqSs
Challenge
Highly distributed organization with no consistent way to collect and measure metrics.
Small ops team.
Must integrate easily with other AWS services.
Scale – Accommodate current and future requirements.
Must be cost effective with no data lock-in.
TBs of log data to sift through to find and fix customer-impacting issues.
Solution
Unified log data management solution built on AWS. Single interface for log analytics across applications. Annotate log records to enable distributed tracing states.
Streaming application logs via Kinesis Data Firehose to Amazon S3/Amazon Athena and Amazon ES.
10 i3.4xlarge Amazon ES data nodes – 33 TB. Will grow to 110 TB.
Kibana, built into Amazon ES, for near-real-time analytics and dashboards.
Benefits
All managed services: “manage less to gain more.” Focus on developing awesome products.
Common vocabulary for diagnosing and solving problems. Eliminated silos.
Scalable and cost-effective: i3s delivering great value per TB.
Improving customer experience by reducing the time to find and fix customer issues.
Wrap-up
• Machine-generated data is growing rapidly, driven by DevOps, cloud infrastructure, and IoT
• Logs contain valuable insights: what your users are doing, whether you have bad actors, what’s happening on your devices
• Amazon ES enables ingesting and analyzing logs in real time to provide you with the data and insights you need
Thank you!
Carl Meadows
carlmead@amazon.com

Mais conteúdo relacionado

Mais procurados

Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...Amazon Web Services
 
Modernize your data warehouse with Amazon Redshift - ADB305 - New York AWS Su...
Modernize your data warehouse with Amazon Redshift - ADB305 - New York AWS Su...Modernize your data warehouse with Amazon Redshift - ADB305 - New York AWS Su...
Modernize your data warehouse with Amazon Redshift - ADB305 - New York AWS Su...Amazon Web Services
 
Running Amazon Elastic Compute Cloud (Amazon EC2) workloads at scale - CMP202...
Running Amazon Elastic Compute Cloud (Amazon EC2) workloads at scale - CMP202...Running Amazon Elastic Compute Cloud (Amazon EC2) workloads at scale - CMP202...
Running Amazon Elastic Compute Cloud (Amazon EC2) workloads at scale - CMP202...Amazon Web Services
 
What's new in Amazon Aurora - ADB207 - New York AWS Summit
What's new in Amazon Aurora - ADB207 - New York AWS SummitWhat's new in Amazon Aurora - ADB207 - New York AWS Summit
What's new in Amazon Aurora - ADB207 - New York AWS SummitAmazon Web Services
 
Increase the value of video with machine learning & AWS Media Services - SVC3...
Increase the value of video with machine learning & AWS Media Services - SVC3...Increase the value of video with machine learning & AWS Media Services - SVC3...
Increase the value of video with machine learning & AWS Media Services - SVC3...Amazon Web Services
 
Amazon SageMaker Build, Train and Deploy Your ML Models
Amazon SageMaker Build, Train and Deploy Your ML ModelsAmazon SageMaker Build, Train and Deploy Your ML Models
Amazon SageMaker Build, Train and Deploy Your ML ModelsAWS Riyadh User Group
 
Best-Practices-for-Running-Windows-Workloads-on-AWS
Best-Practices-for-Running-Windows-Workloads-on-AWSBest-Practices-for-Running-Windows-Workloads-on-AWS
Best-Practices-for-Running-Windows-Workloads-on-AWSAmazon Web Services
 
Analyzing and processing streaming data with Amazon EMR - ADB204 - New York A...
Analyzing and processing streaming data with Amazon EMR - ADB204 - New York A...Analyzing and processing streaming data with Amazon EMR - ADB204 - New York A...
Analyzing and processing streaming data with Amazon EMR - ADB204 - New York A...Amazon Web Services
 
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...AWS Summits
 
Resiliency-and-Availability-Design-Patterns-for-the-Cloud
Resiliency-and-Availability-Design-Patterns-for-the-CloudResiliency-and-Availability-Design-Patterns-for-the-Cloud
Resiliency-and-Availability-Design-Patterns-for-the-CloudAmazon Web Services
 
Soluzioni per la migrazione e gestione dei dati in Amazon Web Services
Soluzioni per la migrazione e gestione dei dati in Amazon Web ServicesSoluzioni per la migrazione e gestione dei dati in Amazon Web Services
Soluzioni per la migrazione e gestione dei dati in Amazon Web ServicesAmazon Web Services
 
What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...
What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...
What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...Amazon Web Services
 
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...Amazon Web Services
 
No Hassle NoSQL - Amazon DynamoDB & Amazon DocumentDB | AWS Summit Tel Aviv ...
 No Hassle NoSQL - Amazon DynamoDB & Amazon DocumentDB | AWS Summit Tel Aviv ... No Hassle NoSQL - Amazon DynamoDB & Amazon DocumentDB | AWS Summit Tel Aviv ...
No Hassle NoSQL - Amazon DynamoDB & Amazon DocumentDB | AWS Summit Tel Aviv ...AWS Summits
 
High-Performance-Computing-on-AWS-and-Industry-Simulation
High-Performance-Computing-on-AWS-and-Industry-SimulationHigh-Performance-Computing-on-AWS-and-Industry-Simulation
High-Performance-Computing-on-AWS-and-Industry-SimulationAmazon Web Services
 
Modernize your data warehouse with Amazon Redshift - ADB305 - Atlanta AWS Summit
Modernize your data warehouse with Amazon Redshift - ADB305 - Atlanta AWS SummitModernize your data warehouse with Amazon Redshift - ADB305 - Atlanta AWS Summit
Modernize your data warehouse with Amazon Redshift - ADB305 - Atlanta AWS SummitAmazon Web Services
 
Amazon EC2 A1 instances, powered by the AWS Graviton processor - CMP303 - San...
Amazon EC2 A1 instances, powered by the AWS Graviton processor - CMP303 - San...Amazon EC2 A1 instances, powered by the AWS Graviton processor - CMP303 - San...
Amazon EC2 A1 instances, powered by the AWS Graviton processor - CMP303 - San...Amazon Web Services
 
Introducing Open Distro for Elasticsearch - ADB201 - New York AWS Summit
Introducing Open Distro for Elasticsearch - ADB201 - New York AWS SummitIntroducing Open Distro for Elasticsearch - ADB201 - New York AWS Summit
Introducing Open Distro for Elasticsearch - ADB201 - New York AWS SummitAmazon Web Services
 
Ask me anything about building data lakes on AWS - ADB209 - New York AWS Summit
Ask me anything about building data lakes on AWS - ADB209 - New York AWS SummitAsk me anything about building data lakes on AWS - ADB209 - New York AWS Summit
Ask me anything about building data lakes on AWS - ADB209 - New York AWS SummitAmazon Web Services
 

Mais procurados (20)

Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...
 
Modernize your data warehouse with Amazon Redshift - ADB305 - New York AWS Su...
Modernize your data warehouse with Amazon Redshift - ADB305 - New York AWS Su...Modernize your data warehouse with Amazon Redshift - ADB305 - New York AWS Su...
Modernize your data warehouse with Amazon Redshift - ADB305 - New York AWS Su...
 
Running Amazon Elastic Compute Cloud (Amazon EC2) workloads at scale - CMP202...
Running Amazon Elastic Compute Cloud (Amazon EC2) workloads at scale - CMP202...Running Amazon Elastic Compute Cloud (Amazon EC2) workloads at scale - CMP202...
Running Amazon Elastic Compute Cloud (Amazon EC2) workloads at scale - CMP202...
 
What's new in Amazon Aurora - ADB207 - New York AWS Summit
What's new in Amazon Aurora - ADB207 - New York AWS SummitWhat's new in Amazon Aurora - ADB207 - New York AWS Summit
What's new in Amazon Aurora - ADB207 - New York AWS Summit
 
Increase the value of video with machine learning & AWS Media Services - SVC3...
Increase the value of video with machine learning & AWS Media Services - SVC3...Increase the value of video with machine learning & AWS Media Services - SVC3...
Increase the value of video with machine learning & AWS Media Services - SVC3...
 
Amazon SageMaker Build, Train and Deploy Your ML Models
Amazon SageMaker Build, Train and Deploy Your ML ModelsAmazon SageMaker Build, Train and Deploy Your ML Models
Amazon SageMaker Build, Train and Deploy Your ML Models
 
Best-Practices-for-Running-Windows-Workloads-on-AWS
Best-Practices-for-Running-Windows-Workloads-on-AWSBest-Practices-for-Running-Windows-Workloads-on-AWS
Best-Practices-for-Running-Windows-Workloads-on-AWS
 
Analyzing and processing streaming data with Amazon EMR - ADB204 - New York A...
Analyzing and processing streaming data with Amazon EMR - ADB204 - New York A...Analyzing and processing streaming data with Amazon EMR - ADB204 - New York A...
Analyzing and processing streaming data with Amazon EMR - ADB204 - New York A...
 
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
 
Resiliency-and-Availability-Design-Patterns-for-the-Cloud
Resiliency-and-Availability-Design-Patterns-for-the-CloudResiliency-and-Availability-Design-Patterns-for-the-Cloud
Resiliency-and-Availability-Design-Patterns-for-the-Cloud
 
Soluzioni per la migrazione e gestione dei dati in Amazon Web Services
Soluzioni per la migrazione e gestione dei dati in Amazon Web ServicesSoluzioni per la migrazione e gestione dei dati in Amazon Web Services
Soluzioni per la migrazione e gestione dei dati in Amazon Web Services
 
What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...
What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...
What’s new with Amazon Redshift, featuring ZS Associates - ADB205 - Chicago A...
 
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...
Optimize your workloads with Amazon EC2 and AMD EPYC - DEM03-SR - New York AW...
 
No Hassle NoSQL - Amazon DynamoDB & Amazon DocumentDB | AWS Summit Tel Aviv ...
 No Hassle NoSQL - Amazon DynamoDB & Amazon DocumentDB | AWS Summit Tel Aviv ... No Hassle NoSQL - Amazon DynamoDB & Amazon DocumentDB | AWS Summit Tel Aviv ...
No Hassle NoSQL - Amazon DynamoDB & Amazon DocumentDB | AWS Summit Tel Aviv ...
 
High-Performance-Computing-on-AWS-and-Industry-Simulation
High-Performance-Computing-on-AWS-and-Industry-SimulationHigh-Performance-Computing-on-AWS-and-Industry-Simulation
High-Performance-Computing-on-AWS-and-Industry-Simulation
 
Modernize your data warehouse with Amazon Redshift - ADB305 - Atlanta AWS Summit
Modernize your data warehouse with Amazon Redshift - ADB305 - Atlanta AWS SummitModernize your data warehouse with Amazon Redshift - ADB305 - Atlanta AWS Summit
Modernize your data warehouse with Amazon Redshift - ADB305 - Atlanta AWS Summit
 
Serverless_with_MongoDB
Serverless_with_MongoDBServerless_with_MongoDB
Serverless_with_MongoDB
 
Amazon EC2 A1 instances, powered by the AWS Graviton processor - CMP303 - San...
Amazon EC2 A1 instances, powered by the AWS Graviton processor - CMP303 - San...Amazon EC2 A1 instances, powered by the AWS Graviton processor - CMP303 - San...
Amazon EC2 A1 instances, powered by the AWS Graviton processor - CMP303 - San...
 
Introducing Open Distro for Elasticsearch - ADB201 - New York AWS Summit
Introducing Open Distro for Elasticsearch - ADB201 - New York AWS SummitIntroducing Open Distro for Elasticsearch - ADB201 - New York AWS Summit
Introducing Open Distro for Elasticsearch - ADB201 - New York AWS Summit
 
Ask me anything about building data lakes on AWS - ADB209 - New York AWS Summit
Ask me anything about building data lakes on AWS - ADB209 - New York AWS SummitAsk me anything about building data lakes on AWS - ADB209 - New York AWS Summit
Ask me anything about building data lakes on AWS - ADB209 - New York AWS Summit
 

Semelhante a AWS Elasticsearch Service for Log Analytics

Build your own log analytics solution on AWS - ADB301 - Atlanta AWS Summit
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS SummitBuild your own log analytics solution on AWS - ADB301 - Atlanta AWS Summit
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS SummitAmazon Web Services
 
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...Amazon Web Services
 
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...AWS Summits
 
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...Amazon Web Services
 
Stream processing and managing real-time data
Stream processing and managing real-time dataStream processing and managing real-time data
Stream processing and managing real-time dataAmazon Web Services
 
Cyber Data Lake: How CIS Analyzes Billions of Network Traffic Records per Day
Cyber Data Lake: How CIS Analyzes Billions of Network Traffic Records per DayCyber Data Lake: How CIS Analyzes Billions of Network Traffic Records per Day
Cyber Data Lake: How CIS Analyzes Billions of Network Traffic Records per DayAmazon Web Services
 
利用 Fargate - 無伺服器的容器環境建置高可用的系統
利用 Fargate - 無伺服器的容器環境建置高可用的系統利用 Fargate - 無伺服器的容器環境建置高可用的系統
利用 Fargate - 無伺服器的容器環境建置高可用的系統Amazon Web Services
 
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...Amazon Web Services
 
Everything You Need to Know About Big Data: From Architectural Principles to ...
Everything You Need to Know About Big Data: From Architectural Principles to ...Everything You Need to Know About Big Data: From Architectural Principles to ...
Everything You Need to Know About Big Data: From Architectural Principles to ...Amazon Web Services
 
Cutting to the chase for Machine Learning Analytics Ecosystem & AWS Lake Form...
Cutting to the chase for Machine Learning Analytics Ecosystem & AWS Lake Form...Cutting to the chase for Machine Learning Analytics Ecosystem & AWS Lake Form...
Cutting to the chase for Machine Learning Analytics Ecosystem & AWS Lake Form...AWS Riyadh User Group
 
Building Data Lakes for Analytics on AWS
Building Data Lakes for Analytics on AWSBuilding Data Lakes for Analytics on AWS
Building Data Lakes for Analytics on AWSAmazon Web Services
 
Log Analytics with AWS
Log Analytics with AWSLog Analytics with AWS
Log Analytics with AWSAWS Germany
 
AWS 2019 Taipei Summit - Building Serverless Analytics Platform on AWS
AWS 2019 Taipei Summit - Building Serverless Analytics Platform on AWSAWS 2019 Taipei Summit - Building Serverless Analytics Platform on AWS
AWS 2019 Taipei Summit - Building Serverless Analytics Platform on AWSSteven Hsieh
 
Building Data Lakes for Analytics on AWS - ADB201 - Anaheim AWS Summit
Building Data Lakes for Analytics on AWS - ADB201 - Anaheim AWS SummitBuilding Data Lakes for Analytics on AWS - ADB201 - Anaheim AWS Summit
Building Data Lakes for Analytics on AWS - ADB201 - Anaheim AWS SummitAmazon Web Services
 
Architetture per l'analisi di flussi di dati in tempo reale
Architetture per l'analisi di flussi di dati in tempo realeArchitetture per l'analisi di flussi di dati in tempo reale
Architetture per l'analisi di flussi di dati in tempo realeAmazon Web Services
 
Building-Serverless-Analytics-On-AWS
Building-Serverless-Analytics-On-AWSBuilding-Serverless-Analytics-On-AWS
Building-Serverless-Analytics-On-AWSAmazon Web Services
 
AWS Portfolio: highlight delle categorie di prodotti AWS con esempi
AWS Portfolio: highlight delle categorie di prodotti AWS con esempiAWS Portfolio: highlight delle categorie di prodotti AWS con esempi
AWS Portfolio: highlight delle categorie di prodotti AWS con esempiAmazon Web Services
 
Optimize data lakes with Amazon S3 - STG302 - Santa Clara AWS Summit
Optimize data lakes with Amazon S3 - STG302 - Santa Clara AWS SummitOptimize data lakes with Amazon S3 - STG302 - Santa Clara AWS Summit
Optimize data lakes with Amazon S3 - STG302 - Santa Clara AWS SummitAmazon Web Services
 
Keynote: Customer Journey with Streaming Data on AWS - Rahul Pathak, AWS
Keynote: Customer Journey with Streaming Data on AWS - Rahul Pathak, AWSKeynote: Customer Journey with Streaming Data on AWS - Rahul Pathak, AWS
Keynote: Customer Journey with Streaming Data on AWS - Rahul Pathak, AWSFlink Forward
 

Semelhante a AWS Elasticsearch Service for Log Analytics (20)

Build your own log analytics solution on AWS - ADB301 - Atlanta AWS Summit
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS SummitBuild your own log analytics solution on AWS - ADB301 - Atlanta AWS Summit
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS Summit
 
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
 
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
 
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
Need for Speed – Intro To Real-Time Data Streaming Analytics on AWS | AWS Sum...
 
Stream processing and managing real-time data
Stream processing and managing real-time dataStream processing and managing real-time data
Stream processing and managing real-time data
 
Cyber Data Lake: How CIS Analyzes Billions of Network Traffic Records per Day
AWS Elasticsearch Service for Log Analytics

  • 1. Scalable, secure log analytics with Amazon ES (ADB302)
      Carl Meadows, Principal Product Manager, AWS – Search Services
      AWS Summit. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 2. What is Elasticsearch?
      • Sometimes referred to as the “ELK Stack”: Elasticsearch, Logstash & Kibana
      • Distributed search and analytics engine built on Apache Lucene
      • Easy ingestion and visualization
      Source: TechCrunch survey of popular open source software from April ’17
  • 3. Machine data driving Elasticsearch growth
      Machine-generated data is growing 10x faster than business data: logs, logs, and more logs.
      • IT & DevOps: databases, servers, storage, networking
      • Increase in IoT and mobile devices: gaming, sensors, web content
      • Cloud-based architectures
      Source: insideBigData, “The Exponential Growth of Data,” February 16, 2017
  • 4. How it works
      Diagram: application data and server, application, network, AWS, and other logs (1) are sent to an Amazon Elasticsearch Service domain with index(es) (2), which is queried (3) by application users, analysts, DevOps, and security.
      1. Send data as JSON via REST APIs
      2. Data is indexed: all fields are searchable, including nested JSON
      3. Queries, via REST APIs, allow fielded matching, Boolean expressions, sorting, and analysis
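The three steps above can be sketched as plain request construction. This is a minimal illustration, not the deck's own code: the index name `logs_11.26.2018` follows the naming shown later in the deck, the field names (`status`, `host`, `@timestamp`) are hypothetical, and the sketch only builds the HTTP method, path, and JSON body without sending anything to a domain endpoint.

```python
import json


def build_index_request(index, document):
    """Step 1: a document is sent as JSON to the index's _doc endpoint."""
    return "POST", f"/{index}/_doc", json.dumps(document)


def build_search_request(index, field, value, must_not=None, sort_field="@timestamp"):
    """Step 3: fielded match, an optional Boolean exclusion, and a sort."""
    query = {"bool": {"must": [{"match": {field: value}}]}}
    if must_not:
        query["bool"]["must_not"] = [{"match": {k: v}} for k, v in must_not.items()]
    body = {"query": query, "sort": [{sort_field: {"order": "desc"}}]}
    return "GET", f"/{index}/_search", json.dumps(body)


# Hypothetical log record with a nested object (step 2: nested fields are indexed too).
log_line = {
    "@timestamp": "2018-11-26T12:00:00Z",
    "status": 500,
    "host": {"name": "web-1"},
}
index_request = build_index_request("logs_11.26.2018", log_line)
search_request = build_search_request(
    "logs_11.26.2018", "status", 500, must_not={"host.name": "web-2"}
)
```

A real client would send these requests to the domain endpoint over HTTPS, signed or authenticated as the domain's access policy requires.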
  • 5. Amazon Elasticsearch Service
      Amazon Elasticsearch Service is a fully managed service that makes it easy to deploy, manage, and scale Elasticsearch and Kibana in the AWS Cloud.
  • 6. Elasticsearch runs on a cluster of instances
      • Amazon ES domain (AWS Cloud, VPC): data nodes, master nodes
      • Services and tools shown around the domain: AWS Management Console, AWS Command Line Interface (AWS CLI), AWS tools and SDKs, AWS CloudFormation, AWS Identity and Access Management (IAM), Elastic Load Balancing (ELB), AWS CloudTrail, Amazon CloudWatch, AWS Database Migration Service, Amazon Kinesis Data Firehose, Amazon CloudWatch Logs
  • 7. Infrastructure monitoring: OS transport
      • ELK: Elasticsearch, Logstash, and Kibana
      • Beats: lightweight log shippers
      • For high-volume writes, use a buffer (Redis in this case)
      Diagram components: Amazon EC2, Amazon ElastiCache for Redis, Logstash on Amazon EC2, Amazon Elasticsearch Service
  • 8. Infrastructure monitoring/SIEM: AWS-centric
      • Employing streaming technologies at higher volumes
      • Some customers use Amazon CloudWatch Logs
      • Many customers use self-managed Kafka with Beats for shipping
      Diagram components: Amazon EC2 / Kinesis Agent / Beats / etc., Amazon Managed Streaming for Kafka, Amazon Kinesis Data Streams, Amazon EMR, Amazon Elasticsearch Service
  • 9. Container monitoring
      • App containers, pods, system, nodes
      • Kubernetes events, unavailable pods
      • Application logs & metrics
      • System logs & metrics
      • Cluster capacity, performance, network traffic
      • Ad hoc analysis & troubleshooting
      Diagram labels: Metrics, Logs, Health & performance monitoring
      Diagram components: a VPC with Amazon Elastic Container Service for Kubernetes, Metricbeat, and Fluentd; elastic network interfaces; a VPC with Amazon Elasticsearch Service
  • 10. Log analytics flow
      Stages shown: data producers, collection, permanent cold storage, transformation, analysis and reporting
  • 11. Collection on the hosts under study
      • Kinesis Agent
          – Simple setup
          – Tail log files
          – Forward to Kinesis Data Streams or Kinesis Data Firehose
      • KPL
          – More complex
          – Transform or enrich data before shipping
          – Send to Kinesis Data Streams
      • Beats
          – Lightweight log shipper
          – Send to Kinesis Data Firehose
  • 12. Create time-based indexes for log analytics
      • You use a root string (e.g. logs_).
      • Depending on volume, rotate at regular intervals, normally daily.
      • Daily indexes simplify index management. Delete the oldest index to create more space on your cluster.
      Example indexes: logs_11.26.2018, logs_11.25.2018, logs_11.24.2018, logs_11.23.2018, logs_11.22.2018, logs_11.21.2018, logs_11.20.2018
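The daily rotation described above can be sketched as follows. The `logs_` root and the `MM.DD.YYYY` suffix are taken from the slide; the function names and the 7-day retention default are assumptions made for illustration. Because this suffix does not sort chronologically as a string across month or year boundaries, the sketch parses each suffix back into a date before comparing.

```python
from datetime import date, datetime, timedelta

ROOT = "logs_"          # root string from the slide
SUFFIX_FORMAT = "%m.%d.%Y"  # matches names such as logs_11.26.2018


def index_name(day, root=ROOT):
    """Name of the daily index that holds logs for the given day."""
    return root + day.strftime(SUFFIX_FORMAT)


def index_day(name, root=ROOT):
    """Parse the date back out of a daily index name."""
    return datetime.strptime(name[len(root):], SUFFIX_FORMAT).date()


def expired_indexes(names, today, retention_days=7, root=ROOT):
    """Indexes older than the retention window, oldest first."""
    oldest_kept = today - timedelta(days=retention_days - 1)
    expired = [n for n in names if index_day(n, root) < oldest_kept]
    return sorted(expired, key=lambda n: index_day(n, root))


today = date(2018, 11, 26)
existing = [index_name(today - timedelta(days=i)) for i in range(9)]
to_delete = expired_indexes(existing, today)
```

With nine daily indexes and a 7-day window, `to_delete` holds the two oldest names; each would then be removed with a delete-index request against the domain.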
  • 13. You use the query APIs to retrieve data from Elasticsearch
      Diagram: inside the Amazon ES domain, the query engine produces matches, and scoring/sorting turns them into ranked results.
  • 14. The query engine matches requested field values
      Diagram: a query (Field1:value1 Field2:value2) against the logs_11.28.2018 index is looked up in per-field indexes (F1 index, F2 index), each listing values V1, V2, ..., Vn, to find matching documents (an ID plus Field: value pairs).
  • 15. You use aggregations to analyze log data
      Diagram: inside the Amazon ES domain, the query engine produces matches, which feed the analysis engine (aggregations).
      • Histogram
      • Numeric: sum, min., max.
      • Terms: bucketing
      • Nesting
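The four aggregation kinds listed above can be combined in one search body. This is a hedged sketch: the field names (`latency_ms`, `status`, `bytes`) and aggregation labels are hypothetical, and only the request body is built here. It nests a terms aggregation and three numeric aggregations inside a histogram.

```python
import json


def build_aggregation_body(histogram_field, interval, terms_field, numeric_field):
    """Histogram buckets, each split by term, each with sum/min/max of a numeric field."""
    return {
        "size": 0,  # return only aggregation results, no matching documents
        "aggs": {
            "by_range": {
                "histogram": {"field": histogram_field, "interval": interval},
                "aggs": {
                    "by_term": {
                        "terms": {"field": terms_field},
                        "aggs": {
                            "total": {"sum": {"field": numeric_field}},
                            "smallest": {"min": {"field": numeric_field}},
                            "largest": {"max": {"field": numeric_field}},
                        },
                    }
                },
            }
        },
    }


agg_body = build_aggregation_body("latency_ms", 100, "status", "bytes")
agg_json = json.dumps(agg_body)
```

The body would be sent to an index's `_search` endpoint; the response then carries one bucket per latency range, each broken down by status value.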
  • 16. Kibana is a real-time visualization tool
  • 17. Security
      • You interact with an endpoint (DNS)
      • Amazon Cognito for authentication and IDPs
      • Amazon Virtual Private Cloud (Amazon VPC) for restricting to your IP address space
      • AWS Identity and Access Management (IAM) to control Elasticsearch and Amazon ES APIs
      • AWS Key Management Service (AWS KMS) for encryption at rest
      • TLS for encryption in flight
      • Instance storage
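As one illustration of the IAM item above, a resource-based access policy for a domain might look like the sketch below. The account ID, role name, Region, and domain name are placeholders, and the choice of read-only actions is an assumption made for the example, not something the deck prescribes.

```python
import json


def read_only_access_policy(domain_arn, principal_arn):
    """Allow one IAM principal to issue HTTP GET and HEAD requests to a domain."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": ["es:ESHttpGet", "es:ESHttpHead"],
                "Resource": f"{domain_arn}/*",
            }
        ],
    }


# Placeholder ARNs, for illustration only.
policy = read_only_access_policy(
    "arn:aws:es:us-east-1:123456789012:domain/logs",
    "arn:aws:iam::123456789012:role/log-readers",
)
policy_json = json.dumps(policy, indent=2)
```

Search requests that carry a body are often sent with POST, so a real read-only policy may need a broader or differently scoped action list.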
  • 18. Data is stored in indexes, distributed across shards
      Diagram: all docs in an index (each an ID plus Field: value pairs) are split across five shards, each holding 1/5 of the documents.
  • 19. Shards are primary or replica
      Diagram: an index of documents (each an ID plus Field: value pairs) with primary shards and replica shards.
  • 20. Elasticsearch distributes shards to data nodes
      Diagram labels: Queries, Updates
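The shard slides above can be illustrated with a toy model. This is a simplification under stated assumptions, not Elasticsearch's actual implementation: it uses CRC32 as a stand-in hash (Elasticsearch uses its own routing hash), a hypothetical round-robin placement, and one replica per primary. It shows the two ideas from the slides: each document maps to exactly one primary shard, and a replica sits on a different node from its primary.

```python
import zlib


def shard_for(doc_id, num_primary_shards=5):
    """Map a document ID to one primary shard (stand-in hash, not the real one)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


def place_shards(num_primary_shards, num_nodes):
    """Round-robin placement: primary i on one node, its replica on the next node."""
    if num_nodes < 2:
        raise ValueError("need at least two data nodes to separate primary and replica")
    placement = {node: [] for node in range(num_nodes)}
    for shard in range(num_primary_shards):
        primary_node = shard % num_nodes
        replica_node = (shard + 1) % num_nodes
        placement[primary_node].append(("primary", shard))
        placement[replica_node].append(("replica", shard))
    return placement


placement = place_shards(num_primary_shards=5, num_nodes=3)
doc_shards = {doc_id: shard_for(doc_id) for doc_id in ("a1", "b2", "c3")}
```

In this model, losing any single node still leaves one copy of every shard on the remaining nodes.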
  • 21. You set the basic parameters that control scale
      • Instance type: deploy instances based on storage and compute needs
      • Instance count: add instances for increased parallelism
      • Storage: index data (primary and replica shards) is stored on disk
      • Shard count: shards are the units of work and storage
  • 22. Scaling
      • Storage needed = Source/day * 1.1 * 2 * 7 * 1.15
          – For example: 1 TB daily of source data needs about 17.7 TB of storage for 7 days
      • Number of CPUs ~= 1.6 * active shards
          – For example: 4 data streams @ 1 TB daily means 40 total shards (20 primary and 20 replica) active, so make sure to have 64 total CPUs
      • This is almost certainly wrong
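The two rules of thumb above work out as follows. The numeric factors come from the slide; the parameter names are my reading of them and are not stated in the deck (1.1 as indexing overhead, 2 as primary plus one replica, 7 as retention days, 1.15 as extra headroom). The 5 primaries per stream is inferred from the slide's 20 primaries across 4 streams.

```python
def storage_needed_tb(source_tb_per_day, index_overhead=1.1, copies=2,
                      retention_days=7, headroom=1.15):
    """Storage needed = Source/day * 1.1 * 2 * 7 * 1.15 (factor names are assumptions)."""
    return source_tb_per_day * index_overhead * copies * retention_days * headroom


def active_shards(streams, primaries_per_stream=5, replicas=1):
    """Active shards across all streams, counting primaries and replicas."""
    return streams * primaries_per_stream * (1 + replicas)


def cpus_needed(shards, cpus_per_shard=1.6):
    """Number of CPUs ~= 1.6 * active shards."""
    return cpus_per_shard * shards


storage = storage_needed_tb(1.0)   # 1 TB/day of source data
shards = active_shards(4)          # 4 data streams
cpus = cpus_needed(shards)
print(f"{storage:.2f} TB, {shards} active shards, {cpus:.0f} CPUs")
```

For the slide's inputs this gives roughly 17.71 TB, 40 active shards, and 64 CPUs, matching the worked examples on the slide.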
  • 23. Collection and buffering limit concurrency
      • One Kinesis Data Firehose with a 5 MB buffer
          – 5000 1K records per connection
          – Number of connections driven by flush rate
      • Agents on the hosts connecting directly
          – 5000 connections per 5000 1K records
          – Buffering on the hosts helps, connections driven by host count
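The arithmetic behind the two cases above can be sketched in a few lines. Decimal units (1 MB = 1,000,000 bytes, 1K = 1,000 bytes) are assumed so the numbers line up with the slide's round figures; the function names are illustrative.

```python
def records_per_flush(buffer_bytes, record_bytes):
    """How many whole records fit in one buffer flush."""
    return buffer_bytes // record_bytes


def connections_needed(num_records, records_per_connection):
    """Connections required to deliver the records (ceiling division)."""
    return -(-num_records // records_per_connection)


per_flush = records_per_flush(5_000_000, 1_000)       # 5 MB buffer, 1K records
buffered = connections_needed(5_000, per_flush)        # one buffered delivery stream
unbuffered = connections_needed(5_000, 1)              # agents sending each record directly
```

Buffering collapses 5000 records into a single connection, while unbuffered agents open one connection per record, which is the concurrency pressure the slide warns about.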
  • 24. T-shirt sizes

      T-shirt size   Data (per day)   Storage needed   Active shards (max)   Total shards (max)   Instances
      XSmall         10 GB            177 GB           4                     300                  2x M5/R5.large data
      Small          100 GB           1.7 TB           8                     600                  4x M5/R5.xlarge data
      Medium         500 GB           8.5 TB           30                    3,000                6x I3.2xlarge data
      Large          1 TB             17.7 TB          60                    3,000                6x I3.4xlarge data
      XLarge         10 TB            177.1 TB         600                   5,000                30x I3.8xlarge data
      Huge           80 TB            1.288 PB         3,400                 25,000               85x I3.16xlarge data
  • 25. Amazon ES additional tasks
      • Deployment
          – Deploy in minutes
          – Scale seamlessly
          – Node roles: data and master
          – High availability
          – Kibana included
      • Security
          – Private networking
          – Encryption at rest and in flight
          – HIPAA and PCI compliance
      • Operations/governance
          – Monitoring metrics via Amazon CloudWatch
          – Audit via AWS CloudTrail
          – One-click version upgrades
          – One-click security patches
          – Backups
  • 26. Build actionable insights
      Amazon ES empowers you with the data to understand and intelligently react to your business needs.
      Use cases: security information and event management (SIEM), IoT & mobile, application monitoring & root-cause analysis, business and web analytics
      • End-to-end visibility – Better understanding of customers’ behavior to improve user experience and react to demand
      • Improve reliability – Increased operational efficiencies by identifying, solving, and preventing system failures in real time
      • Faster time to value – Accelerate time to market with application delivery and performance monitoring
      • Security – Improved business confidence with end-to-end monitoring of data, infrastructure, and transactions
  • 27. Case study: Autodesk
      Central log management system
      https://www.youtube.com/watch?v=fSjAfp-uqSs
      • Challenge
          – Highly distributed organization: no consistent way to collect and measure metrics.
          – Small ops team. Must integrate easily with other AWS services.
          – Scale: accommodate current and future requirements.
          – Must be cost effective with no data lock-in.
          – TBs of log data to sift through to find and fix customer-impacting issues.
      • Solution
          – Unified log data management solution built on AWS.
          – Single interface for log analytics across applications.
          – Annotate log records to enable distributed tracing states.
          – Streaming application logs via Kinesis Data Firehose to Amazon S3/Amazon Athena and Amazon ES
          – 10 i3.4xlarge Amazon ES data nodes: 33 TB. Will grow to 110 TB
          – Kibana, built in within Amazon ES, for near-real-time analytics and dashboards
      • Benefits
          – All managed services: “manage less to gain more.” Focus on developing awesome products.
          – Common vocabulary for diagnosing and solving problems. Eliminated silos.
          – Scalable and cost-effective: i3s delivering great value per TB.
          – Improving customer experience by reducing the time to find and fix customer issues.
  • 28. Wrap-up
      • Machine-generated data is growing rapidly, driven by DevOps, cloud infrastructure, and IoT
      • Logs contain valuable insights: what your users are doing, whether you have bad actors, what’s happening on your devices
      • Amazon ES enables ingesting and analyzing logs in real time to provide you with the data and insights you need
  • 29. Thank you!
      Carl Meadows, carlmead@amazon.com