SlideShare uma empresa Scribd logo
1 de 46
Log Analytics with AWS
Lex Crosett
crosettl@amazon.com
Solutions Architect
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved
David Simcik
simcik@amazon.com
Solutions Architect
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Your pager goes off
at 3:00am.
The servers are
melting!
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
You VPN in and quick check the logs
Uh oh
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
If you had Amazon Elasticsearch Service
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Transition from IT
to DevOps
Increase in IoT and
Mobile Devices
Cloud-based
architectures
Machine-generated data is growing 10x faster than business data
Source: insideBigData - The Exponential Growth of Data, February 16, 2018
THE EXPLOSION OF MACHINE-GENERATED DATA
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon Elasticsearch Service is a
fully managed service that makes
it easy to deploy, manage, and
scale Elasticsearch and Kibana
A M A Z O N E L A S T IC S E AR C H S E R V IC E
+
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
B E N E F I T S OF AMAZON EL AS T I C S EAR C H S ER V I C E
Supports Open-Source
APIs and Tools
Drop-in replacement with no
need to learn new APIs or skills
Easy to Use
Deploy a production-ready
Elasticsearch cluster in
minutes
Scalable
Resize your cluster with a few
clicks or a single API call
Secure
Deploy into your VPC and restrict
access using security groups and
IAM policies
Highly Available
Replicate across Availability
Zones, with monitoring and
automated self-healing
Tightly Integrated with
Other AWS Services
Seamless data ingestion, security,
auditing and orchestration
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
ELASTICSEARCH LEADING USE CASES
Application Monitoring &
Root-cause Analysis
Security Information and Event
Management (SIEM)
IoT & Mobile Business & Clickstream Analytics
Provides developers with a high
performance, self-service operational
monitoring and analytics platform
Enables security practitioners to centralize
and analyze events from across the entire
organization
Gives developers and lines of business
users real-time location-aware insights
into their device fleets
Provides business users with a real-time view
of the performance of their web content and
e-commerce platforms
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Application Monitoring & Root-cause Analysis
C A S E S T U D Y : EXPEDI A
Logs, lots and lots of logs. How to cost effectively monitor logs?
Require centralized logging infrastructure
Did not have the man power to manage infrastructure
P R O B L E M
Quick insights: Able to identify and troubleshoot issues in real-time
Secure: Integrated w/ AWS IAM
Scalable: Cluster sizes are able to grow to accommodate additional log sources
B E N E F I T S
Streaming AWS CloudTrail logs, application logs, and Docker
startup logs to Elasticsearch
Created centralized logging service for all team members
Using Kibana for visualizations and for Elasticsearch queries
S O L U T I O N
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Business and Clickstream Analytics
C A S E S T U D Y : FI NANC I AL T I M ES
What stories do our readers care about? What’s hot?
Required a custom clickstream analytics solution
Need a solution that delivers analytics in real-time
Did not have a team to manage analytics infrastructure
P R O B L E M
B E N E F I T S
Streaming user data to Amazon ES for analysis. Created their own
custom dashboards for editors and journalists – Lantern.
Lantern - ”shines a light” on reader activity for the editors and
journalists at the FT
Critical tool for making editorial decisions. Daily editorial meetings
start by looking at Lantern dashboard
S O L U T I O N
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Reliability : Lantern is used throughout the day by journalists and editors. Relying on Amazon to manage
their systems for maximum uptime.
Cost savings: Able to easily tune their cluster to meet their needs with minimal management overhead
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Software & Internet Financial ServicesEducation Technology BioTech and Pharma
Media and Entertainment Social Media Telecommunications Travel & Transportation
Real Estate Logistics & Operations Publishing Other
AMAZO N EL AS T I C S EAR C H S ER V I C E C US T O MER S
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Elasticsearch is a distributed solution
You run
clusters of
instances
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Service architecture
AWS SDK
AWS CLI
AWS CloudFormation
Elasticsearch
data nodes
Elasticsearch
master nodes
Elastic Load
Balancing
IAM
Amazon
CloudWatch
AWS
CloudTrail
Amazon Elasticsearch Service domain
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
How many instances?
• The index size will be about the same as the
corpus of source documents
• Double this if you are deploying an index replica
• Size based on storage requirements
• Either local storage or up to 1.5 TB of Amazon
Elastic Block Store (EBS) per instance
Example: a 2 TB corpus will need 4 instances
Assuming a replica and using EBS
Given 1.5 TB of storage per instance, this gives 6TB of storage
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Which instance type?
Instance Max Storage* Workload
T2 3.5TB You want to do dev/QA
M3, M4 150TB Your data and queries are “average”
R3, R4 150TB You have higher request volumes, larger
documents, or are using aggregations heavily
C4 150TB You need to support high concurrency
I2, I3 1.5 PB You have XL storage needs
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
How you shard is
how you scale
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Elasticsearch documents: structured JSON
{ "verb": "GET",
"ident": "-",
"bytes": 6245,
"@timestamp": "1995-07-01T00:00:01",
"request":
"GET /history/apollo/ HTTP/1.0",
"host": "199.72.81.55",
"authuser": "-",
"@timestamp_utc":
"1995-07-01T04:00:01+00:00",
"timezone": "-0400",
"response": 200 }
• Documents contain fields –
name/value pairs
• Fields can nest
• Value types include text,
numerics, dates, and geo objects
• Field values can be single or array
• When you send documents to
Elasticsearch they should arrive
as JSON
199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Documents are the core entity
ID
Field: value
Field: value
Field: value
Field: value
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon ES stores documents in an index
Amazon Elasticsearch Service domain
ID
Field: value
Field: value
Field: value
Field: value
_bulk API
Index / Type
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Indexes have primary shards
Amazon Elasticsearch Service domain
ID
Field: value
Field: value
Field: value
Field: value
_bulk API
Index / Type
Shards
• Shards are the units of
storage and compute in
an Amazon ES domain
• Each shard’s set of
documents is distinct
• The primary shard count
is fixed at creation time
• Shards are instances of
Apache Lucene
• Lucene creates an index
for each field in each
document
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Indexes have replica shards
Amazon Elasticsearch Service domain
ID
Field: value
Field: value
Field: value
Field: value
_bulk API
Index / Type
Shards
• Replica shards are
distributed copies of
their primaries
• The replica shard count
is dynamic
• Replica shards provide
data redundancy and
parallelism
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Shards are distributed across instances
Amazon Elasticsearch Service domain
ID
Field: value
Field: value
Field: value
Field: value
_bulk API
Index / Type
Shards
Instances
• Primary and replica
are distributed to
different instances
• Distribution is initially
by shard count
• Storage available
limits new shard
allocation as well
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Shards (Lucene) store indexes on disk
Amazon Elasticsearch Service domain
ID
Field: value
Field: value
Field: value
Field: value
_bulk API
Index / Type
Shards
Instances Storage
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
How many shards?
• Best practice: shards should be < 50GB
• Divide index size by ~40GB to get initial
shard count
• Active shards per instance ~= vCPUs
• Always use at least 1 replica for production!
Example: 2 TB corpus will need 50 primary shards
2,000 GB / 40GB per shard = 50 shards
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Data pattern
Amazon ES cluster
logs_01.21.2018
logs_01.22.2018
logs_01.23.2018
logs_01.24.2018
logs_01.25.2018
logs_01.26.2018
logs_01.27.2018
Shard 1
Shard 2
Shard 3
host
request
verb
status
timestamp
etc.
Each index has
multiple shards
Each shard contains
a set of documents
Each document contains
a set of fields and values
One index per day
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Load data into
Amazon ES
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
The components of ingestion
Data source Collect Transform Buffer Deliver
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS-Centric solutions
Amazon Kinesis
Firehose
Amazon CloudWatch Logs
Logstash
AWS IoT
Elasticsearch
data nodes
Elasticsearch
master nodes
Kibana
Data Producers Buffer/
Transform/
Deliver
Amazon ES Cluster Analytics UI
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon Lambda architectures
Files
S3 Events
Amazon S3 AWS Lambda
Function
DynamoDB
streams
Amazon
DynamoDB
Table
AWS Lambda
Function
Amazon Elasticsearch Service
Amazon KinesisData
Producers
AWS Lambda
Function
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Kinesis Firehose delivery architecture
• For public access
domains
• Easily transform
data
• Serverless with
built-in batching,
index rollover, error
handling
S3 bucket
source records
data source
source records
Amazon Elasticsearch
Service
Firehose
delivery stream
transformed
records
delivery failure
Data transformation
function
transformation failure
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Secure your data
• Publ i c – IAM
• Pri vate – end p o i nt
• E ncrypti on
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Use IAM for public endpoints
{
"Version": "2012-10-17",
"Statement": [ {
"Effect":...
"Principal": ...
"Action": [...],
"Resource": ...,
"Condition": ...
} ]
}
• Effect: Allow or Deny
• Principal: AWS account ID
• Action
• HTTP verbs
• Service actions
• Resource: Amazon ES
domain/index
• Condition: IP Address
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Use security groups for private endpoints
• Private networking between
your VPC and Amazon
Elasticsearch Service
• Traffic does not
traverse the public
internet
• Use IAM security groups for
authentication and access
control
VPC subnet
security group
VPC subnet
security group
Amazon Elasticsearch Service
Data Master
Data
Master
IAM
IAM
Availability
Zone B
Availability
Zone A
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Encrypt your data
Elasticsearch
data nodes
Elasticsearch
master nodes
AWSKMS
• Use your own KMS keys
• Encrypted data at rest on
Amazon ES instances
• Encrypted automatic
snapshots
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Make your domain
more stable
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Use dedicated masters
Amazon ES cluster
Dedicated master nodes:
cluster state Data nodes: queries and updates
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Master node recommendations
Number of data nodes Master node instance type
< 10 m3.medium+
< 20 m4.large+
<= 50 c4.xlarge+
50-100 c4.2xlarge+
• In production, always use an odd number of masters, >= 3
• Use fewer, smaller than data nodes
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Use zone awareness
Amazon ES cluster
Availability Zone 1 Availability Zone 2
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
CloudWatch Alarms
Name Metric Threshold Periods
ClusterStatus.red Maximum >= 1 1
ClusterIndexWritesBlocked Maximum >= 1 1
CPUUtilization/MasterCPUUtilization Average >= 80% 3
JVMMemoryPressure/Master... Maximum >= 80% 3
FreeStorageSpace Minimum <= (25% of
avail space)
1
AutomatedSnapshotFailure Maximum >= 1 1
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Slow Logs
• Easy console set up
• Integrated with CloudWatch Logs
• Set thresholds to receive log
events corresponding to slow
queries and slow indexing
• index.search.slowlog.threshold.query.warn
• index.search.slowlog.threshold.query.info
• index.search.slowlog.threshold.query.debug
• index.search.slowlog.threshold.query.trace
• index.search.slowlog.threshold.fetch.warn
• index.search.slowlog.threshold.fetch.info
• index.search.slowlog.threshold.fetch.debug
• index.search.slowlog.threshold.fetch.trace
• index.indexing.slowlog.threshold.index.warn
• index.indexing.slowlog.threshold.index.info
• index.indexing.slowlog.threshold.index.debug
• index.indexing.slowlog.threshold.index.trace
• index.indexing.slowlog.level: trace
• index.indexing.slowlog.source: 255
Amazon ES Domain CloudWatch Logs
Queries and
Updates
Slow query logs
Slow indexing
logs
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Elasticsearch
analysis is
delivered by
Aggregations
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Think buckets and metrics
Buckets – a collection of documents meeting some criterion
Metrics – calculations on the content of buckets
Bucket: time
Metric:count
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Dig in to your data in real time
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Demo
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Run Elasticsearch in the AWS Cloud with
Amazon Elasticsearch Service
Deploy, scale, ingest, secure, monitor, and
analyze
Start sending your log data today!Amazon
Elasticsearch
Service
Questions?

Mais conteúdo relacionado

Mais procurados

What Can Your Logs Tell You? (ANT215) - AWS re:Invent 2018
What Can Your Logs Tell You? (ANT215) - AWS re:Invent 2018What Can Your Logs Tell You? (ANT215) - AWS re:Invent 2018
What Can Your Logs Tell You? (ANT215) - AWS re:Invent 2018Amazon Web Services
 
Securing Your Big Data Workload - AWS Summit Sydney 2018
Securing Your Big Data Workload - AWS Summit Sydney 2018Securing Your Big Data Workload - AWS Summit Sydney 2018
Securing Your Big Data Workload - AWS Summit Sydney 2018Amazon Web Services
 
Building a Modern Data Platform in the Cloud
Building a Modern Data Platform in the CloudBuilding a Modern Data Platform in the Cloud
Building a Modern Data Platform in the CloudAmazon Web Services
 
Hybrid Cloud Processing & Data Distribution with File Gateway & Amazon S3 (ST...
Hybrid Cloud Processing & Data Distribution with File Gateway & Amazon S3 (ST...Hybrid Cloud Processing & Data Distribution with File Gateway & Amazon S3 (ST...
Hybrid Cloud Processing & Data Distribution with File Gateway & Amazon S3 (ST...Amazon Web Services
 
Build Data Engineering Platforms with Amazon EMR (ANT204) - AWS re:Invent 2018
Build Data Engineering Platforms with Amazon EMR (ANT204) - AWS re:Invent 2018Build Data Engineering Platforms with Amazon EMR (ANT204) - AWS re:Invent 2018
Build Data Engineering Platforms with Amazon EMR (ANT204) - AWS re:Invent 2018Amazon Web Services
 
Implementing Multi-Region AWS IoT, ft. Analog Devices (IOT401) - AWS re:Inven...
Implementing Multi-Region AWS IoT, ft. Analog Devices (IOT401) - AWS re:Inven...Implementing Multi-Region AWS IoT, ft. Analog Devices (IOT401) - AWS re:Inven...
Implementing Multi-Region AWS IoT, ft. Analog Devices (IOT401) - AWS re:Inven...Amazon Web Services
 
Save up to 90% on Big Data and Machine Learning Workloads with Spot Instances...
Save up to 90% on Big Data and Machine Learning Workloads with Spot Instances...Save up to 90% on Big Data and Machine Learning Workloads with Spot Instances...
Save up to 90% on Big Data and Machine Learning Workloads with Spot Instances...Amazon Web Services
 
Lock It Down: Configure End-to-End Security & Access Control on Amazon EMR (A...
Lock It Down: Configure End-to-End Security & Access Control on Amazon EMR (A...Lock It Down: Configure End-to-End Security & Access Control on Amazon EMR (A...
Lock It Down: Configure End-to-End Security & Access Control on Amazon EMR (A...Amazon Web Services
 
Build Intelligent Apps with Amazon ML
Build Intelligent Apps with Amazon ML Build Intelligent Apps with Amazon ML
Build Intelligent Apps with Amazon ML Amazon Web Services
 
Edge Computing with AWS Greengrass
Edge Computing with AWS Greengrass Edge Computing with AWS Greengrass
Edge Computing with AWS Greengrass Amazon Web Services
 
Big Data and Alexa_Voice-Enabled Analytics
Big Data and Alexa_Voice-Enabled Analytics Big Data and Alexa_Voice-Enabled Analytics
Big Data and Alexa_Voice-Enabled Analytics Amazon Web Services
 
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...Amazon Web Services
 
BDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWSBDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWSAmazon Web Services
 
Power up Your AWS Data Lake and Warehouse with Trusted Data (Sponsored by Tal...
Power up Your AWS Data Lake and Warehouse with Trusted Data (Sponsored by Tal...Power up Your AWS Data Lake and Warehouse with Trusted Data (Sponsored by Tal...
Power up Your AWS Data Lake and Warehouse with Trusted Data (Sponsored by Tal...Amazon Web Services
 
AI/ML with Data Lakes: Counterintuitive Consumer Insights in Retail (RET206) ...
AI/ML with Data Lakes: Counterintuitive Consumer Insights in Retail (RET206) ...AI/ML with Data Lakes: Counterintuitive Consumer Insights in Retail (RET206) ...
AI/ML with Data Lakes: Counterintuitive Consumer Insights in Retail (RET206) ...Amazon Web Services
 
Amazon Athena: What's New and How SendGrid Innovates (ANT324) - AWS re:Invent...
Amazon Athena: What's New and How SendGrid Innovates (ANT324) - AWS re:Invent...Amazon Athena: What's New and How SendGrid Innovates (ANT324) - AWS re:Invent...
Amazon Athena: What's New and How SendGrid Innovates (ANT324) - AWS re:Invent...Amazon Web Services
 
Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...
Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...
Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...Amazon Web Services
 
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...Amazon Web Services
 

Mais procurados (20)

What Can Your Logs Tell You? (ANT215) - AWS re:Invent 2018
What Can Your Logs Tell You? (ANT215) - AWS re:Invent 2018What Can Your Logs Tell You? (ANT215) - AWS re:Invent 2018
What Can Your Logs Tell You? (ANT215) - AWS re:Invent 2018
 
Securing Your Big Data Workload - AWS Summit Sydney 2018
Securing Your Big Data Workload - AWS Summit Sydney 2018Securing Your Big Data Workload - AWS Summit Sydney 2018
Securing Your Big Data Workload - AWS Summit Sydney 2018
 
Building a Modern Data Platform in the Cloud
Building a Modern Data Platform in the CloudBuilding a Modern Data Platform in the Cloud
Building a Modern Data Platform in the Cloud
 
Hybrid Cloud Processing & Data Distribution with File Gateway & Amazon S3 (ST...
Hybrid Cloud Processing & Data Distribution with File Gateway & Amazon S3 (ST...Hybrid Cloud Processing & Data Distribution with File Gateway & Amazon S3 (ST...
Hybrid Cloud Processing & Data Distribution with File Gateway & Amazon S3 (ST...
 
Customer Uses of Data Lakes
Customer Uses of Data LakesCustomer Uses of Data Lakes
Customer Uses of Data Lakes
 
Build Data Engineering Platforms with Amazon EMR (ANT204) - AWS re:Invent 2018
Build Data Engineering Platforms with Amazon EMR (ANT204) - AWS re:Invent 2018Build Data Engineering Platforms with Amazon EMR (ANT204) - AWS re:Invent 2018
Build Data Engineering Platforms with Amazon EMR (ANT204) - AWS re:Invent 2018
 
Implementing Multi-Region AWS IoT, ft. Analog Devices (IOT401) - AWS re:Inven...
Implementing Multi-Region AWS IoT, ft. Analog Devices (IOT401) - AWS re:Inven...Implementing Multi-Region AWS IoT, ft. Analog Devices (IOT401) - AWS re:Inven...
Implementing Multi-Region AWS IoT, ft. Analog Devices (IOT401) - AWS re:Inven...
 
Preparing Data for the Lake
Preparing Data for the LakePreparing Data for the Lake
Preparing Data for the Lake
 
Save up to 90% on Big Data and Machine Learning Workloads with Spot Instances...
Save up to 90% on Big Data and Machine Learning Workloads with Spot Instances...Save up to 90% on Big Data and Machine Learning Workloads with Spot Instances...
Save up to 90% on Big Data and Machine Learning Workloads with Spot Instances...
 
Lock It Down: Configure End-to-End Security & Access Control on Amazon EMR (A...
Lock It Down: Configure End-to-End Security & Access Control on Amazon EMR (A...Lock It Down: Configure End-to-End Security & Access Control on Amazon EMR (A...
Lock It Down: Configure End-to-End Security & Access Control on Amazon EMR (A...
 
Build Intelligent Apps with Amazon ML
Build Intelligent Apps with Amazon ML Build Intelligent Apps with Amazon ML
Build Intelligent Apps with Amazon ML
 
Edge Computing with AWS Greengrass
Edge Computing with AWS Greengrass Edge Computing with AWS Greengrass
Edge Computing with AWS Greengrass
 
Big Data and Alexa_Voice-Enabled Analytics
Big Data and Alexa_Voice-Enabled Analytics Big Data and Alexa_Voice-Enabled Analytics
Big Data and Alexa_Voice-Enabled Analytics
 
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
Data Lake Implementation: Processing and Querying Data in Place (STG204-R1) -...
 
BDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWSBDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWS
 
Power up Your AWS Data Lake and Warehouse with Trusted Data (Sponsored by Tal...
Power up Your AWS Data Lake and Warehouse with Trusted Data (Sponsored by Tal...Power up Your AWS Data Lake and Warehouse with Trusted Data (Sponsored by Tal...
Power up Your AWS Data Lake and Warehouse with Trusted Data (Sponsored by Tal...
 
AI/ML with Data Lakes: Counterintuitive Consumer Insights in Retail (RET206) ...
AI/ML with Data Lakes: Counterintuitive Consumer Insights in Retail (RET206) ...AI/ML with Data Lakes: Counterintuitive Consumer Insights in Retail (RET206) ...
AI/ML with Data Lakes: Counterintuitive Consumer Insights in Retail (RET206) ...
 
Amazon Athena: What's New and How SendGrid Innovates (ANT324) - AWS re:Invent...
Amazon Athena: What's New and How SendGrid Innovates (ANT324) - AWS re:Invent...Amazon Athena: What's New and How SendGrid Innovates (ANT324) - AWS re:Invent...
Amazon Athena: What's New and How SendGrid Innovates (ANT324) - AWS re:Invent...
 
Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...
Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...
Extending Analytics Beyond the Data Warehouse, ft. Warner Bros. Analytics (AN...
 
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...
 

Semelhante a Log Analytics with AWS

Using Search with a Database - Peter Dachnowicz
Using Search with a Database - Peter DachnowiczUsing Search with a Database - Peter Dachnowicz
Using Search with a Database - Peter DachnowiczAmazon Web Services
 
BDA308 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA308 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA308 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA308 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceAmazon Web Services
 
Adding Search to DynamoDB: Database Week San Francisco
Adding Search to DynamoDB: Database Week San FranciscoAdding Search to DynamoDB: Database Week San Francisco
Adding Search to DynamoDB: Database Week San FranciscoAmazon Web Services
 
Using Search with a Database: Database Week SF
Using Search with a Database: Database Week SFUsing Search with a Database: Database Week SF
Using Search with a Database: Database Week SFAmazon Web Services
 
Architecting for Real-Time Insights with Amazon Kinesis (ANT310) - AWS re:Inv...
Architecting for Real-Time Insights with Amazon Kinesis (ANT310) - AWS re:Inv...Architecting for Real-Time Insights with Amazon Kinesis (ANT310) - AWS re:Inv...
Architecting for Real-Time Insights with Amazon Kinesis (ANT310) - AWS re:Inv...Amazon Web Services
 
Big Data@Scale_AWSPSSummit_Singapore
Big Data@Scale_AWSPSSummit_SingaporeBig Data@Scale_AWSPSSummit_Singapore
Big Data@Scale_AWSPSSummit_SingaporeAmazon Web Services
 
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...Amazon Web Services
 
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...Amazon Web Services
 
Have Your Front End and Monitor It, Too (ANT303) - AWS re:Invent 2018
Have Your Front End and Monitor It, Too (ANT303) - AWS re:Invent 2018Have Your Front End and Monitor It, Too (ANT303) - AWS re:Invent 2018
Have Your Front End and Monitor It, Too (ANT303) - AWS re:Invent 2018Amazon Web Services
 
Integrating Amazon Elasticsearch with your DevOps Tooling - AWS Online Tech T...
Integrating Amazon Elasticsearch with your DevOps Tooling - AWS Online Tech T...Integrating Amazon Elasticsearch with your DevOps Tooling - AWS Online Tech T...
Integrating Amazon Elasticsearch with your DevOps Tooling - AWS Online Tech T...Amazon Web Services
 
Implementazione di una soluzione Data Lake.pdf
Implementazione di una soluzione Data Lake.pdfImplementazione di una soluzione Data Lake.pdf
Implementazione di una soluzione Data Lake.pdfAmazon Web Services
 
Scaling from zero to millions of users
Scaling from zero to millions of usersScaling from zero to millions of users
Scaling from zero to millions of usersAmazon Web Services
 
Scaling Up To and Beyond 10M Users
Scaling Up To and Beyond 10M UsersScaling Up To and Beyond 10M Users
Scaling Up To and Beyond 10M UsersAmazon Web Services
 
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS Summit
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS SummitBuild your own log analytics solution on AWS - ADB301 - Atlanta AWS Summit
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS SummitAmazon Web Services
 
WildRydes Serverless Data Processing Workshop
WildRydes Serverless Data Processing WorkshopWildRydes Serverless Data Processing Workshop
WildRydes Serverless Data Processing WorkshopAmazon Web Services
 
Scaling Up to Your First 10 Million Users (ARC205-R1) - AWS re:Invent 2018
Scaling Up to Your First 10 Million Users (ARC205-R1) - AWS re:Invent 2018Scaling Up to Your First 10 Million Users (ARC205-R1) - AWS re:Invent 2018
Scaling Up to Your First 10 Million Users (ARC205-R1) - AWS re:Invent 2018Amazon Web Services
 
Cutting to the chase for Machine Learning Analytics Ecosystem & AWS Lake Form...
Cutting to the chase for Machine Learning Analytics Ecosystem & AWS Lake Form...Cutting to the chase for Machine Learning Analytics Ecosystem & AWS Lake Form...
Cutting to the chase for Machine Learning Analytics Ecosystem & AWS Lake Form...AWS Riyadh User Group
 
AWS Data Lake: data analysis @ scale
AWS Data Lake: data analysis @ scaleAWS Data Lake: data analysis @ scale
AWS Data Lake: data analysis @ scaleAmazon Web Services
 

Semelhante a Log Analytics with AWS (20)

Log Analytics with AWS
Log Analytics with AWSLog Analytics with AWS
Log Analytics with AWS
 
Using Search with a Database - Peter Dachnowicz
Using Search with a Database - Peter DachnowiczUsing Search with a Database - Peter Dachnowicz
Using Search with a Database - Peter Dachnowicz
 
BDA308 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA308 Deep Dive: Log Analytics with Amazon Elasticsearch ServiceBDA308 Deep Dive: Log Analytics with Amazon Elasticsearch Service
BDA308 Deep Dive: Log Analytics with Amazon Elasticsearch Service
 
Adding Search to DynamoDB: Database Week San Francisco
Adding Search to DynamoDB: Database Week San FranciscoAdding Search to DynamoDB: Database Week San Francisco
Adding Search to DynamoDB: Database Week San Francisco
 
Using Search with a Database: Database Week SF
Using Search with a Database: Database Week SFUsing Search with a Database: Database Week SF
Using Search with a Database: Database Week SF
 
Using Search with a Database
Using Search with a DatabaseUsing Search with a Database
Using Search with a Database
 
Architecting for Real-Time Insights with Amazon Kinesis (ANT310) - AWS re:Inv...
Architecting for Real-Time Insights with Amazon Kinesis (ANT310) - AWS re:Inv...Architecting for Real-Time Insights with Amazon Kinesis (ANT310) - AWS re:Inv...
Architecting for Real-Time Insights with Amazon Kinesis (ANT310) - AWS re:Inv...
 
Big Data@Scale_AWSPSSummit_Singapore
Big Data@Scale_AWSPSSummit_SingaporeBig Data@Scale_AWSPSSummit_Singapore
Big Data@Scale_AWSPSSummit_Singapore
 
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS...
 
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
Serverless Stream Processing Pipeline Best Practices (SRV316-R1) - AWS re:Inv...
 
Have Your Front End and Monitor It, Too (ANT303) - AWS re:Invent 2018
Have Your Front End and Monitor It, Too (ANT303) - AWS re:Invent 2018Have Your Front End and Monitor It, Too (ANT303) - AWS re:Invent 2018
Have Your Front End and Monitor It, Too (ANT303) - AWS re:Invent 2018
 
Integrating Amazon Elasticsearch with your DevOps Tooling - AWS Online Tech T...
Integrating Amazon Elasticsearch with your DevOps Tooling - AWS Online Tech T...Integrating Amazon Elasticsearch with your DevOps Tooling - AWS Online Tech T...
Integrating Amazon Elasticsearch with your DevOps Tooling - AWS Online Tech T...
 
Implementazione di una soluzione Data Lake.pdf
Implementazione di una soluzione Data Lake.pdfImplementazione di una soluzione Data Lake.pdf
Implementazione di una soluzione Data Lake.pdf
 
Scaling from zero to millions of users
Scaling from zero to millions of usersScaling from zero to millions of users
Scaling from zero to millions of users
 
Scaling Up To and Beyond 10M Users
Scaling Up To and Beyond 10M UsersScaling Up To and Beyond 10M Users
Scaling Up To and Beyond 10M Users
 
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS Summit
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS SummitBuild your own log analytics solution on AWS - ADB301 - Atlanta AWS Summit
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS Summit
 
WildRydes Serverless Data Processing Workshop
WildRydes Serverless Data Processing WorkshopWildRydes Serverless Data Processing Workshop
WildRydes Serverless Data Processing Workshop
 
Scaling Up to Your First 10 Million Users (ARC205-R1) - AWS re:Invent 2018
Scaling Up to Your First 10 Million Users (ARC205-R1) - AWS re:Invent 2018Scaling Up to Your First 10 Million Users (ARC205-R1) - AWS re:Invent 2018
Scaling Up to Your First 10 Million Users (ARC205-R1) - AWS re:Invent 2018
 
Cutting to the chase for Machine Learning Analytics Ecosystem & AWS Lake Form...
Cutting to the chase for Machine Learning Analytics Ecosystem & AWS Lake Form...Cutting to the chase for Machine Learning Analytics Ecosystem & AWS Lake Form...
Cutting to the chase for Machine Learning Analytics Ecosystem & AWS Lake Form...
 
AWS Data Lake: data analysis @ scale
AWS Data Lake: data analysis @ scaleAWS Data Lake: data analysis @ scale
AWS Data Lake: data analysis @ scale
 

Mais de Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Mais de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Log Analytics with AWS

  • 1. Log Analytics with AWS Lex Crosett crosettl@amazon.com Solutions Architect © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved David Simcik simcik@amazon.com Solutions Architect
  • 2. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Your pager goes off at 3:00am. The servers are melting!
  • 3. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. You VPN in and quick check the logs Uh oh
  • 4. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. If you had Amazon Elasticsearch Service
  • 5. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Transition from IT to DevOps Increase in IoT and Mobile Devices Cloud-based architectures Machine-generated data is growing 10x faster than business data Source: insideBigData - The Exponential Growth of Data, February 16, 2018 THE EXPLOSION OF MACHINE-GENERATED DATA
  • 6. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Elasticsearch Service is a fully managed service that makes it easy to deploy, manage, and scale Elasticsearch and Kibana A M A Z O N E L A S T IC S E AR C H S E R V IC E +
  • 7. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. B E N E F I T S OF AMAZON EL AS T I C S EAR C H S ER V I C E Supports Open-Source APIs and Tools Drop-in replacement with no need to learn new APIs or skills Easy to Use Deploy a production-ready Elasticsearch cluster in minutes Scalable Resize your cluster with a few clicks or a single API call Secure Deploy into your VPC and restrict access using security groups and IAM policies Highly Available Replicate across Availability Zones, with monitoring and automated self-healing Tightly Integrated with Other AWS Services Seamless data ingestion, security, auditing and orchestration
  • 8. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. ELASTICSEARCH LEADING USE CASES Application Monitoring & Root-cause Analysis Security Information and Event Management (SIEM) IoT & Mobile Business & Clickstream Analytics Provides developers with a high performance, self-service operational monitoring and analytics platform Enables security practitioners to centralize and analyze events from across the entire organization Gives developers and lines of business users real-time location-aware insights into their device fleets Provides business users with a real-time view of the performance of their web content and e-commerce platforms
  • 9. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Application Monitoring & Root-cause Analysis C A S E S T U D Y : EXPEDI A Logs, lots and lots of logs. How to cost effectively monitor logs? Require centralized logging infrastructure Did not have the man power to manage infrastructure P R O B L E M Quick insights: Able to identify and troubleshoot issues in real-time Secure: Integrated w/ AWS IAM Scalable: Cluster sizes are able to grow to accommodate additional log sources B E N E F I T S Streaming AWS CloudTrail logs, application logs, and Docker startup logs to Elasticsearch Created centralized logging service for all team members Using Kibana for visualizations and for Elasticsearch queries S O L U T I O N © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  • 10. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Business and Clickstream Analytics C A S E S T U D Y : FI NANC I AL T I M ES What stories do our readers care about? What’s hot? Required a custom clickstream analytics solution Need a solution that delivers analytics in real-time Did not have a team to manage analytics infrastructure P R O B L E M B E N E F I T S Streaming user data to Amazon ES for analysis. Created their own custom dashboards for editors and journalists – Lantern. Lantern - ”shines a light” on reader activity for the editors and journalists at the FT Critical tool for making editorial decisions. Daily editorial meetings start by looking at Lantern dashboard S O L U T I O N © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Reliability : Lantern is used throughout the day by journalists and editors. Relying on Amazon to manage their systems for maximum uptime. Cost savings: Able to easily tune their cluster to meet their needs with minimal management overhead
  • 11. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Software & Internet Financial ServicesEducation Technology BioTech and Pharma Media and Entertainment Social Media Telecommunications Travel & Transportation Real Estate Logistics & Operations Publishing Other AMAZO N EL AS T I C S EAR C H S ER V I C E C US T O MER S
  • 12. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Elasticsearch is a distributed solution You run clusters of instances
  • 13. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Service architecture AWS SDK AWS CLI AWS CloudFormation Elasticsearch data nodes Elasticsearch master nodes Elastic Load Balancing IAM Amazon CloudWatch AWS CloudTrail Amazon Elasticsearch Service domain
  • 14. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. How many instances? • The index size will be about the same as the corpus of source documents • Double this if you are deploying an index replica • Size based on storage requirements • Either local storage or up to 1.5 TB of Amazon Elastic Block Store (EBS) per instance Example: a 2 TB corpus will need 4 instances Assuming a replica and using EBS Given 1.5 TB of storage per instance, this gives 6TB of storage
  • 15. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Which instance type? Instance Max Storage* Workload T2 3.5TB You want to do dev/QA M3, M4 150TB Your data and queries are “average” R3, R4 150TB You have higher request volumes, larger documents, or are using aggregations heavily C4 150TB You need to support high concurrency I2, I3 1.5 PB You have XL storage needs
  • 16. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. How you shard is how you scale
  • 17. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Elasticsearch documents: structured JSON { "verb": "GET", "ident": "-", "bytes": 6245, "@timestamp": "1995-07-01T00:00:01", "request": "GET /history/apollo/ HTTP/1.0", "host": "199.72.81.55", "authuser": "-", "@timestamp_utc": "1995-07-01T04:00:01+00:00", "timezone": "-0400", "response": 200 } • Documents contain fields – name/value pairs • Fields can nest • Value types include text, numerics, dates, and geo objects • Field values can be single or array • When you send documents to Elasticsearch they should arrive as JSON 199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245
  • 18. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Documents are the core entity ID Field: value Field: value Field: value Field: value
  • 19. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon ES stores documents in an index Amazon Elasticsearch Service domain ID Field: value Field: value Field: value Field: value _bulk API Index / Type
  • 20. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Indexes have primary shards Amazon Elasticsearch Service domain ID Field: value Field: value Field: value Field: value _bulk API Index / Type Shards • Shards are the units of storage and compute in an Amazon ES domain • Each shard’s set of documents is distinct • The primary shard count is fixed at creation time • Shards are instances of Apache Lucene • Lucene creates an index for each field in each document
  • 21. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Indexes have replica shards Amazon Elasticsearch Service domain ID Field: value Field: value Field: value Field: value _bulk API Index / Type Shards • Replica shards are distributed copies of their primaries • The replica shard count is dynamic • Replica shards provide data redundancy and parallelism
  • 22. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Shards are distributed across instances Amazon Elasticsearch Service domain ID Field: value Field: value Field: value Field: value _bulk API Index / Type Shards Instances • Primary and replica are distributed to different instances • Distribution is initially by shard count • Storage available limits new shard allocation as well
  • 23. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Shards (Lucene) store indexes on disk Amazon Elasticsearch Service domain ID Field: value Field: value Field: value Field: value _bulk API Index / Type Shards Instances Storage
  • 24. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. How many shards? • Best practice: shards should be < 50GB • Divide index size by ~40GB to get initial shard count • Active shards per instance ~= vCPUs • Always use at least 1 replica for production! Example: 2 TB corpus will need 50 primary shards 2,000 GB / 40GB per shard = 50 shards
  • 25. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Data pattern Amazon ES cluster logs_01.21.2018 logs_01.22.2018 logs_01.23.2018 logs_01.24.2018 logs_01.25.2018 logs_01.26.2018 logs_01.27.2018 Shard 1 Shard 2 Shard 3 host request verb status timestamp etc. Each index has multiple shards Each shard contains a set of documents Each document contains a set of fields and values One index per day
  • 26. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Load data into Amazon ES
  • 27. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. The components of ingestion Data source Collect Transform Buffer Deliver
  • 28. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. AWS-Centric solutions Amazon Kinesis Firehose Amazon CloudWatch Logs Logstash AWS IoT Elasticsearch data nodes Elasticsearch master nodes Kibana Data Producers Buffer/ Transform/ Deliver Amazon ES Cluster Analytics UI
  • 29. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Lambda architectures Files S3 Events Amazon S3 AWS Lambda Function DynamoDB streams Amazon DynamoDB Table AWS Lambda Function Amazon Elasticsearch Service Amazon KinesisData Producers AWS Lambda Function
  • 30. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Kinesis Firehose delivery architecture • For public access domains • Easily transform data • Serverless with built-in batching, index rollover, error handling S3 bucket source records data source source records Amazon Elasticsearch Service Firehose delivery stream transformed records delivery failure Data transformation function transformation failure
  • 31. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Secure your data • Publ i c – IAM • Pri vate – end p o i nt • E ncrypti on
  • 32. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Use IAM for public endpoints { "Version": "2012-10-17", "Statement": [ { "Effect":... "Principal": ... "Action": [...], "Resource": ..., "Condition": ... } ] } • Effect: Allow or Deny • Principal: AWS account ID • Action • HTTP verbs • Service actions • Resource: Amazon ES domain/index • Condition: IP Address
  • 33. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Use security groups for private endpoints • Private networking between your VPC and Amazon Elasticsearch Service • Traffic does not traverse the public internet • Use IAM security groups for authentication and access control VPC subnet security group VPC subnet security group Amazon Elasticsearch Service Data Master Data Master IAM IAM Availability Zone B Availability Zone A
  • 34. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Encrypt your data Elasticsearch data nodes Elasticsearch master nodes AWSKMS • Use your own KMS keys • Encrypted data at rest on Amazon ES instances • Encrypted automatic snapshots
  • 35. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Make your domain more stable
  • 36. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Use dedicated masters Amazon ES cluster Dedicated master nodes: cluster state Data nodes: queries and updates
  • 37. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Master node recommendations Number of data nodes Master node instance type < 10 m3.medium+ < 20 m4.large+ <= 50 c4.xlarge+ 50-100 c4.2xlarge+ • In production, always use an odd number of masters, >= 3 • Use fewer, smaller than data nodes
  • 38. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Use zone awareness Amazon ES cluster Availability Zone 1 Availability Zone 2
  • 39. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. CloudWatch Alarms Name Metric Threshold Periods ClusterStatus.red Maximum >= 1 1 ClusterIndexWritesBlocked Maximum >= 1 1 CPUUtilization/MasterCPUUtilization Average >= 80% 3 JVMMemoryPressure/Master... Maximum >= 80% 3 FreeStorageSpace Minimum <= (25% of avail space) 1 AutomatedSnapshotFailure Maximum >= 1 1
  • 40. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Slow Logs • Easy console set up • Integrated with CloudWatch Logs • Set thresholds to receive log events corresponding to slow queries and slow indexing • index.search.slowlog.threshold.query.warn • index.search.slowlog.threshold.query.info • index.search.slowlog.threshold.query.debug • index.search.slowlog.threshold.query.trace • index.search.slowlog.threshold.fetch.warn • index.search.slowlog.threshold.fetch.info • index.search.slowlog.threshold.fetch.debug • index.search.slowlog.threshold.fetch.trace • index.indexing.slowlog.threshold.index.warn • index.indexing.slowlog.threshold.index.info • index.indexing.slowlog.threshold.index.debug • index.indexing.slowlog.threshold.index.trace • index.indexing.slowlog.level: trace • index.indexing.slowlog.source: 255 Amazon ES Domain CloudWatch Logs Queries and Updates Slow query logs Slow indexing logs
  • 41. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Elasticsearch analysis is delivered by Aggregations
  • 42. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Think buckets and metrics Buckets – a collection of documents meeting some criterion Metrics – calculations on the content of buckets Bucket: time Metric:count
  • 43. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Dig in to your data in real time
  • 44. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Demo
  • 45. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Run Elasticsearch in the AWS Cloud with Amazon Elasticsearch Service Deploy, scale, ingest, secure, monitor, and analyze Start sending your log data today!Amazon Elasticsearch Service