Searching for patterns: Log analytics using Amazon ES - ADB205 - New York AWS Summit

Amazon Elasticsearch Service gives customers many options for log analytics. From small environments with a single application to large environments where multiple teams log five terabytes or more per day with retention periods that span months, Amazon ES provides a toolkit that gives organizations a holistic view of their application logs. In this session, we discuss effective patterns used by organizations across the AWS ecosystem and give you the foundational knowledge and deployment architectures to accelerate your work on a cost-effective logging solution.

  1. Searching for patterns: Log analytics using Amazon ES (ADB205). Kevin Fallis, Senior Specialist Solutions Architect, AWS – Search Services. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  2. What is Elasticsearch?
     • Distributed search and analytics engine built on Apache Lucene
     • Easy ingestion and visualization
     • Sometimes referred to as the “ELK Stack”: Elasticsearch, Logstash, and Kibana
     Source: TechCrunch survey of popular open source software from April ’17
  3. Machine data driving Elasticsearch growth. Machine-generated data is growing 10x faster than business data… Logs, logs, and more logs:
     • IT & DevOps: Databases, servers, storage, networking
     • Increase in IoT and mobile devices: Gaming, sensors, web content
     • Cloud-based architectures
     Source: insideBigData, “The Exponential Growth of Data,” February 16, 2017
  4. Popular use cases
     • Application log monitoring
     • Security event information monitoring
     • Data visualization
     • Full text search
  5. Amazon Elasticsearch Service. Amazon Elasticsearch Service (Amazon ES) is a fully managed service that makes it easy to deploy, manage, and scale Elasticsearch and Kibana in the AWS Cloud.
  6. Benefits of Amazon ES
     • Supports open-source APIs and tools: Drop-in replacement with no need to learn new APIs or skills
     • Easy to use: Deploy a production-ready Elasticsearch cluster in minutes
     • Scalable: Resize your cluster with a few clicks or a single API call
     • Secure: Deploy into your VPC and restrict access using security groups and IAM policies
     • Highly available: Replicate across Availability Zones, with monitoring and automated self-healing
     • Tightly integrated: Seamless data ingestion, security, auditing, and orchestration
  7. Elasticsearch runs on a cluster of instances. [Diagram: an Amazon ES domain in the AWS Cloud, inside a VPC, made up of data nodes and master nodes behind Elastic Load Balancing (ELB). Surrounding services: AWS Management Console, AWS Command Line Interface, AWS Tools and SDKs, AWS CloudFormation, AWS Identity and Access Management (IAM), AWS CloudTrail, Amazon CloudWatch, AWS Database Migration Service, Amazon Kinesis Data Firehose, Amazon CloudWatch Logs.]
  8. Provides Kibana, a real-time visualization tool
  9. Build actionable insights. Amazon ES empowers you with the data to understand and intelligently react to your business needs.
     Use cases: Security information and event management (SIEM); IoT & mobile; application monitoring & root-cause analysis; business and web analytics.
     • End-to-end visibility: Better understanding of customers' behavior to improve user experience and react to demand
     • Improve reliability: Increased operational efficiencies by identifying, solving, and preventing system failures in real time
     • Faster time-to-value: Accelerate time to market with application delivery and performance monitoring
     • Security: Improved business confidence with end-to-end monitoring of data, infrastructure, and transactions
  10. Case study: Autodesk, Central Log Management System (https://www.youtube.com/watch?v=fSjAfp-uqSs)
     Challenge:
     • Highly distributed organization. No consistent way to collect and measure metrics.
     • Small ops team. Must integrate easily with other AWS services.
     • Scale: Accommodate current and future requirements.
     • Must be cost effective with no data lock-in.
     • TBs of log data to sift through to find and fix issues that impact customers.
     Solution:
     • Unified log data management solution built on AWS.
     • Single interface for log analytics across applications.
     • Annotate log records to enable distributed tracing states.
     • Streaming application logs via Kinesis Data Firehose to Amazon S3, Amazon Athena, and Amazon ES.
     • 10 i3.4xlarge Amazon ES data nodes, 33 TB. Will grow to 110 TB.
     • Kibana, built into Amazon ES, for near real-time analytics and dashboards.
     Benefits:
     • All managed services: “Manage less to gain more.” Focus on developing awesome products.
     • Common vocabulary for diagnosing and solving problems. Eliminated silos.
     • Scalable and cost-effective: i3s delivering great value per TB.
     • Improving customer experience by reducing the time to find and fix customer issues.
  12. How it works. Sources: application data; server, application, network, AWS, and other logs. Destination: an Amazon ES domain with an index. Consumers: application users, analysts, DevOps, security.
     1. Send data as JSON via REST APIs
     2. Data is indexed: All fields searchable, including nested JSON
     3. Queries, via REST APIs, allow fielded matching and Boolean expressions, and include sorting and analysis
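
The three steps above can be sketched as plain REST requests. This is a minimal illustration, not the presenter's code: the field names (`message`, `status`, `@timestamp`) and the helper functions are hypothetical, and the functions only build the method, path, and JSON body rather than calling a real domain.

```python
import json

def index_request(index, doc):
    """Return (method, path, body) for indexing one JSON document."""
    return "POST", f"/{index}/_doc", json.dumps(doc)

def search_request(index, text, status):
    """Return (method, path, body) for a Boolean, fielded, sorted search."""
    body = {
        "query": {
            "bool": {
                "must": [{"match": {"message": text}}],    # full-text match
                "filter": [{"term": {"status": status}}],  # exact fielded match
            }
        },
        "sort": [{"@timestamp": {"order": "desc"}}],       # newest first
    }
    return "GET", f"/{index}/_search", json.dumps(body)

# Hypothetical log event; send the returned body to your domain endpoint.
method, path, body = index_request(
    "logs_2019.07.01",
    {"@timestamp": "2019-07-01T12:00:00Z", "status": 500, "message": "upstream timeout"},
)
print(method, path)
```

Requests to an Amazon ES domain normally also need to be signed or otherwise authorized; that part is left out here.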
  13. You use the query APIs to retrieve data from Elasticsearch. [Diagram: inside the Amazon ES domain, the query engine produces matches; scoring & sorting turns the matches into ranked results.]
  14. The query engine matches requested field values. [Diagram: a query for Field1:value1 and Field2:value2 runs against the logs_11.28.2018 index; each field (F1, F2) has its own index of values (V1, V2, …, Vn); each document is an ID plus Field: value pairs.]
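
A toy model can make the per-field value index concrete. The sketch below is a deliberately simplified stand-in, not how Lucene stores data: it maps each (field, value) pair to the set of document IDs that contain it and answers a query by intersecting those sets. All names and sample documents are hypothetical.

```python
from collections import defaultdict

def build_index(docs):
    """Map each (field, value) pair to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, fields in docs.items():
        for field, value in fields.items():
            index[(field, value)].add(doc_id)
    return index

def match_all(index, **criteria):
    """Return IDs of documents that match every requested field value."""
    sets = [index.get((field, value), set()) for field, value in criteria.items()]
    return set.intersection(*sets) if sets else set()

docs = {
    "a": {"status": 500, "host": "web-1"},
    "b": {"status": 200, "host": "web-1"},
    "c": {"status": 500, "host": "web-2"},
}
index = build_index(docs)
print(match_all(index, status=500, host="web-1"))
```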
  15. You use aggregations to analyze log data. [Diagram: inside the Amazon ES domain, the query engine produces matches, which feed the analysis engine (aggregations).]
     • Histogram
     • Numeric: sum, min., max.
     • Terms: bucketing
     • Nesting
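
As a rough sketch of how these aggregation types combine, the request body below nests a terms bucket and a numeric `max` inside a date histogram. The field names (`status`, `host.keyword`, `latency_ms`) are assumptions for illustration, and the `interval` parameter reflects the Elasticsearch versions current at the time of the talk; later releases use `fixed_interval` or `calendar_interval` instead.

```python
import json

def errors_over_time(interval="1h"):
    """Build a search body: 5xx errors per time bucket, split by host."""
    return {
        "size": 0,  # return only aggregation results, no individual hits
        "query": {"range": {"status": {"gte": 500}}},
        "aggs": {
            "per_interval": {
                # Histogram: one bucket per time interval
                "date_histogram": {"field": "@timestamp", "interval": interval},
                "aggs": {
                    # Terms: bucket by host, nested inside each time bucket
                    "by_host": {"terms": {"field": "host.keyword", "size": 5}},
                    # Numeric: worst latency in each time bucket
                    "max_latency": {"max": {"field": "latency_ms"}},
                },
            }
        },
    }

print(json.dumps(errors_over_time(), indent=2))
```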
  16. Data is stored in an index composed of shards. [Diagram: all docs (each an ID plus Field: value pairs) go into one index, which is split into five shards that each hold 1/5 of the documents.]
  17. Shards are primary or replica. [Diagram: documents (each an ID plus Field: value pairs) are stored in the index's primary shards, and each primary shard has replica shards.]
  18. Elasticsearch distributes shards to data nodes. [Diagram: queries and updates are spread across the shards on the data nodes.]
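
The shard slides can be approximated with a small simulation. This is a sketch under stated assumptions: `zlib.crc32` stands in for Elasticsearch's real routing hash, and the round-robin placement is a simplification of its allocator. It only shows the two ideas from the slides: a document maps to one primary shard, and a replica never sits on the same node as its primary.

```python
import zlib

def shard_for(doc_id, num_primaries):
    """Pick the primary shard for a document (crc32 is a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_primaries

def place_shards(num_primaries, num_replicas, nodes):
    """Round-robin shard copies so copies of one shard land on different nodes."""
    if num_replicas >= len(nodes):
        raise ValueError("need more nodes than replicas to separate copies")
    placement = {node: [] for node in nodes}
    for shard in range(num_primaries):
        for copy in range(num_replicas + 1):
            node = nodes[(shard + copy) % len(nodes)]
            kind = "primary" if copy == 0 else "replica"
            placement[node].append((shard, kind))
    return placement

placement = place_shards(5, 1, ["node-1", "node-2", "node-3"])
for node, shards in placement.items():
    print(node, shards)
```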
  20. Overview of delivering logs to Amazon ES. Stages: Collect, Buffer, Aggregate, Store.
  21. Log collectors: Popular options
  22. Log collectors: Properties
     • Typically read files on a file system
     • But can receive events with data from things other than file systems
     • Configuration driven
     • Can be “lightweight” or “heavyweight”
     • Lightweight: Consumes as few system resources as possible
          • Written in C, Ruby, or another efficient language
          • Agent based: Runs as a service on the OS
          • Config-driven
     • Heavyweight: Requires a JVM or other execution engine
          • Purpose built or leverage “plugins” via configuration
          • Can perform data transformation
  23. Log buffers: Popular options
  24. Log buffers: Properties
     • Allow you to decouple producers from consumers
     • Control the ingest pipeline
     • Metered consumption of data from consumer fleets
     • Have “data durability”
     • Individual events can have a lifecycle outside of Elasticsearch when dealing with sliding windows
     • Can allow you to replay events
     • Give you options to involve other business functions
          • Machine learning
          • Big data and analytics
          • Data science
     • Promote “Lambda” architectures (batch + near-real time)
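
To illustrate decoupling, metered consumption, and replay, here is a minimal in-memory sketch. `LogBuffer` and `drain` are hypothetical names, and the class is only a stand-in for a durable stream or queue service: events are kept after they are read, and each consumer tracks its own offset.

```python
class LogBuffer:
    """Append-only buffer: producers write, consumers read by offset."""

    def __init__(self):
        self._events = []

    def put(self, event):
        """Store an event and return its offset."""
        self._events.append(event)
        return len(self._events) - 1

    def read(self, offset, limit):
        """Return up to `limit` events starting at `offset` (nothing is removed)."""
        return self._events[offset:offset + limit]

def drain(buffer, offset, batch_size):
    """Metered consumption: take one bounded batch and advance the offset."""
    batch = buffer.read(offset, batch_size)
    return batch, offset + len(batch)

buffer = LogBuffer()
for n in range(5):
    buffer.put({"n": n})

batch, next_offset = drain(buffer, 0, 2)  # first pass
replayed, _ = drain(buffer, 0, 2)         # replay: rewind the offset and read again
print(batch, next_offset, replayed == batch)
```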
  25. Log aggregators: Popular options
  26. Log aggregators: Properties
     • Aggregate events into one payload for Amazon ES
     • Give you control of the ingest activity
     • Allow you to “throttle” the volume of requests to Elasticsearch because:
          • Data nodes have limited space in processing queues
          • You need to balance query activity with ingest activity
     • Use the _bulk API to push JSON-formatted, grouped events to Elasticsearch
     • Can be “lightweight” or “heavyweight,” just like forwarders
     • Can act as interim buffers
     • Use AWS Auto Scaling to throttle Amazon EC2 or container fleets
     • Lambda should leverage the “concurrency” setting to throttle indexing
     • In some cases, can “fan out” to multiple destinations other than Elasticsearch for additional business value
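
Grouping events for the `_bulk` API can be sketched as follows. The helper names and the batch size are hypothetical. The body format is newline-delimited JSON: one action line, then one document line, with a trailing newline. The action line here omits a document type, which matches newer Elasticsearch versions; older versions may need the type in the action or the request path.

```python
import json

def batches(events, max_events):
    """Split events into bounded groups so each request stays small."""
    for start in range(0, len(events), max_events):
        yield events[start:start + max_events]

def bulk_body(index, events):
    """Build a newline-delimited _bulk payload that indexes each event."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(event))                          # document line
    return "\n".join(lines) + "\n"  # _bulk requires a trailing newline

events = [{"n": n} for n in range(5)]
chunks = list(batches(events, 2))
payload = bulk_body("logs_2019.07.01", chunks[0])
print(payload)
```

Sending one chunk at a time, and capping how many senders run concurrently, is one simple way to apply the throttling described above.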
  28. Patterns help you build solutions quickly
     • Asserted
          • Others have done this
     • Extensible
     • Prescriptive
     • Repeatable
     • Verifiable
          • Natively on AWS if using AWS CloudFormation and AWS Config
  29. VPC option presents challenges to architectures
     • Elastic Network Interfaces (ENIs) get presented to consumers of the Amazon ES domain
     • This means all traffic to your domain is private and must be accessed from within the VPC
     • ENIs cannot be presented to external services without a proxy, AWS PrivateLink, or VPC peering
     • DNS resolution of the endpoint is private
     • You cannot present one Amazon ES domain to more than one VPC
     • Kibana access via Amazon Cognito must be brokered with a proxy
          • NGINX
          • Apache
     • Amazon Kinesis Data Firehose will eventually support VPC endpoints
  30. Amazon S3 event notifications approach
  31. Amazon Kinesis approach
  33. Data retention is directly proportional to cost
     • Do you really need to log it?
     • Remove irrelevant fields
          • For example, are you really using that user-agent field in your access logs?
     • Transform string values into integers
          • For example, VPC Flow Logs contain fields called “action” and “status.” You could transform these character fields to enumerations
     • Do your customers need larger retention periods?
     • Most data is actionable in a “hot” time period
     • Consider smaller retention periods unless the business case dictates otherwise
     • Use a “forensic cluster” that is populated by manual snapshots as needed
          • Audits
          • Historical trend analysis
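
A small transformation step can apply both ideas before events are indexed. In this sketch the numeric codes, the dropped field, and the function name are assumptions chosen for illustration; only the string values themselves (ACCEPT/REJECT and OK/NODATA/SKIPDATA) come from VPC Flow Logs.

```python
# Hypothetical enumerations: the integer codes are arbitrary choices.
ACTION_CODES = {"ACCEPT": 0, "REJECT": 1}
STATUS_CODES = {"OK": 0, "NODATA": 1, "SKIPDATA": 2}
UNKNOWN = -1

def slim(record, drop=("user_agent",)):
    """Drop unused fields and replace string enums with small integers."""
    out = {key: value for key, value in record.items() if key not in drop}
    if "action" in out:
        out["action"] = ACTION_CODES.get(out["action"], UNKNOWN)
    if "status" in out:
        out["status"] = STATUS_CODES.get(out["status"], UNKNOWN)
    return out

record = {"srcaddr": "10.0.0.1", "action": "REJECT", "status": "OK", "user_agent": "curl/7.64"}
print(slim(record))
```

Whoever queries the data then needs the same code tables to translate the integers back, so they are worth keeping alongside the index mapping.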
  34. Pattern: Time-based indexes for log analytics
     • You use a root string, e.g., logs_.
     • Depending on volume, rotate at regular intervals, normally daily.
     • Daily indexes simplify index management. Delete the oldest index to create more space on your cluster.
     • Use aliases to query aggregate indices.
     Example indexes: logs_2019.07.01, logs_2019.07.02, logs_2019.07.03, logs_2019.07.04, logs_2019.07.05, logs_2019.07.06, logs_2019.07.07
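
The naming and rotation rule can be sketched in a few lines. The function names and the three-day retention in the example are hypothetical; the `logs_` root and the date format follow the slide.

```python
from datetime import date, timedelta

def index_name(day, root="logs_"):
    """Daily index name, for example logs_2019.07.01."""
    return f"{root}{day:%Y.%m.%d}"

def expired(existing, today, keep_days, root="logs_"):
    """Return the daily indexes that fall outside the retention window."""
    keep = {index_name(today - timedelta(days=n), root) for n in range(keep_days)}
    return sorted(name for name in existing if name.startswith(root) and name not in keep)

existing = [index_name(date(2019, 7, d)) for d in range(1, 8)]
print(expired(existing, today=date(2019, 7, 7), keep_days=3))
```

Each name returned by `expired` is a candidate for a snapshot followed by a delete.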
  35. Going deeper on index management: aliases. Aliases enable you to query multiple indices using a reference name.
     • Begin by creating a new index that fits the pattern, defined using settings
     • Adjust the alias to include the new index name, for example ‘logs_2019.07.08’
     • Remove the oldest index from the alias, for example ‘logs_2019.07.01’
     • Take a manual snapshot of the oldest index
     • Drop the oldest index
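
The alias adjustment in the middle two steps can be expressed as a single body for the `_aliases` API, so the add and the remove are applied together. The alias name `logs_current` and the helper function are hypothetical; the index names are examples in the deck's daily naming scheme.

```python
import json

def rotate_alias_actions(alias, new_index, oldest_index):
    """Build the body for POST /_aliases: add the new index, remove the oldest."""
    return {
        "actions": [
            {"add": {"index": new_index, "alias": alias}},
            {"remove": {"index": oldest_index, "alias": alias}},
        ]
    }

body = rotate_alias_actions("logs_current", "logs_2019.07.08", "logs_2019.07.01")
print(json.dumps(body, indent=2))
```

Snapshotting and dropping the oldest index remain separate calls that run after the alias no longer points at it.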
  36. Forensic cluster pattern
     • Amazon CloudWatch Events trigger Lambda, which invokes Curator to manage indices
     • Create a schedule in Amazon CloudWatch for the event
     • Create a snapshot repository
     • AWS Lambda creates a metadata record in Amazon DynamoDB for the snapshot with a state of “starting”
     • Lambda calls Curator to manage the indexes via API
     • Snapshot is kicked off asynchronously
     • Lambda updates the metadata record to a state of “started”
     • Another scheduled event checks the snapshot using the _snapshot API to query the status. It should be in a “SUCCESS” status, and you can mark the snapshot “complete”
     • Code for error scenarios
     • Create a new cluster
     • Restore snapshots based on metadata records in Amazon DynamoDB
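
The metadata bookkeeping in this pattern amounts to a small state machine. The sketch below uses a plain dictionary as a stand-in for the Amazon DynamoDB table and takes the snapshot status as an argument instead of calling Elasticsearch. The function names and the “failed” state are assumptions; the slide only says to code for error scenarios.

```python
def begin_snapshot(table, name):
    """Record that a snapshot is about to be requested."""
    table[name] = "starting"

def mark_started(table, name):
    """Record that the asynchronous snapshot request was accepted."""
    table[name] = "started"

def reconcile(table, name, es_state):
    """Update the record from the status reported by the snapshot API."""
    if es_state == "SUCCESS":
        table[name] = "complete"
    elif es_state in ("FAILED", "PARTIAL"):
        table[name] = "failed"  # hypothetical error state
    return table[name]          # IN_PROGRESS leaves the record unchanged

def restorable(table):
    """Snapshots that are safe to restore into a new forensic cluster."""
    return sorted(name for name, state in table.items() if state == "complete")

table = {}
begin_snapshot(table, "logs_2019.07.01")
mark_started(table, "logs_2019.07.01")
print(reconcile(table, "logs_2019.07.01", "IN_PROGRESS"))
print(reconcile(table, "logs_2019.07.01", "SUCCESS"))
print(restorable(table))
```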
  37. Wrap up
     • Machine-generated data is growing rapidly, driven by DevOps, cloud infrastructure, and IoT
     • Logs contain valuable insights: what your users are doing, whether you have bad actors, and what's happening on your devices
     • Amazon ES enables ingesting and analyzing logs in real time to provide you with the data and insights you need
  38. Thank you! Kevin Fallis, kffallis@amazon.com
