Public Cloud Computing
Workshop
Amer Ather
Netflix Cloud Performance Architect
What is a Cloud
● Abstraction of underlying IT resources
● On-demand resource provisioning via self service layer
● API driven infrastructure
● Cloud is not just virtualization. Virtualization is among many
technologies that cloud uses to manage physical infrastructure.
● Cloud can span across multiple geographical locations.
Cloud capabilities can be set up for public or private access
us-west-2
us-east-1
eu-west-1
Public Cloud Computing
● Cloud computing enables companies or individuals to consume compute
resources like a utility rather than building their own.
● Compute services are hosted on a public cloud provider's (Amazon, Azure,
Google, ...) infrastructure instead of in the company's own data centers.
Cloud Computing Benefits
● Elasticity of compute resources
● Pay-Per-Use
● Self Service on-demand Provisioning
● Cloud API and Integration
● Managed Services
● Economy of scale
● Tier pricing model
● Resilience via Availability Zones and Regions
● Gives rise to immutable infrastructure
● No more hardware debugging. Terminate the bad
instance and provision a new one.
Types of Cloud Computing
● Infrastructure-as-a-Service(IaaS)
○ Customers launch VMs (Virtual Machines) in public cloud managed
infrastructure.
○ Customer manages infrastructure via a self-service interface (web, API, CLI) over
the Internet. Infrastructure components include: VMs, machine images, DNS,
storage, networking, patching, monitoring, security, etc.
● Platform-as-a-Service(PaaS)
○ Hides complexity of managing infrastructure.
○ Cloud provider handles capacity provisioning: VM launch, load balancing,
auto-scaling, patching, monitoring, etc.
○ Targeted at developers. Developers simply upload the code and the cloud provider
does the rest.
○ Examples: AWS Elastic Beanstalk, Google App Engine
Types of Cloud Computing (cont.)
● Software-as-a-Service(SaaS)
○ Application hosting in the cloud.
○ Customers access services via a web interface over the Internet
○ SaaS providers use a subscription model
○ Examples: Salesforce.com, Dropbox, Gmail, Flickr, ...
● Function-as-a-Service(FaaS)
○ Serverless computing. No infrastructure to maintain or pay for.
○ New compute paradigm. The application is built from bite-sized pieces of business logic.
○ A function is a single-purpose block of code performing a single task.
○ Functions run on public cloud infrastructure.
○ You pay for the amount of time the code is running (rounded to the nearest 100 ms).
○ Functions are ephemeral; they run on demand in response to an event.
○ Examples: AWS Lambda, Google Cloud Functions
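A single-purpose function of the kind described above could look roughly like the sketch below. The `handler(event, context)` signature follows the convention AWS Lambda uses for Python functions. The `name` event field and the `billed_ms` helper are assumptions made for this example. The helper rounds run time up to 100 ms units, which is one reading of the slide's "nearest 100 ms"; actual billing granularity depends on the provider.

```python
import json
import math


def handler(event, context):
    """Single-purpose function: greet the caller named in the event."""
    # "name" is a hypothetical event field chosen for this illustration.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


def billed_ms(duration_ms, increment_ms=100):
    """Round a run time up to the next billing increment (assumed 100 ms)."""
    return math.ceil(duration_ms / increment_ms) * increment_ms
```

The function holds no state between calls; everything it needs arrives in the event, which is what lets the platform start and stop it on demand.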
Public Cloud Managed Services (AWS)
● CloudFormation: Templates to model and provision cloud infrastructure
● RDS: Hosted relational database service (MySQL, Oracle, ...). DynamoDB is the separate hosted NoSQL offering.
● Data Lake: Store structured/unstructured data on S3 that can be used to run ad hoc
queries or be ingested into a warehouse or Hadoop/Spark clusters for analytics
● ElastiCache: In-memory caching engines: Memcached (object cache) and Redis (key-value store)
● Amazon Elasticsearch Service: Server log and full-text search for near-real-time analytics
● Redshift: Data warehousing service
● EMR: On-demand Hadoop cluster for big data processing
● AWS IoT: Connect devices to the cloud and use AWS services
● Elastic Container Service: Container (Docker) orchestration in the public cloud
● AWS Lambda: Run functions in response to events from cloud services
● Elastic File System: Managed NFS service
● Amazon Kinesis: Collect and analyze streaming data for real-time insight (similar in role to Kafka)
● Amazon SageMaker: Build, train, and deploy machine learning models at cloud scale
Cloud Native Application
● Application written with the cloud in mind
● Stateless and self healing
● Support data sharding
● CI/CD and DevOps
● Red/Black Deployment
● Use microservice or serverless architecture, if possible
● Leverage public cloud API to build new features quickly
● Auto scale and Health check
Some open source projects that scale well in the cloud: Hadoop/Spark, machine learning frameworks, NoSQL,
memcached, Redis, Elasticsearch, Kafka
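To make the "data sharding" bullet concrete, here is a minimal sketch of hash-based sharding. The function name `shard_for` and the key format are assumptions for illustration; the slides do not say which scheme any particular datastore uses.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a record key to a shard index with a stable hash."""
    # A stable digest is used so every process agrees on the mapping;
    # Python's built-in hash() is randomized per process for strings.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards
```

A plain modulo scheme like this moves most keys when the shard count changes, which is why production datastores often use consistent hashing instead.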
Microservices
● It is an alternative to traditional monolithic application architecture
● Loosely coupled service oriented architecture (SOA) with bounded contexts
● Massively scalable due to loose coupling, stateless model and sharded data
● Decomposition of single application into a suite of small services each
implementing different sets of business logic.
● Each service runs independently and interacts with other services over an open protocol (API)
● HTTP/REST and gRPC are common APIs used for service interaction.
● Common payloads used for data exchange: JSON, XML, Protocol Buffers
● Forces design of clear interfaces
Microservices (cont.)
● Each service is independently built, deployed, upgraded and scaled.
● Lower learning curve for a new team member due to bounded context
● Services may be written in different languages: java, go, python, nodejs etc.
● Services can be deployed as web container (Tomcat) in a VM or Docker
● Each service is free to select any datastore technology (Redis, memcached,
Elasticsearch, Cassandra, MongoDB, etc.) that suits its use case
● Stateful cached datastore can be built via replicated ephemeral instances
Monolithic vs. Microservices Architecture
Netflix Microservices Architecture (Netflix OSS)
DevOps / CI/CD tooling: Spinnaker
Config Mgmt.: Edda (Archaius)
Discovery: Eureka, Prana
Routing: Zuul, Ribbon
Observability: Hystrix, Atlas
Ephemeral datastores: Dynomite, Memcached, Priam, Cassandra
Orchestration: Auto Scaling Groups (AWS), Titus (Netflix PaaS using Mesos, Docker), Elastic Container Service (AWS)
Build environment: Java (majority), Groovy, Scala, Python, Ruby, PHP, Node.js
Policy conformance: Simian Army, Chaos Monkey, Conformity Monkey, Janitor Monkey
spigo: open source software that simulates Netflix style microservices and interactions
Microservices with Spring Cloud: Online course on building microservices with Netflix OSS
Deep dive into Netflix Microservices
Monolithic vs. Microservices Architecture (cont.)
Traditional Data Center Architecture | Cloud Architecture (microservices)
Monolithic and centralized | Decomposed and decentralized
Design for predictable scalability | Design for elastic scale
Relational database | Polyglot persistence (mix of data storage engines)
Strong consistency | Eventual consistency
Shared dataset | Sharded dataset
Serial and synchronized processing | Parallel and async processing
Design to avoid failures | Design for failure
Infrequent and slower updates | Frequent small updates (more features)
Manual management | Self-management (DevOps, CI/CD pipeline)
Failures may result in an outage | Immutable infrastructure
REST Web API
● A RESTful API exposes data as resources on which clients operate
● All client actions on a resource are represented by HTTP methods that map to CRUD:
○ POST: Create | GET: Read | PUT: Update | DELETE: Delete
● The response is typically returned as JSON, together with an HTTP status code (2xx, 3xx, 4xx, 5xx)
● A URL is a unique identifier that describes the resource in the application.
● A simple client (curl) can be used to invoke REST methods in the application
● Each request/response is stateless. The client maintains state and is responsible for providing it so
the server can fulfill the request
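The method-to-CRUD mapping above can be sketched without any web framework. The `NoteService` class, the `/notes` resource, and the chosen status codes are assumptions for this example, not something the slides prescribe; the class is an in-memory stand-in for a real HTTP server.

```python
class NoteService:
    """In-memory stand-in for a REST service exposing a /notes resource."""

    def __init__(self):
        self._notes = {}
        self._next_id = 1

    def handle(self, method, path, body=None):
        """Process one stateless request and return (status_code, payload)."""
        parts = path.strip("/").split("/")
        if parts[0] != "notes" or len(parts) > 2:
            return 404, {"error": "unknown resource"}

        if len(parts) == 1:  # collection URL: /notes
            if method == "POST":  # Create
                note_id = str(self._next_id)
                self._next_id += 1
                self._notes[note_id] = body
                return 201, {"id": note_id, "note": body}
            return 405, {"error": "method not allowed"}

        note_id = parts[1]  # item URL: /notes/<id>
        if note_id not in self._notes:
            return 404, {"error": "not found"}
        if method == "GET":  # Read
            return 200, {"id": note_id, "note": self._notes[note_id]}
        if method == "PUT":  # Update
            self._notes[note_id] = body
            return 200, {"id": note_id, "note": body}
        if method == "DELETE":  # Delete
            del self._notes[note_id]
            return 204, None
        return 405, {"error": "method not allowed"}
```

Each call carries everything needed to serve it (method, URL, body), so any replica holding the same data could answer it.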
[Diagram: clients (website, mobile, partner integration, third-party apps) send HTTP transactions to the API gateway and edge services, which sit in front of the backend services. The API gives a well-defined interaction between clients and the front-end service.]
Continuous Integration and Deployment (CI/CD)
Server Virtualization - Evolution
Courtesy Brendan Gregg
Cloud Instance
● A Virtual Machine (VM) hosted in a public cloud is
called a cloud instance
● A hypervisor (Xen, KVM) is used for virtualizing
physical machine hardware
● A VM or guest is bound to a subset of the physical
resources
● Multiple OSes (Windows, Linux, Solaris) can run
concurrently on the same physical machine
● Intel hardware-assisted virtualization
technologies (VT-x, VT-d, VPID, EPT, SR-IOV)
have narrowed the performance gap between
VM and bare metal
Cloud Instance Families (AWS)
AWS offers a wide range of cloud
instance families with varying
processing capabilities
https://aws.amazon.com/ec2/instance-types/
Purchasing options:
● Bare-metal (most expensive)
● Dedicated
● On-demand
● Reserved
● Spot (least expensive)
Instance Family | Purpose
T2 | Burstable Performance
C5 | Compute Optimized
R4 | Memory Optimized
I3 | Storage Optimized (SSD)
D2 | Dense Storage Optimized (HDD)
M5 | General Purpose
X1 | Large Hardware Configuration
P3 | GPU General Purpose
G3 | Graphics Intensive
F1 | FPGA (custom hardware)
Cloud Instance Types (AWS)
AWS offers cloud instances with
different hardware configurations
within an instance family
https://aws.amazon.com/ec2/instance-types/
Instance Type | vCPU | Memory (GB) | Storage | Network (Mbps) | Comment
nano, micro, small | 1 | 0.5 - 2.0 | Net only | 300 | Burstable, T2 only
medium | 2 | 4 | Net only | 300 - 700 | T2 and M3 only
large | 2 | 3 - 15 | Direct/Net | 500 - 700 | All families except D2, M4, X1
xlarge | 4 | 16 - 32 | Direct/Net | 700 - 1000 | All families except X1, P2
2xlarge | 8 | 32 - 60 | Direct/Net | 1000 - 2000 | All families except X1, P2
4xlarge | 16 | 30 - 122 | Direct/Net | 10,000 | All families except T2, X1, P2
8xlarge | 32 | 60 - 244 | Direct/Net | 10,000 | All families except M4, T2, X1
10xlarge | 40 | 256 | Net only | 20,000 | M4 only
16xlarge | 64 | 256 - 488 | Direct/Net | 20,000 | R4, M4, I3, X1 only
32xlarge | 128 | 1,952 | Direct/Net | 20,000 | X1 only
Cloud Instance Features (AWS)
Amazon instances support performance features to improve compute, IO and network performance.
Instance Feature | Comment
CPU Types | Various generations and models of Intel processors: Sandy Bridge, Ivy Bridge, Haswell, Broadwell, Skylake
Enhanced Networking | Low-latency, high-throughput networking using an SR-IOV PCIe NIC. A native driver (sriov, ena) runs inside the VM and has direct access (DMA) to the NIC hardware. Without the Enhanced Networking feature, a virtualized Xen driver is used, which is prone to higher latencies.
Ephemeral Direct Attached Storage | Some instance families offer direct attached SSD, HDD and NVMe storage. Direct attached storage is called ephemeral because its life span is limited to the instance's life span: once the instance is terminated, the ephemeral storage is lost too. NVMe storage is available in the I3 family, which provides access to storage through a native driver (nvme) that runs inside the VM and has direct access (DMA) to storage via SR-IOV. Without the SR-IOV extension for storage, SSD and HDD are accessed via a virtualized Xen driver that is prone to higher IO latencies.
EBS Optimized Network Storage | EBS-optimized instances have a dedicated network link for accessing EBS network storage. This allows instances to keep storage traffic separate from application network traffic.
Burstable Performance | AWS offers burstable CPU, IO and network performance to achieve higher-than-baseline performance for a short period of time. The burstable feature is ideal for bursty workloads that need higher (burst) performance only briefly. You pay a fraction of the price of fixed performance to get the burst.
Spot Instances (AWS)
● Spare compute capacity in AWS public cloud is sold via bidding system.
● Spot instances get a steep discount (up to 90%) compared to on-demand prices
● Spot instances can be taken away at 2 minutes notice whenever AWS needs
the capacity back or spot price has increased.
● Short-running jobs and interruptible applications are good candidates to
run on spot instances.
● A spot instance has an option to hibernate when it is about to be terminated
due to spot price changes. When capacity is available again, the application resumes
where it was paused.
○ Upon hibernation, you pay for storage cost only.
Cloud Image (AMI)
[Diagram: components baked into the AMI]
● Ubuntu base AMI
● Java
● GC and thread dump logging
● Tomcat
● Application servlet, base server, platform, interface jars for dependent services
● Atlas monitoring
● Optional Apache front end, memcached, non-Java apps
● Healthcheck, status servlets, JMX interface
Auto Scaling in Public Cloud (Elasticity)
● Microservice architecture allows each service to
be scaled independent of other services
● AWS Auto Scaling Group (ASG) is a group of
same type of instances running the same
service. ASG can:
○ Scale up/down instances to meet varying demands on the
service
○ Replace unhealthy or terminated instance
○ Monitor AWS system and application metrics periodically
via Amazon Cloudwatch service to trigger scaling event.
○ Easy to set up: a single policy manages instance
capacity via the Target Tracking feature:
■ Target Tracking acts like a thermostat that strives to
keep the metric close to a desired value.
Netflix uses a predictive auto scaling
policy that scales up early in
anticipation of load and scales down
slowly to avoid causing resource
shortages. It takes into account
public holidays and big public events
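The thermostat analogy for Target Tracking can be illustrated with a simple proportional rule. This is a simplified sketch, not the algorithm AWS runs; the function name, parameters, and the min/max bounds are assumptions for the example.

```python
import math


def desired_capacity(current_instances, metric_value, target_value,
                     min_size=1, max_size=100):
    """Size the group so the per-instance metric moves toward the target.

    If the metric (for example average CPU %) is above target, the group
    grows in proportion; if it is below target, the group shrinks.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    wanted = math.ceil(current_instances * metric_value / target_value)
    # Clamp to the group's configured bounds.
    return max(min_size, min(max_size, wanted))
```

For example, 10 instances running at 80% CPU against a 50% target would ask for 16 instances, while the same group at 25% CPU would shrink to 5.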
Bootstrapping a Cloud Instance
● AWS offers metadata service that publishes instance metadata and custom
supplied user data and script that can be fetched at instance launch time.
● An application or configuration script can use instance attributes such as
instance ID or type, public hostname, IP address, AMI ID, AZ, etc. to
configure the instance at launch time. Use cases:
○ Launch an instance and have it register itself to DNS service
○ Launch an instance with “Golden Image” and install additional patches/software on it
○ Run automated Test bed to perform different tests depending on instance type.
● AWS metadata service is hosted at: http://169.254.169.254/latest/meta-data
● cloud-init executes the user-supplied data script during the first boot cycle of the
instance.
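A bootstrap script might read instance attributes from the metadata endpoint roughly as follows. This is a sketch under assumptions: the helper names are invented here, the request only succeeds from inside an instance, and instances configured to require session tokens (IMDSv2) need an extra token request that is omitted.

```python
import urllib.request

# Endpoint given on the slide; reachable only from inside an instance.
METADATA_BASE = "http://169.254.169.254/latest/meta-data"


def metadata_url(attribute):
    """Build the URL for one attribute, e.g. 'instance-id' or 'ami-id'."""
    return f"{METADATA_BASE}/{attribute.strip('/')}"


def fetch_metadata(attribute, timeout=2.0):
    """Fetch a single metadata attribute as text."""
    with urllib.request.urlopen(metadata_url(attribute), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

A launch script could then call `fetch_metadata("instance-type")` and choose a test suite or configuration profile based on the result.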
Cloud Storage (AWS)
● Elastic Block Store (EBS, network block storage): Network storage optimized for
IO throughput and low latency.
○ IO1: SSD backed network storage with bounded IO latency (most expensive)
○ GP2: SSD backed network storage for lower IO latency
○ ST1: Magnetic Disk backed storage for Higher IO Throughput
○ SC1: Magnetic Disk backed storage for Moderate IO Throughput (least expensive)
● Ephemeral Storage: High performance direct attached storage. Data is lost on
instance termination. Comes in variety of flavors:
○ Magnetic Disk (attached to D2, H1 instance families)
○ SSD (attached to I2, R3 .. instance families)
○ NVMe (attached to i3 instance family)
● EFS (managed NFS service): Shared storage that can be accessed
concurrently by hundreds of cloud instances across multiple AZs
S3 Object Storage (AWS)
● Manages data as objects
● Each object is self identifiable and
discoverable by including metadata and
globally unique identifier
● Most cost-efficient and scalable method of
storing data in the public cloud
● Flat model keeps it scalable and searchable
even when the object count reaches into the trillions
● AWS offers an API to interface with S3 objects
● Instances with ephemeral storage periodically
back up data to S3 buckets.
● Ability to scale to millions of operations/sec
and GBs of throughput
Netflix Cloud Native Storage is built around
ephemeral instances and storage
[Diagram: Cassandra nodes in us-east-1c, us-east-1d and us-east-1e back up to S3]
Cloud Networking (AWS)
Virtual Private Cloud (VPC):
● Instances are launched into an isolated VPC in a virtual network with
Internet Gateway already configured.
● A VPC resembles a data center, with full control over its virtual networks.
● The default VPC spans all AZs with one subnet in each. You are free to
define more subnets.
● The VPC has a /16 CIDR block (65,536 IP addresses). A subnet mask of
/20 allows 4,096 IP addresses per subnet.
● Security is applied at instance (security group) and subnet level (ACL)
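The address arithmetic for /16 and /20 blocks can be checked with Python's standard `ipaddress` module. The 172.31.0.0/16 block below is only an example range chosen for this sketch.

```python
import ipaddress

# Example /16 block; substitute your VPC's actual CIDR range.
vpc = ipaddress.ip_network("172.31.0.0/16")

# Carve the VPC into /20 subnets.
subnets = list(vpc.subnets(new_prefix=20))

vpc_addresses = vpc.num_addresses              # 2 ** (32 - 16)
subnet_count = len(subnets)                    # 2 ** (20 - 16)
addresses_per_subnet = subnets[0].num_addresses  # 2 ** (32 - 20)
```

These are raw address counts. AWS reserves a few addresses in every subnet, so slightly fewer are usable for instances.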
Public and Private IP Address:
● New instances are assigned randomly generated public and private IP
addresses. In a non-default VPC, only a private IP address is assigned
● The private IP address of an instance is mapped to its public IP via NAT.
● Private IP is used inside AWS cloud within the same region.
● Public IP is used for Internet and AWS inter-region traffic
Cloud Availability (Failover)
Higher availability or service failover requires the same IP address to be assigned to a newly launched
instance after the original instance terminates.
Elastic IP (EIP):
● An EIP is a permanent public IP address that can be assigned to a running instance in any AZ.
● Masks the failure of an instance. No DNS propagation delay, because the IP address persists
● An EIP is owned by the account. There is a small charge for unused EIPs per account.
● Can be automated with a script that assigns the EIP to a running instance
Elastic Network Interface (ENI):
● A virtual NIC that can be attached to a running instance in addition to the primary NIC (eth0)
● An ENI is per subnet, so you need to create an ENI in each subnet where you plan to run an instance
● An ENI has associated properties: private IP, EIP, security group, etc.
● When an ENI is attached to a new instance, all ENI properties migrate with it.
● Useful for redirecting traffic or configuring a separate network for administration and backup
Cloud Availability (Elastic Load Balancer)
● A Load Balancer (LB) distributes incoming network traffic across a group of cloud instances
● The LB performs periodic health checks and stops sending traffic to unhealthy instances
● AWS ASG (Auto Scaling Group) and LB work together. The ASG is responsible for replacing a bad or
terminated instance, and the LB's job is to resume traffic to healthy instances.
● The LB supports features such as SSL termination, sticky sessions, idle connection timeout,
connection draining, etc.
● Instances behind an LB need a private IP address only
● Classic LB runs at the TCP layer (layer 4) and thus uses TCP ports to direct traffic
● Application LB (ALB) makes routing decisions at the application layer (layer 7).
● Unlike Classic LB, one ALB can route traffic to multiple services
● ALB supports HTTP/HTTPS and thus has more context and flexibility in routing traffic
● ALB supports content-based routing, which allows traffic to be routed based on URL
● ALB supports dynamic port mapping, which allows load balancing across two containers (Docker) of the
same service running on the same instance. Without ALB, you may need two instances to load
balance the same service.
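Content-based routing, health checks, and two containers of one service on different ports can be modeled with a toy router. The sketch is an illustration only: the `PathRouter` class, the first-match rule order, and the target names are assumptions, and it does not reproduce how ALB evaluates its rules.

```python
class PathRouter:
    """Toy layer-7 router: picks a target group by URL path prefix."""

    def __init__(self, rules):
        # rules: ordered list of (path_prefix, [targets]); first match wins.
        self._rules = [(prefix, list(targets)) for prefix, targets in rules]
        self._counters = {prefix: 0 for prefix, _ in self._rules}
        self._unhealthy = set()

    def mark_unhealthy(self, target):
        """Record a failed health check so the target stops getting traffic."""
        self._unhealthy.add(target)

    def route(self, path):
        """Return the next healthy target for this path, or None."""
        for prefix, targets in self._rules:
            if path.startswith(prefix):
                healthy = [t for t in targets if t not in self._unhealthy]
                if not healthy:
                    return None
                # Round-robin across the healthy targets of this rule.
                index = self._counters[prefix] % len(healthy)
                self._counters[prefix] += 1
                return healthy[index]
        return None


# Hypothetical setup: two containers of one service share a host but use
# different ports, as dynamic port mapping allows.
router = PathRouter([
    ("/api/", ["host-1:32768", "host-1:32769"]),
    ("/", ["web-1:80"]),
])
```

Requests under `/api/` alternate between the two container ports, everything else goes to the web target, and a target marked unhealthy is skipped.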
AWS Route 53 - DNS Service
● Managed DNS service in the AWS cloud
● You can register and park your domain name
(cloudperf.net) with the Route 53 service
● You own any subdomain, such as
techblog.cloudperf.net.
● Instead of relying on dynamically assigned names for cloud
instances, give them custom names
● Route53 supports health check and various
routing policies: Latency Routing, Failover
Routing, Geolocation Routing
Containers (OS Virtualization)
● Lightweight virtualization supported by
operating system (kernel) to create isolated
user-space instances
● No hypervisor is required!
● A container can run instructions native to the CPU
without any special interpretation
● The goal is to create an application execution
environment that mimics a standard Linux
install without requiring a separate kernel
Docker Container
● Open source container service, similar to lxc
● Application centric instead of machine centric view
● More nimble than VM due to smaller footprints
● Portable across data centers and public clouds
● Immutable. Changes made inside a running container are lost on termination.
● Supports open standard libraries: libcontainer, libswarm, ...
● Containers are created from a read-only template called an image
○ Simple templates (Dockerfile or Docker Compose) are used for building
Docker images for single- and multi-container applications.
● All required dependencies (code, runtime, system tools and libraries etc.)
are baked into the container, thus allow the software to run the same way
regardless of environment
Docker Image
Docker image is built using multiple layers:
● Base: boot file system. Unmounted after
container is booted
● rootfs: It can be any Linux distro (Ubuntu,
RedHat..). Mounted as read-only root file system
● union mount: Docker uses union mount to add
more read-only file systems (called images) on
top of root file system.
● Container layer: When a container is launched from an
image, a read-write file system is mounted on top.
This is where the application/process inside the Docker
container runs.
Docker images are stored in a public or private registry from which they
can be downloaded and run on the cluster
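The layering idea (read-only layers below, one writable layer on top) can be mimicked with `collections.ChainMap`. This is only an analogy for how lookups and writes behave in a union mount, not how Docker stores files; the paths and contents are made up.

```python
from collections import ChainMap

# Read-only image layers (lowest first in this listing).
base_layer = {"/bin/sh": "shell", "/etc/os-release": "Ubuntu"}
app_layer = {"/app/server.jar": "v1"}

# Launching a "container" adds an empty writable layer on top.
writable_layer = {}
container_fs = ChainMap(writable_layer, app_layer, base_layer)

# Writes land only in the top layer; lower layers stay untouched.
container_fs["/etc/os-release"] = "patched"

# A second container from the same image starts from the clean layers.
fresh_container_fs = ChainMap({}, app_layer, base_layer)
```

Reads fall through to the first layer that has the path, so the first container sees its patched file while the image layers, and any new container, still see the original.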
Container (Docker) Orchestration
● Orchestration framework is required to manage fleet of cloud instances
where docker containers can be deployed
● A container orchestration framework abstracts the infrastructure and presents
the entire fleet of instances, or cluster, as a single deployment target.
● Container orchestration typically involves container scheduling,
deployment, replication, scaling, monitoring, management, and failover
● Public Cloud container services:
○ Amazon ECS
○ Azure Container Service
○ Google Container Engine
● Popular Container Orchestration framework:
○ Kubernetes
○ Mesos
○ Docker Swarm
○ CoreOS Fleet
Cloud Monitoring and Resource Tagging (AWS)
CloudWatch
● AWS service for monitoring AWS resource metrics to gain visibility on resource
utilization and performance
● Application can also store custom metrics and logs into CloudWatch
● Metrics can be polled to set an alarm or alert when a threshold is met.
Tagging
● AWS allows cloud resources to be tagged using a key and a value (optional).
● Allows companies to perform internal tracking of resource usage across departments
(sales, marketing) and billing.
● Tags can also be used to identify cloud resources used in prod and test environment
● Tagged cloud resources can be searched and filtered so an action can be applied to them as a group.
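Grouping resources by tag comes down to filtering on key/value pairs. The sketch below works on plain dictionaries; the inventory, IDs, and tag names are hypothetical, and a real setup would obtain the resource list from the cloud provider's API.

```python
def filter_by_tags(resources, **wanted_tags):
    """Return resources whose tags contain every wanted key/value pair."""
    return [
        resource
        for resource in resources
        if all(resource.get("tags", {}).get(key) == value
               for key, value in wanted_tags.items())
    ]


# Hypothetical inventory; IDs and tag names are made up for illustration.
inventory = [
    {"id": "i-0001", "tags": {"env": "prod", "dept": "sales"}},
    {"id": "i-0002", "tags": {"env": "test", "dept": "sales"}},
    {"id": "i-0003", "tags": {"env": "prod", "dept": "marketing"}},
]

prod_sales = filter_by_tags(inventory, env="prod", dept="sales")
```

The same filter can drive per-department cost reports or a bulk action such as stopping every resource tagged `env=test`.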
Multi-Tier Cloud Security (AWS)
● AWS Security Keys: used for cloud resource provisioning
● Key Pair: public/private key to authenticate ssh/login into cloud instance
● IAM Users/groups: are given limited access to cloud resources by attaching a
policy that lists cloud resources that a user/group can provision or access.
● IAM Roles: Allows assigning temporary security credential to application
running on cloud instance or mobile device for access to aws services and
resources. Example: InstanceProfile, AssumeRole API
● Security Group (SG): Implements security (Firewall) at the cloud instance
● Access Control List (ACL): Implement security at the network level
● AWS CloudTrail Service: Event history of account activity to perform security
analysis, resource charge tracking and troubleshooting.
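An IAM policy is a JSON document listing what is allowed on which resources. The sketch below builds one as a Python dictionary; the bucket name `example-bucket` and the choice of read-only S3 actions are assumptions for illustration.

```python
import json

# Read-only access to one hypothetical S3 bucket.
read_only_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-bucket",
                "arn:aws:s3:::example-bucket/*",
            ],
        }
    ],
}

# Serialized form that would be attached to a user, group, or role.
policy_document = json.dumps(read_only_policy, indent=2)
```

Attaching a document like this to a group limits every member to the listed actions on the listed resources.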
Multi-Tier Cloud Security (AWS) - Cont.
● Meets several compliance requirements (financial, healthcare, govt.)
● DDoS Mitigation
● Data encryption in transit (TLS) and at rest for better data privacy
● Highly secure AWS data centers
● Multi-Factor Authentication for privileged accounts.
● Integration with corporate directory using AWS Directory Service to easily
migrate directory aware on-premises workloads.

 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 

Public Cloud Workshop

  • 1. Public Cloud Computing Workshop Amer Ather Netflix Cloud Performance Architect
  • 2. What is a Cloud ● Abstraction of underlying IT resources ● On-demand resource provisioning via self service layer ● API driven infrastructure ● Cloud is not just virtualization. Virtualization is one of many technologies that cloud uses to manage physical infrastructure. ● Cloud can span multiple geographical locations. Cloud capabilities can be set up for public or private access. Regions shown: us-west-2, us-east-1, eu-west-1
  • 3. Public Cloud Computing ● Cloud computing enables companies or individuals to consume compute resources like a utility rather than building their own. ● Compute services are hosted on public cloud provider (Amazon, Azure, Google, etc.) infrastructure instead of in-house data centers.
  • 4. Cloud Computing Benefits ● Elasticity of compute resources ● Pay-Per-Use ● Self Service on-demand Provisioning ● Cloud API and Integration ● Managed Services ● Economy of scale ● Tier pricing model ● Resilience via Availability Zones and Regions ● Gives rise to immutable infrastructure ● No more hardware debugging. Terminate the bad instance and provision a new one.
  • 5. Types of Cloud Computing ● Infrastructure-as-a-Service (IaaS) ○ Customers launch VMs (Virtual Machines) in public cloud managed infrastructure. ○ Customer manages infrastructure via self service interface (web, API, CLI) over the internet. Infrastructure components include: VM, machine images, DNS, storage, networking, patching, monitoring, security etc. ● Platform-as-a-Service (PaaS) ○ Hides complexity of managing infrastructure. ○ Cloud provider handles capacity provisioning: VM launch, load balancing, auto-scaling, patching, monitoring etc. ○ Targeted at developers. Developers simply upload the code and the cloud provider does the rest. ○ Examples: AWS Elastic Beanstalk, Google App Engine
  • 6. Types of Cloud Computing (cont.) ● Software-as-a-Service (SaaS) ○ Application hosting in the cloud. ○ Customers access services via a web interface over the Internet ○ SaaS providers use a subscription model ○ Examples: Salesforce.com, Dropbox, Gmail, Flickr ● Function-as-a-Service (FaaS) ○ Serverless computing. No infrastructure to maintain or pay for. ○ New compute paradigm. Application is built from bite-sized pieces of business logic ○ A function is a single purpose block of code performing a single task. ○ Functions run on public cloud infrastructure. ○ You pay for the amount of time the code is running (nearest 100 ms). ○ Functions are ephemeral; they run on-demand in response to an event ○ Examples: AWS Lambda, Google Cloud Functions
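To make the FaaS pay-per-run-time point concrete, here is a minimal sketch of billing in 100 ms increments. It assumes the run time is rounded up to the next increment, which is one reading of "nearest 100 ms"; the function names and the price are hypothetical, not any provider's API or rates.

```python
import math


def billed_ms(duration_ms: float, increment_ms: int = 100) -> int:
    """Round a function's run time up to the next billing increment (assumed rule)."""
    if duration_ms <= 0:
        return 0
    return math.ceil(duration_ms / increment_ms) * increment_ms


def invocation_cost(duration_ms: float, price_per_100ms: float) -> float:
    """Cost of one invocation; price_per_100ms is a made-up illustrative value."""
    return (billed_ms(duration_ms) // 100) * price_per_100ms
```

Under this assumption a function that runs for 101 ms is billed as 200 ms, so trimming a few milliseconds near an increment boundary can matter at high invocation counts.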
  • 7. Public Cloud Managed Services (AWS) ● CloudFormation: Templates to model and provision cloud infrastructure ● RDS and DynamoDB: Hosted database services (SQL: MySQL and Oracle on RDS; NoSQL: DynamoDB) ● Data Lake: Store structured/unstructured data on S3 that can be used to run ad hoc queries or be ingested into a warehouse or Hadoop/Spark clusters for analytics ● ElastiCache: Object (memcached) and key-value pair (redis) caching engine ● Amazon Elasticsearch Service: Server log and full text search for near real time analytics ● Redshift: Data warehousing service ● EMR: On-demand Hadoop cluster for big data processing ● AWS IoT: Connect devices to the cloud and use AWS services ● Elastic Container Service: Container (Docker) orchestration in public cloud ● AWS Lambda: Run functions in response to events from cloud services ● Elastic File System: Managed NFS service ● Amazon Kinesis: Collect and analyze streaming data for real time insight (comparable to Kafka) ● Amazon SageMaker: Build, train, and deploy machine learning models at cloud scale
  • 8. Cloud Native Application ● Application written with the cloud in mind ● Stateless and self healing ● Supports data sharding ● CI/CD and DevOps ● Red/Black Deployment ● Uses microservice or serverless architecture, if possible ● Leverages public cloud API to build new features quickly ● Auto scaling and health checks Some open source projects that scale well in the cloud: Hadoop/Spark, Machine Learning, NoSQL, memcached, redis, Elasticsearch, Kafka
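The data sharding point can be illustrated with a small sketch that maps a record key to a shard using a stable hash. This is an assumption about one simple scheme (hash modulo shard count); the function name is hypothetical, and production systems often prefer consistent hashing so that changing the shard count moves fewer keys.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash (hash-modulo scheme)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce to a shard index.
    return int.from_bytes(digest[:8], "big") % shard_count
```

Because the hash is deterministic, any stateless service instance computes the same shard for the same key without coordinating with the others.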
  • 9. Microservices ● An alternative to traditional monolithic application architecture ● Loosely coupled service oriented architecture (SOA) with bounded contexts ● Massively scalable due to loose coupling, stateless model and sharded data ● Decomposition of a single application into a suite of small services, each implementing a different set of business logic. ● Each service runs independently and interacts with other services through an open protocol (API) ● HTTP/REST and gRPC are common APIs used for service interaction. ● Common payloads used for data exchange: JSON, XML, Protocol Buffers ● Forces design of clear interfaces
  • 10. Microservices (cont.) ● Each service is independently built, deployed, upgraded and scaled. ● Lower learning curve for a new team member due to bounded context ● Services may be written in different languages: Java, Go, Python, Node.js etc. ● Services can be deployed as a web container (Tomcat) in a VM or in Docker ● A service is free to select any datastore technology (redis, memcached, Elasticsearch, Cassandra, MongoDB etc.) that suits its use case ● A stateful cached datastore can be built via replicated ephemeral instances
  • 12. Netflix Microservices Architecture (Netflix OSS)
    ● DevOps CI/CD Tooling: Spinnaker
    ● Config Mgmt.: Edda (Archaius)
    ● Discovery: Eureka, Prana
    ● Routing: Zuul, Ribbon
    ● Observability: Hystrix, Atlas
    ● Ephemeral datastores: Dynomite, Memcached, Priam, Cassandra
    ● Orchestration: Auto-scaling Groups (AWS), Titus (Netflix PaaS using Mesos, Docker), Elastic Container Service (AWS)
    ● Build Environment: Java (majority), Groovy, Scala, Python, Ruby, PHP, Node.js
    ● Policy Conformance: Simian Army, Chaos Monkey, Conformity Monkey, Janitor Monkey
    Related resources: spigo (open source software that simulates Netflix style microservices and interactions); Microservices with Spring Cloud (online course on building microservices with Netflix OSS); Deep dive into Netflix Microservices
  • 13. Monolithic vs. Microservices Architecture (cont.)
    Traditional Data Center Architecture | Cloud Architecture (microservices)
    ● Monolithic and centralized | Decomposed and decentralized
    ● Design for predictable scalability | Design for elastic scale
    ● Relational database | Polyglot persistence (mix of data storage engines)
    ● Strong consistency | Eventual consistency
    ● Shared dataset | Sharded dataset
    ● Serial and synchronized processing | Parallel and async processing
    ● Design to avoid failures | Design for failure
    ● Infrequent and slower updates | Frequent small updates (more features)
    ● Manual management | Self-management (DevOps, CI/CD pipeline)
    ● Failures may result in an outage | Immutable infrastructure
  • 14. REST Web API ● A RESTful API exposes data as resources on which clients operate ● All client actions on a resource are represented by HTTP CRUD methods: ○ POST: Create | GET: Read | PUT: Update | DELETE: Delete ● The response is returned in JSON, together with an HTTP status code (2xx, 3xx, 4xx, 5xx) ● The URL is a unique identifier that describes the resource in the application. ● A simple client (curl) can be used to invoke REST methods in the application ● Each request/response is stateless. The client maintains state and is responsible for providing it so the server can fulfill the request. Diagram: clients (website, mobile, partner integration, third party apps) send HTTP transactions to the API Gateway and edge services, which call backend services. Well defined interaction with clients and front end service
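The mapping of CRUD actions to HTTP methods and status codes can be sketched without a web server. The class below is a hypothetical in-memory handler, not a real framework API; the `/items` resource path and the JSON field names are assumptions chosen for illustration.

```python
import json


class ResourceStore:
    """In-memory stand-in for a REST backend; each call is one stateless request."""

    def __init__(self):
        self._items = {}
        self._next_id = 1

    def handle(self, method, path, body=None):
        """Return (http_status, json_text) for a request on /items or /items/<id>."""
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "items" or len(parts) > 2:
            return 404, json.dumps({"error": "not found"})
        if len(parts) == 1:
            if method != "POST":
                return 405, json.dumps({"error": "method not allowed"})
            item_id = self._next_id  # POST creates a new resource
            self._next_id += 1
            self._items[item_id] = body
            return 201, json.dumps({"id": item_id, "data": body})
        if not parts[1].isdigit() or int(parts[1]) not in self._items:
            return 404, json.dumps({"error": "not found"})
        item_id = int(parts[1])
        if method == "GET":  # read
            return 200, json.dumps({"id": item_id, "data": self._items[item_id]})
        if method == "PUT":  # update
            self._items[item_id] = body
            return 200, json.dumps({"id": item_id, "data": body})
        if method == "DELETE":  # delete, no response body
            del self._items[item_id]
            return 204, ""
        return 405, json.dumps({"error": "method not allowed"})
```

The URL alone identifies the resource and nothing about the caller is remembered between calls, which is what lets any backend instance serve any request.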
  • 15. Continuous Integration and Deployment (CI/CD)
  • 16. Server Virtualization - Evolution Courtesy Brendan Gregg
  • 17. Cloud Instance ● A Virtual Machine (VM) hosted in a public cloud is called a cloud instance ● A hypervisor (xen, kvm) is used for virtualizing physical machine hardware ● A VM or guest is bound to a subset of physical resources ● Multiple operating systems (Windows, Linux, Solaris) can run concurrently on the same physical machine ● Intel hardware assisted virtualization technologies (VT-x, VT-d, VPID, EMT, SR-IOV) have narrowed the performance gap between VM and bare-metal
  • 18. Cloud Instance Families (AWS) AWS offers wide range of cloud instance families with varying processing capabilities https://aws.amazon.com/ec2/instance-types/ Purchasing options: ● Bare-metal (most expensive) ● Dedicated ● On-demand ● Reserved ● Spot (least expensive)
    Instance Family | Purpose
    ● T2 | Burstable Performance
    ● C5 | Compute Optimized
    ● R4 | Memory Optimized
    ● I3 | Storage Optimized (SSD)
    ● D2 | Dense Storage Optimized (HDD)
    ● M5 | General Purpose
    ● X1 | Large Hardware Configuration
    ● P3 | GPU General Purpose
    ● G3 | Graphics Intensive
    ● F1 | FPGA (custom hardware)
  • 19. Cloud Instance Types (AWS) AWS offers cloud instances with different hardware configuration within an instance family https://aws.amazon.com/ec2/instance-types/
    Instance Type | vCPU | Memory (GB) | Storage | Network (Mbps) | Comment
    ● nano, micro, small | 1 | 0.5 - 2.0 | Net only | 300 | Burstable, T2 only
    ● medium | 2 | 4 | Net only | 300 - 700 | t2 and m3 only
    ● large | 2 | 3 - 15 | Direct/Net | 500 - 700 | All instances except: D2, M4, X1
    ● xlarge | 4 | 16 - 32 | Direct/Net | 700 - 1000 | all families except x1, p2
    ● 2xlarge | 8 | 32 - 60 | Direct/Net | 1000 - 2000 | all families except x1, p2
    ● 4xlarge | 16 | 30 - 122 | Direct/Net | 10,000 | all families except t2, x1, p2
    ● 8xlarge | 32 | 60 - 244 | Direct/Net | 10,000 | all families except m4, t2, x1,
    ● 10xlarge | 40 | 256 | Net only | 20,000 | m4 only
    ● 16xlarge | 64 | 256 - 488 | Direct/Net | 20,000 | r4, m4, i3, x1 only
    ● 32xlarge | 128 | 1,952 | Direct/Net | 20,000 | x1 only
  • 20. Cloud Instance Features (AWS) Amazon instances support performance features to improve compute, IO and network performance
    Instance Feature | Comment
    ● CPU Types | Various generations of Intel processors and models: Ivy Bridge, Sandy Bridge, Broadwell, Haswell, Skylake
    ● Enhanced Networking | Low latency and high throughput networking using an SR-IOV PCIe NIC. A native driver (sriov, ena) runs inside the VM and has direct access (DMA) to the NIC hardware. Without the Enhanced Networking feature, a virtualized xen driver is used, which is prone to higher latencies.
    ● Ephemeral Direct Attached Storage | Some instance families offer direct attached SSD, HDD and NVMe storage. Direct attached storage is called ephemeral because its life span is limited to the instance life span. Once the instance is terminated, ephemeral storage is also lost. NVMe storage is available in the I3 family, which provides access to storage using a native driver (nvme) that runs inside the VM and has direct access (DMA) to storage via SR-IOV. Without the SR-IOV extension for storage, SSD and HDD are accessed via a virtualized xen driver that is prone to higher IO latencies.
    ● EBS Optimized Network Storage | EBS optimized instances have a dedicated network link for accessing EBS network storage. This allows instances to keep storage traffic separate from the application network traffic.
    ● Burstable Performance | AWS offers burstable CPU, IO and network performance to achieve higher than baseline performance for a short period of time. The burstable feature is ideal for bursty workloads that need higher (burst) performance only briefly. You pay a fraction of the price of fixed performance to achieve the burst.
  • 21. Spot Instances (AWS) ● Spare compute capacity in the AWS public cloud is sold via a bidding system. ● Spot instances get a steep discount (up to 90%) compared to on-demand prices ● Spot instances can be taken away at 2 minutes' notice whenever AWS needs the capacity back or the spot price has increased. ● Short running jobs or applications that are interruptible are good candidates to run on spot instances. ● A spot instance has an option to hibernate when the instance is about to terminate due to spot price changes. When capacity is available, the application resumes where it was paused. ○ Upon hibernation, you pay for storage cost only.
  • 22. Cloud Image (AMI) ● Ubuntu Base AMI ● Java ● GC and thread dump logging ● Tomcat ● Application servlet, base server, platform, interface jars for dependent services ● Atlas Monitoring ● Optional Apache front end, memcached, non-java apps ● Healthcheck, status servlets, JMX interface
  • 23. Auto Scaling in Public Cloud (Elasticity) ● Microservice architecture allows each service to be scaled independently of other services ● An AWS Auto Scaling Group (ASG) is a group of instances of the same type running the same service. An ASG can: ○ Scale instances up/down to meet varying demands on the service ○ Replace unhealthy or terminated instances ○ Monitor AWS system and application metrics periodically via the Amazon CloudWatch service to trigger scaling events. ○ Be set up easily with a single policy that manages instance capacity via the Target Tracking feature: ■ Target Tracking acts like a thermostat that strives to keep the metric close to a desired value. Netflix uses a predictive auto scaling policy that scales up early in anticipation of load and scales down slowly to avoid causing resource shortages. It takes into account public holidays and big public events.
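The thermostat analogy for Target Tracking can be sketched as a proportional rule: if the metric is above target, grow the group in proportion; if below, shrink it. This is a simplified illustration under that assumption, not the actual AWS algorithm, and the function and parameter names are hypothetical.

```python
import math


def desired_capacity(current_instances, metric_value, target_value,
                     min_size=1, max_size=100):
    """Proportional target tracking: scale so the per-instance metric nears target."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    # Same total load spread over N instances: N = current * metric / target.
    raw = math.ceil(current_instances * metric_value / target_value)
    # Clamp to the group's configured bounds.
    return max(min_size, min(max_size, raw))
```

For example, 10 instances averaging 75% CPU against a 50% target would be scaled to 15. A real policy would also add cooldowns and, as the Netflix note suggests, scale down more cautiously than it scales up.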
  • 24. Bootstrapping a Cloud Instance ● AWS offers a metadata service that publishes instance metadata plus customer supplied user data and scripts, which can be fetched at instance launch time. ● An application or configuration script can use instance attributes such as instance-id or type, public hostname, IP address, AMI-id, AZ etc. to configure the instance at launch time. Use cases: ○ Launch an instance and have it register itself with a DNS service ○ Launch an instance with a “Golden Image” and install additional patches/software on it ○ Run an automated test bed that performs different tests depending on instance type. ● The AWS metadata service is hosted at: http://169.254.169.254/latest/meta-data ● cloud-init executes the user supplied data script during the first boot cycle of the instance.
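A bootstrap script might read instance attributes from the metadata endpoint roughly as follows. This is a sketch that assumes plain token-less GET requests are permitted (newer instance configurations can require a session token first); the helper names are hypothetical, and the fetch function is injectable so the logic can be exercised off-instance.

```python
from urllib.request import urlopen

METADATA_BASE = "http://169.254.169.254/latest/meta-data"


def http_fetch(url, timeout=2):
    """Fetch a metadata value over HTTP; only works from inside an instance."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def instance_attributes(fetch=http_fetch):
    """Collect a few attributes a launch-time configuration script could use."""
    paths = {
        "instance_id": "instance-id",
        "instance_type": "instance-type",
        "ami_id": "ami-id",
        "availability_zone": "placement/availability-zone",
    }
    return {name: fetch(f"{METADATA_BASE}/{path}") for name, path in paths.items()}
```

A script could then branch on `instance_type` to choose a test suite, or use `instance_id` when registering the host with a DNS service.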
  • 25. Cloud Storage (AWS) ● Elastic Block Store (EBS, network block storage): Network storage optimized for IO throughput and low latency. ○ IO1: SSD backed network storage with bounded IO latency (most expensive) ○ GP2: SSD backed network storage for lower IO latency ○ ST1: Magnetic disk backed storage for higher IO throughput ○ SC1: Magnetic disk backed storage for moderate IO throughput (least expensive) ● Ephemeral Storage: High performance direct attached storage. Data is lost on instance termination. Comes in a variety of flavors: ○ Magnetic disk (attached to D2, H1 instance families) ○ SSD (attached to I2, R3 and other instance families) ○ NVMe (attached to I3 instance family) ● EFS (NFS managed service): Shared storage that can be accessed concurrently by hundreds of cloud instances spanning multiple AZs
  • 26. S3 Object Storage (AWS) ● Manages data as objects ● Each object is self identifiable and discoverable because it includes metadata and a globally unique identifier ● Most cost-efficient and scalable method of storing data in the public cloud ● The flat model keeps it scalable and searchable even when the object count reaches into the trillions ● AWS offers an API to interface with S3 objects ● Instances with ephemeral storage periodically back up data to S3 buckets. ● Ability to scale to millions of operations/sec and GBs of throughput. Netflix Cloud Native Storage is built around ephemeral instances and storage. Diagram: Cassandra nodes in us-east-1c, us-east-1d and us-east-1e back up to S3.
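The flat object model can be pictured as a single key space where "folders" are only key prefixes. The class below is a hypothetical in-memory toy to show that idea; it is not the S3 API, which is normally reached through an AWS SDK or CLI.

```python
class FlatObjectStore:
    """Toy object store: one flat namespace of key -> (data, metadata)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **metadata):
        """Store an object together with its descriptive metadata."""
        self._objects[key] = (data, dict(metadata))

    def get(self, key):
        """Return (data, metadata) for an exact key."""
        return self._objects[key]

    def list_keys(self, prefix=""):
        """Prefix listing emulates directories without any real hierarchy."""
        return sorted(k for k in self._objects if k.startswith(prefix))
```

Usage might look like `store.put("backups/node1/snap1", b"...", cluster="cass")` followed by `store.list_keys("backups/node1/")` to find that node's snapshots.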
  • 27. Cloud Networking (AWS) Virtual Private Cloud (VPC): ● Instances are launched into an isolated VPC in a virtual network with an Internet Gateway already configured. ● A VPC resembles a data center with full control over virtual networks. ● The default VPC spans all AZs with one subnet in each. You are free to define more subnets. ● A VPC has a /16 CIDR block (about 65k IP addresses). A /20 subnet mask allows 4096 IP addresses per subnet. ● Security is applied at the instance level (security group) and the subnet level (ACL) Public and Private IP Address: ● New instances are assigned randomly generated public and private IP addresses. For a non-default VPC, only a private IP address is assigned ● The private IP address of an instance is mapped to its public IP via NAT. ● The private IP is used inside the AWS cloud within the same region. ● The public IP is used for Internet and AWS inter-region traffic
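The /16 and /20 figures can be checked with Python's standard `ipaddress` module. The CIDR value used in the usage note is only an example, and the helper name is hypothetical.

```python
import ipaddress


def subnet_plan(vpc_cidr, new_prefix):
    """Return the VPC network and the equal-sized subnets carved out of it."""
    vpc = ipaddress.ip_network(vpc_cidr)
    return vpc, list(vpc.subnets(new_prefix=new_prefix))
```

Calling `subnet_plan("172.31.0.0/16", 20)` gives a network with 65,536 addresses split into 16 subnets of 4,096 addresses each. Note that AWS reserves a few addresses in every subnet, so the usable count per subnet is slightly lower than the raw figure.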
  • 28. Cloud Availability (Failover) Higher availability or service failover requires the same IP address to be assigned to a newly launched instance after instance termination. Elastic IP (EIP): ● An EIP is a permanent public IP address that can be assigned to a running instance in any AZ. ● Masks instance failure. No DNS propagation delay because the IP address persists ● An EIP is owned by the account. There is a small charge for unused EIPs per account. ● Can be automated with a script that allocates the EIP to a running instance Elastic Network Interface (ENI): ● A virtual NIC that can be attached to a running instance in addition to the primary NIC (eth0) ● An ENI is per subnet, so you need to create an ENI for each subnet in which you plan to run an instance ● An ENI has associated properties: private IP, EIP, security group etc. ● When an ENI is attached to a new instance, all ENI properties migrate with it. ● Useful for redirecting traffic or configuring a separate network for administration and backup
  • 29. Cloud Availability (Elastic Load Balancer) ● A Load Balancer (LB) distributes incoming network traffic across a group of cloud instances ● The LB performs periodic health checks and stops sending traffic to unhealthy instances ● AWS ASG (Auto Scaling Group) and LB work together. The ASG is responsible for replacing a bad or terminated instance and the LB's job is to resume traffic to healthy instances. ● The LB supports features like: SSL Termination, Sticky Sessions, Idle Connection Timeout, Connection Draining etc. ● Instances behind the LB need a private IP address only ● Classic LB runs at the TCP layer (layer 4) and thus uses TCP ports to direct traffic ● Application LB (ALB) makes routing decisions at the application layer (layer 7). ● Unlike Classic LB, one ALB can route traffic to multiple services ● ALB supports HTTP/HTTPS and thus has more context and flexibility in routing traffic ● ALB supports content based routing that allows traffic to be routed based on URL ● ALB supports dynamic port mapping, which allows load balancing across two containers (Docker) of the same service running on the same instance. Without ALB, you may need two instances to load balance the same service.
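The interaction between health checks and traffic distribution can be sketched with a round-robin balancer that skips unhealthy instances. This is an illustrative assumption about one distribution policy; the class and the instance names are hypothetical and do not describe how AWS load balancers are implemented.

```python
class RoundRobinBalancer:
    """Rotate through instances, skipping any that failed a health check."""

    def __init__(self, instances):
        self._instances = list(instances)
        self._healthy = set(self._instances)
        self._next = 0

    def mark(self, instance, healthy):
        """Record the result of a periodic health check."""
        if healthy:
            self._healthy.add(instance)
        else:
            self._healthy.discard(instance)

    def pick(self):
        """Return the next healthy instance in rotation."""
        for _ in range(len(self._instances)):
            candidate = self._instances[self._next]
            self._next = (self._next + 1) % len(self._instances)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy instances available")
```

When an instance is marked unhealthy its share of requests shifts to the remaining ones, and it rejoins the rotation as soon as a later check marks it healthy again.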
  • 30. AWS Route 53 - DNS Service ● Self managed DNS service in the AWS cloud ● You can register and park your domain name (cloudperf.net) with the Route 53 service ● You own any subdomain, such as techblog.cloudperf.net. ● Instead of relying on dynamically assigned names for cloud instances, give them custom names ● Route 53 supports health checks and various routing policies: Latency Routing, Failover Routing, Geolocation Routing
  • 31. Containers (OS Virtualization) ● Lightweight virtualization supported by the operating system (kernel) to create isolated user-space instances ● No hypervisor is required! ● A container can run instructions native to the CPU without any special interpretation ● The goal is to create an application execution environment that mimics a standard Linux install without requiring a separate kernel
  • 32. Docker Container ● Open source container service, similar to lxc ● Application centric instead of machine centric view ● More nimble than a VM due to its smaller footprint ● Portable across data centers and public clouds ● Immutable. Changes to a container image are lost on termination. ● Supports open standard libraries: libcontainer, libswarm, etc. ● Containers are created from a read-only template called an image ○ Simple templates (Dockerfile or Docker Compose) are used for building Docker images for single and multi-container applications. ● All required dependencies (code, runtime, system tools, libraries etc.) are baked into the container, which allows the software to run the same way regardless of environment
  • 33. Docker Image A Docker image is built using multiple layers: ● Base: boot file system. Unmounted after the container is booted ● rootfs: It can be any Linux distro (Ubuntu, RedHat, etc.). Mounted as a read-only root file system ● union mount: Docker uses a union mount to add more read-only file systems (called images) on top of the root file system. ● Container Image: When a container image is launched, it is mounted as a read-write file system. This is where the application/process inside the Docker container runs. Docker images are stored in a public or private registry from which they can be downloaded and run on the cluster
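The layering idea (read-only image layers with a writable layer on top) can be mimicked with `collections.ChainMap`, where lookups fall through the layers and writes land only in the first map. This is an analogy under that assumption, not how Docker storage drivers are implemented; the file paths, contents and function name are made up.

```python
from collections import ChainMap

# Read-only image layers, lowest first (illustrative contents).
base_layer = {"/bin/sh": "shell from base image", "/etc/os-release": "Ubuntu"}
app_layer = {"/app/server.jar": "application build 1"}


def start_container(*image_layers):
    """Stack a fresh writable layer over the image layers (lowest layer first)."""
    writable = {}
    # ChainMap searches left to right, so the topmost layer must come first.
    return ChainMap(writable, *reversed(image_layers))
```

Writing through one container view changes only its own top layer, so a second container started from the same layers still sees the original files, which matches the point that container changes are lost on termination.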
  • 34. Container (Docker) Orchestration ● An orchestration framework is required to manage a fleet of cloud instances where Docker containers can be deployed ● A container orchestration framework abstracts the infrastructure and presents the entire fleet of instances or cluster as a single deployment target. ● Container orchestration typically involves container scheduling, deployment, replication, scaling, monitoring, management, and failover ● Public cloud container services: ○ Amazon ECS ○ Azure Container Service ○ Google Container Engine ● Popular container orchestration frameworks: ○ Kubernetes ○ Mesos ○ Docker Swarm ○ CoreOS Fleet
  • 35. Cloud Monitoring and Resource Tagging (AWS) CloudWatch ● AWS service for monitoring AWS resource metrics to gain visibility into resource utilization and performance ● Applications can also store custom metrics and logs in CloudWatch ● Metrics can be polled to set an alarm or alert when a threshold is met. Tagging ● AWS allows cloud resources to be tagged using a key and an optional value. ● Allows companies to track resource usage and billing internally across departments (sales, marketing). ● Tags can also be used to identify cloud resources used in prod and test environments ● Tagged cloud resources can be searched and filtered so that an action can be performed on them as a group.
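Selecting resources by tag so they can be acted on as a group might look like the sketch below. The resource dictionaries, tag keys and function name are hypothetical; a real script would obtain the resource list from the cloud provider's API.

```python
def filter_by_tags(resources, **wanted):
    """Return resources whose tags contain every requested key/value pair."""
    return [
        res for res in resources
        if all(res.get("tags", {}).get(key) == value for key, value in wanted.items())
    ]


# Illustrative inventory: ids and tag values are made up.
inventory = [
    {"id": "i-001", "tags": {"env": "prod", "dept": "sales"}},
    {"id": "i-002", "tags": {"env": "test", "dept": "sales"}},
    {"id": "i-003", "tags": {"env": "prod", "dept": "marketing"}},
]
```

For instance, `filter_by_tags(inventory, env="prod")` selects the two production instances, which could then be summed for a billing report or stopped together.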
  • 36. Multi-Tier Cloud Security (AWS) ● AWS Security Keys: used for cloud resource provisioning ● Key Pair: public/private key to authenticate ssh/login into a cloud instance ● IAM Users/Groups: are given limited access to cloud resources by attaching a policy that lists the cloud resources a user/group can provision or access. ● IAM Roles: Allow assigning temporary security credentials to an application running on a cloud instance or mobile device for access to AWS services and resources. Example: InstanceProfile, AssumeRole API ● Security Group (SG): Implements security (firewall) at the cloud instance ● Access Control List (ACL): Implements security at the network level ● AWS CloudTrail Service: Event history of account activity for security analysis, resource charge tracking and troubleshooting.
  • 37. Multi-Tier Cloud Security (AWS) - Cont. ● Meets several compliance requirements (financial, healthcare, govt.) ● DDoS mitigation ● Data encryption in transit (TLS) and at rest for better data privacy ● Highly secure AWS data centers ● Multi-Factor Authentication for privileged accounts. ● Integration with corporate directory using AWS Directory Service to easily migrate directory aware on-premises workloads.