Public Cloud Workshop

Public Cloud Computing
Workshop
Amer Ather
Netflix Cloud Performance Architect

What is a Cloud
● Abstraction of underlying IT resources
● On-demand resource provisioning via self service layer
● API driven infrastructure
● Cloud is not just virtualization. Virtualization is among many
technologies that cloud uses to manage physical infrastructure.
● Cloud can span multiple geographical locations.
● Cloud capabilities can be set up for public or private access
Example AWS regions (labels from the slide's map): us-west-2, us-east-1, eu-west-1

Public Cloud Computing
● Cloud computing enables companies or individuals to consume compute
resources like a utility rather than building their own.
● Compute services are hosted on the infrastructure of public cloud providers
(Amazon, Azure, Google, ...) instead of in the customer's own data centers.

Cloud Computing Benefits
● Elasticity of compute resources
● Pay-Per-Use
● Self Service on-demand Provisioning
● Cloud API and Integration
● Managed Services
● Economies of scale
● Tiered pricing model
● Resilience via Availability Zones and Regions
● Gives rise to immutable infrastructure
● No more hardware debugging. Terminate the bad
instance and provision a new one.

Types of Cloud Computing
● Infrastructure-as-a-Service(IaaS)
○ Customers launch VMs (Virtual Machines) in public cloud managed
infrastructure.
○ Customer manages infrastructure via a self-service interface (web, API, CLI) over
the internet. Infrastructure components include: VMs, machine images, DNS,
storage, networking, patching, monitoring, security, etc.
● Platform-as-a-Service(PaaS)
○ Hides complexity of managing infrastructure.
○ Cloud provider handles capacity provisioning: VM launch, load balancing,
auto-scaling, patching, monitoring, etc.
○ Targeted at developers. Developers simply upload the code and the cloud provider
does the rest.
○ Examples: AWS Elastic Beanstalk, Google App Engine

Types of Cloud Computing (cont.)
● Software-as-a-Service(SaaS)
○ Application hosting in the cloud.
○ Customers access services via a web interface over the Internet
○ SaaS providers use a subscription model
○ Examples: Salesforce.com, Dropbox, Gmail, Flickr, ...
● Function-as-a-Service(FaaS)
○ Serverless computing. No infrastructure to maintain or pay for.
○ New compute paradigm. Application is built from bite-sized pieces of business logic
○ A function is a single-purpose block of code performing a single task.
○ Functions run on public cloud infrastructure.
○ You pay for the amount of time the code is running (rounded to the nearest 100 ms).
○ Functions are ephemeral; they run on demand in response to an event
○ Examples: AWS Lambda, Google Cloud Functions
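To make the "single purpose block of code" idea concrete, here is a minimal sketch of a function in the style of an AWS Lambda Python handler. The `(event, context)` signature follows Lambda's Python convention; the event fields `key` and `size_bytes` are hypothetical and chosen only for illustration.

```python
import json


def handler(event, context):
    """Single-purpose function: report the size of one uploaded object in KB.

    `event` carries the triggering data; `context` is runtime information
    that this sketch does not use.
    """
    size_kb = round(int(event["size_bytes"]) / 1024, 2)
    return {
        "statusCode": 200,
        "body": json.dumps({"key": event["key"], "size_kb": size_kb}),
    }


if __name__ == "__main__":
    # Local invocation with a hand-written (hypothetical) event.
    print(handler({"key": "report.csv", "size_bytes": 2048}, None))
```

The function keeps no state between invocations, which is what lets the provider start and stop copies of it on demand.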

Public Cloud Managed Services (AWS)
● CloudFormation: Templates to model and provision cloud infrastructure
● RDS: Hosted relational databases (MySQL, Oracle, ...). DynamoDB is the separate managed NoSQL service.
● Data Lake: Store structured/unstructured data on S3 that can be used to run ad hoc
queries or be ingested into a warehouse or Hadoop/Spark clusters for analytics
● ElastiCache: In-memory caching service with Memcached (object cache) and Redis (key-value) engines
● Amazon Elasticsearch Service: Server log and full-text search for near-real-time analytics
● Redshift: Data warehousing service
● EMR: On-demand Hadoop cluster for big data processing
● AWS IoT: Connect devices to the cloud and use AWS services
● Elastic Container Service: Container (Docker) orchestration in public cloud
● AWS Lambda: Run functions in response to events from cloud services
● Elastic File System: Managed NFS service
● Amazon Kinesis: Collect and analyze streaming data for real-time insight (comparable to Kafka)
● Amazon SageMaker: Build, train, and deploy machine learning models at cloud scale

Cloud Native Application
● Application written with the cloud in mind
● Stateless and self healing
● Support data sharding
● CI/CD and DevOps
● Red/Black Deployment
● Use microservice or serverless architecture, if possible
● Leverage public cloud API to build new features quickly
● Auto scaling and health checks
Some open source technologies that scale well in the cloud: Hadoop/Spark, machine learning, NoSQL,
Memcached, Redis, Elasticsearch, Kafka
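Data sharding, listed above, can be illustrated with a minimal sketch that assigns each key to one of N shards using a stable hash. The function names and the sample keys are hypothetical; this is the simplest hash-modulo scheme, not the partitioner of any particular datastore.

```python
import hashlib
from collections import defaultdict


def shard_for(key, num_shards):
    """Map a key to a shard index with a stable hash (same key, same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(keys, num_shards):
    """Group keys by the shard they hash to."""
    shards = defaultdict(list)
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return dict(shards)


if __name__ == "__main__":
    print(partition(["user-1", "user-2", "user-3", "user-4"], num_shards=3))
```

With plain modulo hashing, changing the number of shards moves most keys to a different shard. Consistent hashing is the usual way to limit that movement.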

Microservices
● It is an alternative to traditional monolithic application architecture
● Loosely coupled service oriented architecture (SOA) with bounded contexts
● Massively scalable due to loose coupling, stateless model and sharded data
● Decomposition of a single application into a suite of small services, each
implementing a different set of business logic.
● Each service runs independently and interacts with other services over an open protocol (API)
● HTTP/REST and gRPC are common API styles used for service interaction.
● Common payloads used for data exchange: JSON, XML, Protocol Buffers
● Forces design of clear interfaces

Microservices (cont.)
● Each service is independently built, deployed, upgraded and scaled.
● Lower learning curve for a new team member due to bounded context
● Services may be written in different languages: Java, Go, Python, Node.js, etc.
● Services can be deployed in a web container (Tomcat) in a VM or in Docker
● Each service is free to select any datastore technology (Redis, Memcached,
Elasticsearch, Cassandra, MongoDB, etc.) that suits its use case
● Stateful cached datastore can be built via replicated ephemeral instances

Monolithic vs. Microservices Architecture

Netflix Microservices Architecture (Netflix OSS)
● DevOps, CI/CD, Tooling: Spinnaker
● Config Mgmt.: Edda (Archaius)
● Discovery: Eureka, Prana
● Routing: Zuul, Ribbon
● Observability: Hystrix, Atlas
● Ephemeral datastores: Dynomite, Memcached, Priam, Cassandra
● Orchestration: Auto Scaling Groups (AWS), Titus (Netflix PaaS using Mesos, Docker), Elastic Container Service (AWS)
● Build Environment: Java (majority), Groovy, Scala, Python, Ruby, PHP, Node.js
● Policy Conformance: Simian Army, Chaos Monkey, Conformity Monkey, Janitor Monkey
spigo: open source software that simulates Netflix style microservices and interactions
Microservices with Spring Cloud: Online course on building microservices with Netflix OSS
Deep dive into Netflix Microservices

Monolithic vs. Microservices Architecture (cont.)
Traditional Data Center Architecture | Cloud Architecture (microservices)
Monolithic and centralized           | Decomposed and decentralized
Design for predictable scalability   | Design for elastic scale
Relational database                  | Polyglot persistence (mix of data storage engines)
Strong consistency                   | Eventual consistency
Shared dataset                       | Sharded dataset
Serial and synchronized processing   | Parallel and async processing
Design to avoid failures             | Design for failure
Infrequent and slower updates        | Frequent small updates (more features)
Manual management                    | Self-management (DevOps, CI/CD pipeline)
Failures may result in an outage     | Immutable infrastructure

REST Web API
● A RESTful API exposes data as resources on which clients operate
● All client actions on a resource are expressed with HTTP methods that map to CRUD operations:
○ POST: Create | GET: Read | PUT: Update | DELETE: Delete
● Responses are typically returned as JSON, together with an HTTP status code (2xx, 3xx, 4xx, 5xx)
● A URL is a unique identifier for a resource in the application.
● A simple client (curl) can be used to invoke REST methods in the application
● Each request/response is stateless. The client maintains state and is responsible for providing it so the server
can fulfill that request
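The mapping from HTTP methods to CRUD operations can be sketched without any web framework. The example below is a minimal in-memory dispatcher, assuming a hypothetical `/items` resource; the class name, URL layout and status-code choices are illustrative, not taken from the slides.

```python
import json


class ItemApi:
    """Tiny in-memory resource store reachable through REST-style calls."""

    def __init__(self):
        self.items = {}
        self.next_id = 1

    def handle(self, method, path, body=None):
        """Handle one stateless request; returns (status_code, response_text)."""
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "items" or len(parts) > 2:
            return 404, json.dumps({"error": "not found"})

        if len(parts) == 1:  # collection URL: /items
            if method != "POST":
                return 405, json.dumps({"error": "method not allowed"})
            item_id = self.next_id
            self.next_id += 1
            self.items[item_id] = json.loads(body)  # Create
            return 201, json.dumps({"id": item_id})

        # resource URL: /items/<id>
        if not parts[1].isdigit() or int(parts[1]) not in self.items:
            return 404, json.dumps({"error": "not found"})
        item_id = int(parts[1])
        if method == "GET":  # Read
            return 200, json.dumps(self.items[item_id])
        if method == "PUT":  # Update
            self.items[item_id] = json.loads(body)
            return 200, json.dumps(self.items[item_id])
        if method == "DELETE":  # Delete
            del self.items[item_id]
            return 204, ""
        return 405, json.dumps({"error": "method not allowed"})


if __name__ == "__main__":
    api = ItemApi()
    print(api.handle("POST", "/items", '{"name": "disk"}'))
    print(api.handle("GET", "/items/1"))
```

Every call carries everything the server needs (method, URL, body), so no per-client session is kept between requests.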
[Figure: clients (website, mobile, partner integration, third-party apps) exchange HTTP transactions with an API gateway, which fronts the edge services and backend services. The API provides a well-defined interaction between clients and front-end services.]

Continuous Integration and Deployment (CI/CD)

Server Virtualization - Evolution
Courtesy Brendan Gregg

Cloud Instance
● A virtual machine (VM) hosted in a public cloud is
called a cloud instance
● A hypervisor (Xen, KVM) is used to virtualize
physical machine hardware
● A VM (guest) is bound to a subset of the physical
resources
● Multiple OSes (Windows, Linux, Solaris) can run
concurrently on the same physical machine
● Intel hardware-assisted virtualization
technologies (VT-x, VT-d, VPID, EPT, SR-IOV)
have narrowed the performance gap between
VM and bare metal

Cloud Instance Families (AWS)
AWS offers a wide range of cloud instance families with varying
processing capabilities
https://aws.amazon.com/ec2/instance-types/
Purchasing options:
● Bare-metal (most expensive)
● Dedicated
● On-demand
● Reserved
● Spot (least expensive)
Instance Family | Purpose
T2              | Burstable Performance
C5              | Compute Optimized
R4              | Memory Optimized
I3              | Storage Optimized (SSD)
D2              | Dense Storage Optimized (HDD)
M5              | General Purpose
X1              | Large Hardware Configuration
P3              | GPU General Purpose
G3              | Graphics Intensive
F1              | FPGA (custom hardware)

Cloud Instance Types (AWS)
AWS offers cloud instances with different hardware configurations
within an instance family
https://aws.amazon.com/ec2/instance-types/
Instance Type    | vCPU | Memory (GB) | Storage    | Network (Mbps) | Comment
nano,micro,small | 1    | 0.5 - 2.0   | Net only   | 300            | Burstable, T2 only
medium           | 2    | 4           | Net only   | 300 - 700      | t2 and m3 only
large            | 2    | 3-15        | Direct/Net | 500-700        | All instances except: D2, M4, X1
xlarge           | 4    | 16-32       | Direct/Net | 700 - 1000     | all families except x1, p2
2xlarge          | 8    | 32-60       | Direct/Net | 1000 - 2000    | all families except x1, p2
4xlarge          | 16   | 30-122      | Direct/Net | 10,000         | all families except t2, x1, p2
8xlarge          | 32   | 60-244      | Direct/Net | 10,000         | all families except m4, t2, x1,
10xlarge         | 40   | 256         | Net only   | 20,000         | m4 only
16xlarge         | 64   | 256 - 488   | Direct/Net | 20,000         | r4, m4, i3, x1 only
32xlarge         | 128  | 1,952       | Direct/Net | 20,000         | x1 only

Cloud Instance Features (AWS)
Amazon instances support performance features to improve compute, IO and network performance:
● CPU Types: Various generations and models of Intel processors: Ivy Bridge, Sandy Bridge,
Broadwell, Haswell, Skylake
● Enhanced Networking: Low-latency, high-throughput networking using an SR-IOV PCIe NIC. A native driver
(sriov, ena) runs inside the VM and has direct access (DMA) to the NIC hardware. Without
Enhanced Networking, a virtualized Xen driver is used, which is prone to higher latencies.
● Ephemeral Direct Attached Storage: Some instance families offer direct attached SSD, HDD and NVMe storage.
Direct attached storage is called ephemeral because its life span is limited to the instance's
life span. Once the instance is terminated, the ephemeral storage is lost as well. NVMe storage is
available in the I3 family, which provides access to storage through a native driver (nvme) that runs
inside the VM and has direct access (DMA) to storage via SR-IOV. Without the
SR-IOV extension for storage, SSD and HDD are accessed via a virtualized Xen driver that
is prone to higher IO latencies.
● EBS Optimized Network Storage: EBS optimized instances have a dedicated network link for accessing EBS network
storage. This allows instances to keep storage traffic separate from the application
network traffic.
● Burstable Performance: AWS offers burstable CPU, IO and network performance to achieve higher than baseline
performance for a short period of time. The burstable feature is ideal for bursty workloads
that need the higher (burst) performance only briefly. You pay a fraction of the price of
fixed performance to achieve the burst.

Spot Instances (AWS)
● Spare compute capacity in the AWS public cloud is sold via a bidding system.
● Spot instances get a steep discount (up to 90%) compared to on-demand prices
● Spot instances can be taken away with 2 minutes' notice whenever AWS needs
the capacity back or the spot price has increased.
● Short-running jobs or interruptible applications are good candidates to
run on spot instances.
● A spot instance has an option to hibernate when it is about to be interrupted
due to spot price changes. When capacity is available again, the application resumes
where it was paused.
○ While hibernated, you pay for storage only.
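An interruptible application can use the 2-minute notice to checkpoint its work. The sketch below assumes the notice arrives as the small JSON document that EC2 publishes under the instance metadata path `spot/instance-action`, with `action` and `time` fields; the function name and the sample values are hypothetical.

```python
import json
from datetime import datetime, timezone


def seconds_until_interruption(notice_text, now):
    """Parse an interruption notice; return (action, seconds remaining)."""
    notice = json.loads(notice_text)
    when = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    when = when.replace(tzinfo=timezone.utc)  # the timestamp is in UTC
    return notice["action"], (when - now).total_seconds()


if __name__ == "__main__":
    sample = '{"action": "terminate", "time": "2017-09-18T08:22:00Z"}'
    now = datetime(2017, 9, 18, 8, 20, 0, tzinfo=timezone.utc)
    action, remaining = seconds_until_interruption(sample, now)
    print(action, remaining)  # terminate 120.0
```

A worker would poll for the notice every few seconds and, once it appears, stop taking new work and save its state within the remaining time.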

Cloud Image (AMI)
Components baked into the image (from the slide's diagram):
● Ubuntu Base AMI
● Java: GC and thread dump logging
● Tomcat: application servlet, base server, platform, interface jars for dependent services
● Healthcheck, status servlets, JMX interface
● Atlas monitoring
● Optional: Apache front end, memcached, non-Java apps

Auto Scaling in Public Cloud (Elasticity)
● Microservice architecture allows each service to
be scaled independently of other services
● An AWS Auto Scaling Group (ASG) is a group of
instances of the same type running the same
service. An ASG can:
○ Scale the number of instances up/down to meet varying demand on the
service
○ Replace unhealthy or terminated instances
○ Monitor AWS system and application metrics periodically
via the Amazon CloudWatch service to trigger scaling events.
○ Be set up easily with a single policy that manages instance
capacity via the Target Tracking feature:
■ Target Tracking acts like a thermostat that strives to
keep the metric close to a desired value.
Netflix uses a predictive auto scaling
policy that scales up early in
anticipation of load and scales down
slowly to avoid causing resource
shortages. It takes into account
public holidays and big public events.
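The thermostat analogy for Target Tracking can be sketched as a proportional rule: if the metric is above the target, grow the group by the same ratio, and shrink it when the metric is below. This is a simplified illustration under that assumption, not the actual algorithm AWS uses; the function and parameter names are hypothetical.

```python
import math


def desired_capacity(current_capacity, metric_value, target_value,
                     min_size, max_size):
    """Return the instance count that would bring the metric near the target.

    Assumes the metric (e.g. average CPU %) scales inversely with the
    number of instances sharing the load.
    """
    raw = math.ceil(current_capacity * metric_value / target_value)
    return max(min_size, min(max_size, raw))  # clamp to the group limits


if __name__ == "__main__":
    # 10 instances at 80% average CPU, target 50%: scale out.
    print(desired_capacity(10, 80.0, 50.0, min_size=2, max_size=40))  # 16
    # 10 instances at 25% average CPU, target 50%: scale in.
    print(desired_capacity(10, 25.0, 50.0, min_size=2, max_size=40))  # 5
```

A predictive policy like the one described for Netflix would feed a forecast metric into the same kind of calculation ahead of time instead of reacting to the current value.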

Bootstrapping a Cloud Instance
● AWS offers a metadata service that publishes instance metadata and custom
user-supplied data and scripts, which can be fetched at instance launch time.
● An application or configuration script can use instance attributes such as
instance ID or type, public hostname, IP address, AMI ID, AZ, etc. to
configure the instance at launch time. Use cases:
○ Launch an instance and have it register itself with a DNS service
○ Launch an instance with a “Golden Image” and install additional patches/software on it
○ Run an automated test bed that performs different tests depending on instance type.
● AWS metadata service is hosted at: http://169.254.169.254/latest/meta-data
● cloud-init executes the user-supplied user-data script during the first boot cycle of the
instance.
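A bootstrap script along the lines of the use cases above could start by reading a few attributes from the metadata service. The sketch below assumes plain HTTP GETs against the endpoint are permitted (newer configurations require a session token first); the helper names and the `example.internal` domain are hypothetical. The fetch function is injectable so the logic can be exercised off-instance.

```python
from urllib.request import urlopen

METADATA_URL = "http://169.254.169.254/latest/meta-data"


def http_fetch(path, timeout=2):
    """Fetch one metadata value; only works from inside a cloud instance."""
    with urlopen(f"{METADATA_URL}/{path}", timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def describe_instance(fetch=http_fetch):
    """Collect the attributes a bootstrap script typically needs."""
    paths = {
        "instance_id": "instance-id",
        "instance_type": "instance-type",
        "availability_zone": "placement/availability-zone",
        "private_ip": "local-ipv4",
        "ami_id": "ami-id",
    }
    return {name: fetch(path).strip() for name, path in paths.items()}


def dns_name(info, domain="example.internal"):
    """Build a DNS name the instance could register for itself."""
    return f"{info['instance_id']}.{info['availability_zone']}.{domain}"


if __name__ == "__main__":
    # Off-instance demo with canned values instead of real metadata.
    fake = {"instance-id": "i-0abc", "instance-type": "m5.large",
            "placement/availability-zone": "us-east-1c",
            "local-ipv4": "10.0.0.5", "ami-id": "ami-123"}
    info = describe_instance(fetch=fake.__getitem__)
    print(dns_name(info))
```

The same dictionary could drive the other use cases, for example choosing a test suite based on `instance_type`.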

Cloud Storage (AWS)
● Elastic Block Store (EBS, network block storage): Network storage optimized for
IO throughput and low latency.
○ IO1: SSD backed network storage with bounded IO latency (most expensive)
○ GP2: SSD backed network storage for lower IO latency
○ ST1: Magnetic disk backed storage for higher IO throughput
○ SC1: Magnetic disk backed storage for moderate IO throughput (least expensive)
● Ephemeral Storage: High performance direct attached storage. Data is lost on
instance termination. Comes in a variety of flavors:
○ Magnetic disk (attached to D2, H1 instance families)
○ SSD (attached to I2, R3, ... instance families)
○ NVMe (attached to I3 instance family)
● EFS (NFS Managed Service): Shared storage that can be accessed
concurrently by hundreds of cloud instances spanning multiple AZs

S3 Object Storage (AWS)
● Manages data as objects
● Each object is self identifiable and
discoverable by including metadata and
globally unique identifier
● Most cost-efficient and scalable method of
storing data in the public cloud
● Flat model makes it scalable and searchable
even when the object count reaches the trillions
● AWS offers an API to interface with S3 objects
● Instances with ephemeral storage periodically
back up data to S3 buckets.
● Ability to scale to millions of operations/sec
and GBs of throughput
Netflix Cloud Native Storage is built around
ephemeral instances and storage
[Figure: Cassandra nodes in us-east-1c, us-east-1d and us-east-1e back up to S3 (Cassandra Backup).]

Cloud Networking (AWS)
Virtual Private Cloud (VPC):
● Instances are launched into an isolated VPC, a virtual network with an
Internet Gateway already configured.
● A VPC resembles a data center, with full control over its virtual networks.
● The default VPC spans all AZs with one subnet in each. You are free to
define more subnets.
● The default VPC has a /16 CIDR block (about 65k IP addresses). A subnet mask of
/20 allows 4096 IPs per subnet.
● Security is applied at the instance level (security group) and the subnet level (ACL)
Public and Private IP Address:
● New instances are assigned randomly generated public and private IP
addresses. In a non-default VPC, only a private IP address is assigned
● The private IP address of an instance is mapped to its public IP via NAT.
● Private IPs are used inside the AWS cloud within the same region.
● Public IPs are used for Internet and AWS inter-region traffic
Cloud Availability (Failover)
Higher availability or service failover requires the same IP address to be assigned to a newly launched
instance after instance termination.
Elastic IP (EIP):
● An EIP is a permanent public IP address that can be assigned to a running instance in any AZ.
● Masks instance failure. No DNS propagation delay, because the IP address persists
● An EIP is owned by the account. There is a small charge for unused EIPs.
● Failover can be automated with a script that associates the EIP with a running instance
Elastic Network Interface (ENI):
● A virtual NIC that can be attached to a running instance in addition to the primary NIC (eth0)
● An ENI is tied to a subnet, so you need to create an ENI in each subnet where you plan to run an instance
● An ENI has associated properties: private IP, EIP, security group, etc.
● When an ENI is attached to a new instance, all of its properties move with it.
● Useful for redirecting traffic or configuring a separate network for administration and backup

Cloud Availability (Elastic Load Balancer)
● A load balancer (LB) distributes incoming network traffic across a group of cloud instances
● The LB performs periodic health checks and stops sending traffic to unhealthy instances
● AWS ASG (Auto Scaling Group) and LB work together. The ASG is responsible for replacing a bad or
terminated instance, and the LB's job is to send traffic only to healthy instances.
● The LB supports features like SSL termination, sticky sessions, idle connection timeout,
connection draining, etc.
● Instances behind the LB need a private IP address only
● Classic LB runs at the TCP layer (layer 4) and thus uses TCP ports to direct traffic
● Application LB (ALB) makes routing decisions at the application layer (layer 7).
● Unlike Classic LB, one ALB can route traffic to multiple services
● ALB supports HTTP/HTTPS and thus has more context and flexibility in routing traffic
● ALB supports content-based routing, which allows traffic to be routed based on URL
● ALB supports dynamic port mapping, which allows load balancing across two containers (Docker) of the
same service running on the same instance. Without ALB, you may need two instances to load
balance the same service.
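The interplay of health checks and traffic distribution can be sketched with a small round-robin balancer. This is an illustrative model only, not how the AWS load balancers are implemented; the class, the instance IDs and the health-check callback are hypothetical.

```python
import itertools


class LoadBalancer:
    """Round-robin over instances that passed the latest health check."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._counter = itertools.count()

    def run_health_checks(self, check):
        """Re-evaluate every instance; `check(instance)` returns True if healthy."""
        self.healthy = {i for i in self.instances if check(i)}

    def route(self):
        """Pick the next healthy instance for an incoming request."""
        candidates = [i for i in self.instances if i in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy instances")
        return candidates[next(self._counter) % len(candidates)]


if __name__ == "__main__":
    lb = LoadBalancer(["i-a", "i-b", "i-c"])
    print([lb.route() for _ in range(3)])
    lb.run_health_checks(lambda inst: inst != "i-b")  # i-b fails its check
    print([lb.route() for _ in range(4)])
```

In the AWS setup described above, the ASG would meanwhile replace the failed instance, and the balancer would start routing to the replacement once it passes its health check.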

AWS Route 53 - DNS Service
● Managed DNS service in the AWS cloud
● You can register and park your domain name
(cloudperf.net) with the Route 53 service
● You own any subdomain, such as
techblog.cloudperf.net.
● Instead of relying on dynamically assigned names for cloud
instances, give them custom names
● Route 53 supports health checks and various
routing policies: Latency Routing, Failover
Routing, Geolocation Routing

Containers (OS Virtualization)
● Lightweight virtualization supported by
operating system (kernel) to create isolated
user-space instances
● No hypervisor is required!
● Containers can run instructions native to the CPU
without any special interpretation
● Goal is to create an application execution
environment that mimics a standard Linux
install without requiring a separate kernel

Docker Container
● Open source container service, similar to LXC
● Application-centric instead of machine-centric view
● More nimble than a VM due to its smaller footprint
● Portable across data centers and public clouds
● Immutable. Changes to the container image are lost on termination.
● Supports open standard libraries: libcontainer, libswarm, ...
● Containers are created from a read-only template called an image
○ Simple templates (Dockerfile or Docker Compose) are used to build
Docker images for single- and multi-container applications.
● All required dependencies (code, runtime, system tools, libraries, etc.)
are baked into the container, which allows the software to run the same way
regardless of environment

Docker Image
A Docker image is built from multiple layers:
● Base: boot file system. Unmounted after the
container is booted
● rootfs: Can be any Linux distro (Ubuntu,
Red Hat, ...). Mounted as a read-only root file system
● union mount: Docker uses a union mount to add
more read-only file systems (called images) on
top of the root file system.
● Container image: When a container is
launched, a read-write file system is mounted on top.
This is where the application/process inside the Docker
container runs.
Docker images are stored in a public or private registry from which they
can be downloaded and run on the cluster
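The layering described above behaves much like Python's `collections.ChainMap`: lookups fall through from the top layer to the ones beneath it, and writes land only in the top layer. The sketch below uses that as an analogy for a union mount; the file paths and contents are hypothetical, and a real union file system handles more cases (deletions, for instance).

```python
from collections import ChainMap

# Read-only layers, lowest first: root file system, then an added image layer.
rootfs = {"/etc/os-release": "Ubuntu", "/bin/sh": "shell"}
app_layer = {"/app/server.jar": "v1", "/etc/os-release": "Ubuntu (patched)"}

# The container adds an empty read-write layer on top of the stack.
writable = {}
container_fs = ChainMap(writable, app_layer, rootfs)

# Reads fall through the layers; the topmost copy of a path wins.
print(container_fs["/etc/os-release"])   # Ubuntu (patched)
print(container_fs["/bin/sh"])           # shell

# Writes go to the top layer only; the image layers stay untouched.
container_fs["/tmp/scratch"] = "data"
print(sorted(writable))                  # ['/tmp/scratch']
```

Discarding `writable` models container termination: everything written at run time is gone, while the image layers can be reused by the next container.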

Container (Docker) Orchestration
● An orchestration framework is required to manage the fleet of cloud instances
where Docker containers are deployed
● A container orchestration framework abstracts the infrastructure and presents
the entire fleet of instances (the cluster) as a single deployment target.
● Container orchestration typically involves container scheduling,
deployment, replication, scaling, monitoring, management, and failover
● Public Cloud container services:
○ Amazon ECS
○ Azure Container Service
○ Google Container Engine
● Popular container orchestration frameworks:
○ Kubernetes
○ Mesos
○ Docker Swarm
○ CoreOS Fleet

Cloud Monitoring and Resource Tagging (AWS)
CloudWatch
● AWS service for monitoring AWS resource metrics to gain visibility on resource
utilization and performance
● Application can also store custom metrics and logs into CloudWatch
● Metrics can be polled to raise an alarm or alert when a threshold is met.
Tagging
● AWS allows cloud resources to be tagged using a key and a value (optional).
● Allows companies to perform internal tracking of resource usage across departments
(sales, marketing) and billing.
● Tags can also be used to identify cloud resources used in prod and test environments
● Cloud resources with tags can be searched and filtered so that an action can be performed on them as a group.
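Tag-based grouping and cost tracking can be sketched over a plain list of resource records. The resource IDs, tag keys and cost figures below are hypothetical; a real script would obtain the records from the cloud provider's API.

```python
from collections import defaultdict

resources = [
    {"id": "i-001", "tags": {"env": "prod", "dept": "sales"}, "monthly_cost": 120.0},
    {"id": "i-002", "tags": {"env": "test", "dept": "sales"}, "monthly_cost": 30.0},
    {"id": "i-003", "tags": {"env": "prod", "dept": "marketing"}, "monthly_cost": 80.0},
    {"id": "vol-01", "tags": {}, "monthly_cost": 10.0},
]


def filter_by_tags(items, **wanted):
    """Return the resources whose tags match every requested key/value."""
    return [r for r in items
            if all(r["tags"].get(k) == v for k, v in wanted.items())]


def cost_by_tag(items, key):
    """Sum monthly cost per value of one tag key; untagged items are grouped."""
    totals = defaultdict(float)
    for r in items:
        totals[r["tags"].get(key, "untagged")] += r["monthly_cost"]
    return dict(totals)


if __name__ == "__main__":
    print([r["id"] for r in filter_by_tags(resources, env="prod")])
    print(cost_by_tag(resources, "dept"))
```

The "untagged" bucket is a useful check in practice: resources that land there cannot be attributed to any department.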

Multi-Tier Cloud Security (AWS)
● AWS Security Keys: used for cloud resource provisioning
● Key Pair: public/private key to authenticate ssh/login into cloud instance
● IAM Users/groups: are given limited access to cloud resources by attaching a
policy that lists cloud resources that a user/group can provision or access.
● IAM Roles: Allow assigning temporary security credentials to an application
running on a cloud instance or mobile device for access to AWS services and
resources. Example: InstanceProfile, AssumeRole API
● Security Group (SG): Implements security (Firewall) at the cloud instance
● Access Control List (ACL): Implement security at the network level
● AWS CloudTrail Service: Event history of account activity to perform security
analysis, resource charge tracking and troubleshooting.
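An IAM policy is a JSON document listing which actions are allowed or denied on which resources. The sketch below builds one such document and evaluates requests against it with a drastically simplified rule set (default deny, explicit deny wins, `*` wildcards). The bucket name is hypothetical, and the evaluator is an illustration, not a reimplementation of AWS policy evaluation.

```python
from fnmatch import fnmatchcase

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-backup-bucket",
                "arn:aws:s3:::example-backup-bucket/*",
            ],
        }
    ],
}


def _as_list(value):
    """Policy fields may hold a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def _matches(statement, action, resource):
    return (any(fnmatchcase(action, p) for p in _as_list(statement["Action"]))
            and any(fnmatchcase(resource, p) for p in _as_list(statement["Resource"])))


def allows(policy_doc, action, resource):
    """Simplified evaluation: explicit Deny wins, otherwise need an Allow."""
    hits = [s for s in policy_doc["Statement"] if _matches(s, action, resource)]
    if any(s["Effect"] == "Deny" for s in hits):
        return False
    return any(s["Effect"] == "Allow" for s in hits)


if __name__ == "__main__":
    obj = "arn:aws:s3:::example-backup-bucket/logs/a.gz"
    print(allows(policy, "s3:GetObject", obj))
    print(allows(policy, "s3:DeleteObject", obj))
```

Attaching a policy like this to a user, group or role is what limits them to the listed resources.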

Multi-Tier Cloud Security (AWS) - Cont.
● Meets several compliance requirements (financial, healthcare, govt.)
● DDoS Mitigation
● Data encryption in transit (TLS) and at rest for better data privacy
● Highly secure AWS data centers
● Multi-Factor Authentication for privileged accounts.
● Integration with corporate directory using AWS Directory Service to easily
migrate directory aware on-premises workloads.
