Distributed Systems

This document introduces the concept of Cloud Native Computing and the problem context it addresses.
Application development and infrastructure platforms are undergoing significant change, driven by the need for rapid application releases, efficiency, hyperscale and fault tolerance.
Traditional monolithic architectures are proving extremely expensive to develop, deploy and operate as scale, availability and reliability requirements grow. Distributed Systems are rapidly gaining momentum in enterprise environments as their advantages in addressing these issues become clear.
While Distributed Systems are very attractive, they require considerable and sophisticated systems engineering capability to manage and operate. Enterprises are continuously evaluating tools and platforms that can simplify the development and management of distributed systems.
Cloud Native Computing defines a technology, operations and organizational
model that will enable enterprises to operate in this new environment effectively
and efficiently.
Monolithic Systems and Scale Challenges
Monolithic Systems combine disparate application functionality and concerns
into a single large component with deep and brittle dependencies among them.
As the scale and size of applications and data grow, it becomes increasingly difficult to manage availability, performance, reliability, scalability, manageability and cost with traditional monolithic architectures.
Monolithic architectures present the following challenges:
• Long development cycles
• Lock-in to particular technology stack without the ability to adopt the best
tool for the job
• Considerable impedance to rapid releases and deployments
• Technology stack insufficiency can force a rewrite of the entire application
• Scaling logic is usually application specific and is expensive to develop
and maintain
Distributed Systems address the challenges posed by monolithic architectures by decomposing an application into components, each handling distinct functional and non-functional concerns and interacting through consistent interfaces.
Background & Overview
“A distributed system is a software system in which components located on
networked computers communicate and coordinate their actions by passing messages”
– Wikipedia
ISHI Systems | Introduction to Cloud Native Computing | 2
Key Characteristics of a Distributed Application
• Availability: Distributed Systems are designed for high availability by eliminating single points of failure, allowing rapid recovery from partial system failures, and degrading gracefully when problems occur.
• Performance: Distributed Systems achieve high performance by parallelizing user requests and task processing across a cluster of compute nodes.
• Consistency: Consistency ensures that data written to a system is valid and up to date for all distributed components reading it. Per the CAP theorem, Distributed Systems often relax strict consistency and are built around eventual consistency.
• Scalability: Distributed Systems are designed to be horizontally scalable, with the ability to elastically scale just the components under load.
• Manageability: Distributed systems introduce considerably more moving parts
leading to multiple points and scenarios of failure. Important considerations for
effective operational management are the visibility to monitor and diagnose
problems as they occur, the ease of rolling out partial or complete updates to
the system and the ability to recover successfully from failures.
• Cost: Cost efficiency in a distributed system is typically multi-dimensional: the system makes efficient use of hardware resources by scaling up and down only as required, and supports automation, thereby reducing or simplifying manual operational work for deployment and operations.
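The eventual-consistency trade-off noted above can be sketched with a toy key-value store (all class and key names here are hypothetical, not any real product's API): writes land on the primary immediately, replicas lag until a replication pass runs, and a replica read in between can return a stale value.

```python
class Replica:
    """A toy key-value replica used to illustrate eventual consistency."""
    def __init__(self):
        self.data = {}

class EventuallyConsistentStore:
    """Writes hit the primary immediately; replicas lag until replicate()."""
    def __init__(self, n_replicas=2):
        self.primary = Replica()
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.pending = []  # replication log of (key, value) not yet applied

    def write(self, key, value):
        self.primary.data[key] = value
        self.pending.append((key, value))

    def read_replica(self, i, key):
        # May return a stale value before replication completes.
        return self.replicas[i].data.get(key)

    def replicate(self):
        # Apply the pending log to every replica; afterwards all nodes agree.
        for key, value in self.pending:
            for r in self.replicas:
                r.data[key] = value
        self.pending = []

store = EventuallyConsistentStore()
store.write("user:1", "alice")
stale = store.read_replica(0, "user:1")   # None: replica not caught up yet
store.replicate()
fresh = store.read_replica(0, "user:1")   # "alice": replicas have converged
```

The window between `write` and `replicate` is exactly the window in which clients of an eventually consistent system can observe stale data.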
Distributed Systems Adoption

The industry is slowly transitioning from monolithic to distributed systems, driven by scale and availability requirements. Industry leaders like Facebook, Google and LinkedIn are driving the transition of core systems like data stores, messaging, search, caching, logging and storage to distributed architectures.
Following the same lead, organizations are also transforming their business applications into distributed architectures. Netflix and Twitter are good examples of organizations running their entire application portfolio in a distributed manner.
Cloud Platform for Distributed Systems
Cloud Native Computing is a comprehensive cloud platform strategy to design,
develop and manage distributed applications.
Cloud Native Computing defines technology, architecture, organizational structure,
application delivery and operations driven by the need for operational efficiency,
hyper scale, fault tolerance and rapid application releases. Cloud Native Computing
aims to achieve 10x-50x improvement in IT efficiency driven by the following key
characteristics:
• End to End Operations Automation
• High Developer Velocity
• Increased Utilization
• Automatic Fault Tolerance
• Auto Scaling
Key tenets of Cloud Native Computing
• Distributed – Designed for distributed applications
• Multi-tenant – Run heterogeneous applications on the same platform
improving efficiency
• Fault Tolerant – Built around inevitable hardware and software failures
without impacting applications
• Scalable – Scales to tens of thousands of nodes
• Automation Ready – To enable a small operational footprint and lend itself to easy management, everything is automated, including environment provisioning, scale-up, scale-down, recovery and rollback
• Continuous Deployment – Ability to do end to end deployment triggered by
any relevant event
• Efficient – Efficient utilization of resources leveraging bin packing and
resource pooling
• Secure – Enables secure, isolated application and data environments
Distributed
Cloud Native Computing platforms are built for distributed applications and
address key cross-cutting concerns relevant to distributed applications such
as scalability, fault tolerance and efficiency.
Multi-Tenant
The platform is fundamentally multi-tenant. Applications with diverse resource
requirements are deployed on the same environment to take advantage of
resource synergies that exist.
For example, a long-running web application can coexist with batch analytics services that tap into available resources when demand for the web application is low.
Fault Tolerant
Hardware and software failures are inevitable while operating large distributed
applications. The platform takes care of handling failures and ensures that
applications consistently return to a stable state.
The platform constantly monitors for failures and ensures the required number of instances is available at all times. When it detects a failure, it brings up another instance of the component promptly and seamlessly.
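The monitor-and-recover behavior described here follows the reconciliation-loop pattern. A minimal sketch (the data shapes and action names are illustrative assumptions, not any real platform's API): compare the desired instance count with what is actually running and healthy, then emit the corrective actions.

```python
def reconcile(desired_count, running_instances):
    """One pass of a toy reconciliation loop: compare the desired instance
    count against what is actually running, and return the corrective
    actions a platform would take. Illustrative sketch only; real platforms
    implement this pattern inside their controllers."""
    healthy = [i for i in running_instances if i["healthy"]]
    actions = []
    # Terminate instances that have been detected as failed.
    for inst in running_instances:
        if not inst["healthy"]:
            actions.append(("terminate", inst["id"]))
    # Start enough new instances to restore the desired count.
    for _ in range(desired_count - len(healthy)):
        actions.append(("start", "new-instance"))
    return actions

instances = [
    {"id": "a", "healthy": True},
    {"id": "b", "healthy": False},   # detected failure
    {"id": "c", "healthy": True},
]
actions = reconcile(desired_count=3, running_instances=instances)
```

Running this loop continuously is what keeps the system converging back to its desired state after every failure.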
Scalable
Cloud Native Computing platforms are designed to support thousands of applications and are required to scale to tens of thousands of nodes without any impact on performance.
The platform supports automatic scaling within applications or across the platform based on key metrics like average memory and CPU utilization.
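Utilization-driven scaling of this kind is often computed with a simple proportional rule. The sketch below assumes desired = ceil(current × observed / target), a common heuristic; the function name, parameters and defaults are illustrative assumptions rather than any specific platform's exact algorithm.

```python
import math

def scale_decision(current_replicas, cpu_utilization, target=0.50,
                   min_replicas=1, max_replicas=100):
    """Toy utilization-based autoscaler: pick a replica count that would
    bring average CPU back near the target, using the proportional rule
    desired = ceil(current * observed / target)."""
    desired = math.ceil(current_replicas * cpu_utilization / target)
    # Clamp to configured bounds so scaling never runs away.
    return max(min_replicas, min(max_replicas, desired))

scale_up = scale_decision(current_replicas=4, cpu_utilization=0.90)    # overloaded
scale_down = scale_decision(current_replicas=4, cpu_utilization=0.10)  # mostly idle
```

With a 50% target, four replicas running at 90% CPU scale out to eight, while four replicas at 10% CPU shrink to the one-replica floor.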
Automation Ready
Cloud Native Computing platforms enable full automation by providing API hooks
into the platform. This allows developers to automate key actions, monitor events,
make changes and respond without manual intervention.
Continuous Deployment
Cloud Native Computing platforms, along with automated testing, enable continuous and automated deployment of entire applications or application components without any disruption of service. The platform can perform automated deployments triggered by development events and roll back on failure.
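The deploy-then-roll-back-on-failure behavior can be sketched as follows (hypothetical function names; real platforms wire this logic into their deployment pipelines):

```python
def deploy(new_version, current_version, health_check):
    """Toy deploy-with-rollback: activate a new version, run a health
    check against it, and roll back automatically if the check fails.
    Illustrative sketch of the behavior described above, not a real
    platform API."""
    active = new_version
    if not health_check(active):
        active = current_version          # automatic rollback
        return {"active": active, "rolled_back": True}
    return {"active": active, "rolled_back": False}

# A passing health check keeps the new version; a failing one rolls back.
ok = deploy("v2", "v1", health_check=lambda version: True)
bad = deploy("v3", "v2", health_check=lambda version: False)
```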
Efficient
Cloud Native Computing platforms are lightweight and enable effective management and utilization of resources. They allow fine-grained allocation of resources, including memory, CPU, disk and network, and support elastic growth and ramp-down based on actual usage.
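The dense, fine-grained use of resources described here is often achieved with bin-packing heuristics. Below is a sketch of the classic first-fit-decreasing heuristic, placing task demands onto nodes of fixed capacity (the resource units are arbitrary and illustrative, not a real scheduler's model):

```python
def first_fit_decreasing(task_demands, node_capacity):
    """First-fit-decreasing bin packing: sort demands largest-first, then
    place each on the first node with room, opening a new node only when
    nothing fits. A simple heuristic of the kind schedulers use to pack
    tasks densely onto machines."""
    nodes = []  # each node is a list of the task demands placed on it
    for demand in sorted(task_demands, reverse=True):
        for node in nodes:
            if sum(node) + demand <= node_capacity:
                node.append(demand)
                break
        else:
            nodes.append([demand])  # open a new node
    return nodes

placement = first_fit_decreasing([4, 8, 1, 4, 2, 1], node_capacity=10)
```

Here six tasks totalling 20 units pack perfectly onto two nodes of capacity 10, instead of spreading across more machines and leaving capacity stranded.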
Secure
The platform has programmatic security built in, around the key principles of data and process isolation, auditing and continuous monitoring, and automated detection and remediation.
Containers
Container technology forms the packaging, delivery and operating layer for application components. Containers run directly on the host OS while maintaining operational and resource isolation, eliminating the need for traditional virtualization. A container also encapsulates all the dependencies of an application component, thereby simplifying the host OS layer.
Distributed Application Platform
The platform provides the management and orchestration layer for your
applications and handles resource management, fault tolerance, scalability,
management, monitoring and security.
Technology Components of Cloud Native Computing
Cloud Operating System
Operating systems for distributed application environments have to run across thousands of nodes with high uptime. This requires an OS that is container friendly, lightweight, secure, always on and highly manageable.
Organizational Perspective

A Cloud Native Computing approach affects technology as well as the operational and cultural aspects of an organization, so it is critical that the success of such an approach be measured appropriately.
To drive a successful transformation, it is critical to make key operational and organizational changes along with technology adoption. The following organizational changes are relevant for effective Cloud Native Computing adoption.
Distributed Systems Expertise
Deep organizational understanding of distributed systems including architectural
and design patterns, scaling and fault tolerance concepts, operational
complexities, networking and security concerns in distributed systems is key
to adopting a Cloud Native Computing strategy.
Integrated DevOps & Development Teams
Automation is a key aspect of the Cloud Native Computing strategy and has to be integrated deep into the development, testing and release processes. This requires teams to identify the automation requirements relevant to the application, integrate with the platform, and understand the key operational events and the tooling that will aid diagnostics and troubleshooting, including detailed logging and monitoring.
An integrated team also gives application developers an operational perspective that informs the architecture and design of the system.
Platform Engineering
A platform is a robust engineering substrate that encapsulates and handles critical cross-cutting concerns, allowing products and services to be launched faster and managed better.
As a Cloud Native Computing organization, it is key to develop a platform engineering mindset and to actively define, develop, operate and improve the core application platform. This requires a strong understanding of systems, networking, storage and distributed systems to drive the features, capabilities and stability of the platform.
A platform engineering approach avoids rethinking core system features every time an application is developed. It also aids automation by providing API hooks and toolsets into the platform to deploy, monitor, manage and operate applications.
Platform Drivers
The following are key drivers for a successful platform approach in an organization:
• Management commitment and funding - A clear, long-term roadmap around hiring, retention and training
• Driven by business and technology strategy for products and services -
Cost and mode of operations for products and critical technology choices
are key inputs
• Driven by an independent team with strong systems skills and full-stack awareness - Not afraid to use the best tools, enhance them, and invent new ones when existing ones are not sufficient
• Development and DevOps teams as customers for the platform
Summary

Cloud Native Computing is a core foundation of next generation applications and businesses. It is transforming how products are developed, deployed and operated.
Cloud Native Computing defines a comprehensive strategy to develop, deploy and operate cost-effective, robust, scalable and reliable applications. It identifies a core set of cloud technology platforms and practices that are key to operating in a cloud native model.
Cloud Native Computing also defines the organizational capabilities that are key to transitioning to a cloud native model. Distributed Systems Architecture, DevOps and Platform Engineering are key disciplines for organizations making this transition.
About the Author

Saju Thomas is the practice lead for Cloud Native Computing services at ISHI Systems. He has 15 years of experience in developing, deploying and operating high-performance distributed systems. He has held a variety of roles within ISHI Systems in Strategy, Product Management and Application Architecture. He has a B.S. in Computer Science from Madras University.
About Us

About ISHI Systems: Deep Expertise and Production Experience
Over the years, we have developed a keen understanding of the evolution of enterprise application platforms, including the opportunities and challenges they present. We have invested in understanding the architectures, performance, use cases, roadmaps and automation of these application platforms. Against this background, we promote Cloud Native Computing as a game changer for enterprises: it allows them to achieve efficiencies and scale that only Twitter, Facebook and other leading webscale companies have achieved. We have helped customers migrate to Cloud Native Computing models and achieve significant value by improving utilization and by developing and operating highly scalable and reliable applications.
We are part of the Cloud Native Computing Foundation (www.cncf.io) and
work with technology and user communities to drive the direction of key
technologies based on our experience and customer feedback.
Contact Us

US | UK | INDIA | SWITZERLAND
Website: www.ishisystems.com
Email: Sales@ishisystems.com
Phone: 1.201.942.3923

Ishi Systems Inc.
Harborside Financial Center Plaza 5,
Suite 1400, Jersey City, NJ 07311
Glossary

Cloud Native Computing: Cloud Native Computing features systems that are container packaged, dynamically managed and microservices oriented.

Apache Mesos: Mesos is a cluster manager modeled as a distributed systems kernel. The Mesos kernel runs on every machine and provides applications (e.g., Hadoop, Spark, Kafka, Elasticsearch) with APIs for resource management and scheduling across entire datacenter and cloud environments.

Mesosphere DCOS: A commercial product built around an Apache Mesos core that adds enterprise features like management, monitoring and security.

Kubernetes: Kubernetes is an open source orchestration system for Docker containers. It handles scheduling onto nodes in a compute cluster and actively manages workloads to ensure that their state matches the user's declared intentions. Using the concepts of "labels" and "pods", it groups the containers that make up an application into logical units for easy management and discovery.

CoreOS: CoreOS redefines the operating system as a smaller, more minimal Linux distribution. Traditional distros package unused software that leads to dependency conflicts and needlessly increases the attack surface. CoreOS ships a small base and gives you full control over all dependencies through the use of containers; think of containers as your package manager.

Cloud Native Computing Foundation: The Cloud Native Computing Foundation will create and drive the adoption of a new set of common container technologies informed by technical merit and end user value, and inspired by Internet-scale computing. The community will advance the state of the art for building cloud native applications and services. Historically, only a small number of companies willing to make significant investments in development and operations have been able to benefit from this model of computing; it has been out of reach for the average developer. The Foundation aspires to make the same approach that solved challenging scalability and efficiency problems for internet companies available to all developers.

CAP Theorem: In theoretical computer science, the CAP theorem, also known as Brewer's theorem, states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
• Consistency (all nodes see the same data at the same time)
• Availability (a guarantee that every request receives a response about whether it succeeded or failed)
• Partition tolerance (the system continues to operate despite arbitrary partitioning due to network failures)