Distributed Systems: Patterns and Practices 
John Brinnand 
Enterprise Architect: StubHub
Agenda 
● Introduction 
● Why Distributed Systems – what problem do they solve? 
● Types of Distributed Systems 
● Common strategies and patterns in distributed systems 
● Conclusion 
● Questions
What is a distributed system? 
● A distributed system is a software system in which components located on 
networked computers communicate and coordinate their actions by passing 
messages. Wikipedia 
– A Distributed system is an Ecosystem – or a set of systems working together to provide a 
service, functionality or behavior for clients. 
● The behavior is uniform – it appears to come from a single source, but in fact it comes 
from a set of systems interacting to produce that behavior. 
● The components (systems) know of their peers and work together, passing messages 
between each other in order to: 
– service user requests; 
– detect and respond to failures; 
– adapt to changing conditions
Vertical Scaling: Problems 
● What problems do distributed systems solve; why not build bigger and bigger 
machines to address increasing demand? 
● Single points of failure – the bigger they are, the harder they fall 
– When the big system goes down, everything it contains goes down. 
● NOC builds disaster recovery, failover strategies, constant monitoring. 
● Ops becomes failure sensitive, vigilant and risk averse. 
● Elastic demand - How to size system resources for elastic demand? 
– At peak times (Thanksgiving, Christmas, Valentine's Day, etc.) demand 
increases. Hordes of consumers descend upon eCommerce sites 
simultaneously, causing system meltdown. 
– Off-season – usage is bursty. Sometimes steady, sometimes slow and 
sometimes relatively idle. 
● Business Impacts 
– Increased expenditure 
– Failure results in loss of current and future business 
● Loss of customer confidence, 
● Negative Brand impact 
– Competitive edge: newer software features take time to be installed. 
Development is fast, Ops is slow.
Solution: Horizontal Scalability – Adaptive Systems 
● Big systems are made of many smaller systems working together. 
– Each individual system has a single capability. To service a request, it delegates to one or more peers for 
the capabilities it does not have. Responses from its peers are processed and presented to 
the user. 
[Diagram: five interconnected nodes, Node 1 to Node 5] 
● Horizontal Scalability by itself is not an Adaptive System. 
– So what is an adaptive system? 
● Message based 
● Network dependent 
● Failure Isolation 
● Optimized Deployment 
● Elastic: on demand 
● Service addition and removal 
● Parallel Development 
● High Failure Rates
Solution: Horizontal Scalability – Adaptive Systems 
[Diagrams: a five-node cluster (Node 1 to Node 5); an Admin service replacing Node 2; an Admin service managing a cluster with extra instances of Node 1 and Node 4] 
● Embrace Failure 
– Self Healing: Make the system “self-aware”. If one component fails (which it will), “spin up” another instance. 
● Respond to demand 
– Increase and decrease capacity to meet changes in demand. 
● However – the system is still not fault tolerant 
– The Admin Service is a single point of failure.
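The self-healing and elastic behavior described on this slide can be sketched as a small reconciliation loop. This is only an illustration under stated assumptions: the `Admin` class, its method names and the `node-N` identifiers are hypothetical and do not come from the talk.

```python
import itertools


class Admin:
    """Hypothetical admin service that keeps a desired number of instances alive."""

    def __init__(self, desired):
        self.desired = desired
        self.instances = {}              # instance id -> healthy flag
        self._ids = itertools.count(1)

    def spin_up(self):
        node_id = f"node-{next(self._ids)}"
        self.instances[node_id] = True
        return node_id

    def mark_failed(self, node_id):
        self.instances[node_id] = False

    def reconcile(self):
        """Drop failed instances, then add or remove instances to match demand."""
        for node_id in [n for n, ok in self.instances.items() if not ok]:
            del self.instances[node_id]
        while len(self.instances) < self.desired:
            self.spin_up()
        while len(self.instances) > self.desired:
            self.instances.popitem()     # removes the most recently added instance
        return sorted(self.instances)


admin = Admin(desired=3)
admin.reconcile()                        # initial deployment of three instances
admin.mark_failed("node-2")
after_failure = admin.reconcile()        # self-healing: the failed node is replaced
admin.desired = 5
after_scale_up = admin.reconcile()       # respond to increased demand
admin.desired = 2
after_scale_down = admin.reconcile()     # release capacity when demand drops
```

Note that the sketch has the same weakness the slide points out: if the single `Admin` object goes away, nothing heals or scales the cluster.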
Solution: Zookeeper 
● Clients connect to a single server. 
● All client requests are served from the in-memory data store on a server. 
● Servers send their data to the leader. 
● Leader stores the data in a data store. 
● A server responds to the client only after the leader has stored the data. 
● If a leader fails, a new leader is elected. 
● Clients reconnect to the next available server from their list of available ZooKeeper servers. 
● The data for each client is loaded into each server that services that client. 
[Diagram: a three-server ensemble (Server 1: follower, Server 2: leader, Server 3: follower) exchanging broadcast messages. All writes go to the leader, which maintains the data store (configuration, host IP and port, client data). Anatomy of a client: each ZK client contains a list of the ZK servers in the cluster.]
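The client-side reconnect behavior described above can be sketched without any ZooKeeper library. Everything here is an assumption made for illustration: the `Server` and `Client` classes, the `/config/host` path and its value are hypothetical, and a shared dictionary stands in for the replicated in-memory data.

```python
class ServerDown(Exception):
    pass


class Server:
    def __init__(self, name, store):
        self.name = name
        self.up = True
        self.store = store             # stands in for the replicated in-memory data

    def read(self, path):
        if not self.up:
            raise ServerDown(self.name)
        return self.store[path]


class Client:
    def __init__(self, servers):
        self.servers = list(servers)   # the client's list of ensemble members
        self.current = 0               # index of the server it is connected to

    def read(self, path):
        # Try each server at most once, starting with the current connection.
        for _ in range(len(self.servers)):
            server = self.servers[self.current]
            try:
                return server.name, server.read(path)
            except ServerDown:
                self.current = (self.current + 1) % len(self.servers)
        raise ServerDown("no server in the list is reachable")


store = {"/config/host": "10.0.0.5:8080"}
servers = [Server(f"server-{i}", store) for i in (1, 2, 3)]
client = Client(servers)

first = client.read("/config/host")    # served by server-1
servers[0].up = False
second = client.read("/config/host")   # the client moves on to server-2
```

Because every server holds the same data, the client sees the same value after the reconnect; only the server that answered changes.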
Patterns 
● Leader and Followers 
– Continuous communication between servers 
● (awareness of the presence or absence of a peer) 
● Leader election – dynamically elect a leader on startup and on failure conditions. 
– Leader manages common data store (which is the source of truth). 
● Common Data Store – single source of data (or state) which is distributed to all servers in the cluster or ensemble. 
● Expectation of Failure 
– Programming model, storage model, messaging model – all have failure recognition and failure recovery 
methodologies built-in. 
[Diagram, initial cluster / ensemble: Server 1 (follower), Server 2 (leader) and Server 3 (follower) exchange broadcast messages; the data store holds /, /app1, /app1/p_1, /app1/p_2 and /app1/p_3.] 
[Diagram, leader failure: the restructured ensemble after a new leader has been elected.]
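One common way to implement leader election on top of a shared tree like the one in the diagram is to have each member register a sequentially numbered node and treat the lowest number as the leader. The sketch below simulates that idea in memory; the `Ensemble` class and its methods are hypothetical, and this shows an application-level recipe rather than how ZooKeeper servers elect their own leader internally.

```python
class Ensemble:
    """In-memory stand-in for a group that registers sequential nodes under /app1."""

    def __init__(self):
        self._next_seq = 1
        self.members = {}              # node path -> server name

    def join(self, server, parent="/app1"):
        path = f"{parent}/p_{self._next_seq}"
        self._next_seq += 1
        self.members[path] = server
        return path

    def leave(self, path):
        # Called when a member fails or disconnects.
        self.members.pop(path, None)

    def leader(self):
        if not self.members:
            return None
        lowest = min(self.members, key=lambda p: int(p.rsplit("_", 1)[1]))
        return self.members[lowest]


ensemble = Ensemble()
p_leader = ensemble.join("server-2")   # first to register, so it leads
ensemble.join("server-1")
ensemble.join("server-3")

initial_leader = ensemble.leader()
ensemble.leave(p_leader)               # the leader fails
new_leader = ensemble.leader()         # the next-lowest sequence number takes over
```

The join order is chosen so that Server 2 starts as the leader, matching the diagram; any other order would work the same way.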
Pattern: Stateless Applications 
Discovery Service, Load Balancing 
[Diagram, cluster configuration: initial deployment. ZooKeeper coordinates three groups of services (Blue, Light Orange, Green), each with Service 1, Service 2 and Service 3, one of which is the leader and the others followers. Client 1 holds a list of all services (Blue: 1, 2, 3; Light Orange: 1, 2, 3; Green: 1, 2, 3) and an internal load balancer that sends round-robin requests to each service.] 
[Diagram, cluster configuration after a failure condition. Some services are down and new leaders have been elected. Client 1's list of all services is now Blue: 2, 3; Light Orange: 1, 3; Green: 1, 2, 3, and its internal load balancer sends round-robin requests to the remaining services.] 
● ZK Async notification: all services that are part of a “group” receive 
asynchronous notifications when any member of that group goes 
down. 
● ZK Leader Election: when a leader of a group goes down, 
zookeeper will elect a new leader. 
● Discovery Service built on Zookeeper notifies the client of the new 
cluster configuration. 
● Shared Data: All members of a group will receive data 
(configuration, events) published by any other member of the 
group.
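The client side of this pattern, a service list kept current by discovery notifications plus an internal round-robin load balancer, might look roughly like the following. The `DiscoveryClient` class and its callback name are hypothetical; the group names and membership lists are taken from the diagrams.

```python
import itertools


class DiscoveryClient:
    """Keeps a per-group service list and hands out services round robin."""

    def __init__(self, groups):
        self.groups = {}
        self._cursors = {}
        for group, services in groups.items():
            self.on_membership_change(group, services)

    def on_membership_change(self, group, services):
        # Callback a discovery service might invoke with the new configuration.
        self.groups[group] = list(services)
        self._cursors[group] = itertools.cycle(self.groups[group])

    def next_service(self, group):
        return next(self._cursors[group])


client = DiscoveryClient({
    "blue": [1, 2, 3],
    "light-orange": [1, 2, 3],
    "green": [1, 2, 3],
})
before = [client.next_service("blue") for _ in range(4)]

# Service 1 of the blue group goes down; the client is notified of the new list.
client.on_membership_change("blue", [2, 3])
after = [client.next_service("blue") for _ in range(4)]
```

Resetting the cursor on every notification keeps the example short; a real balancer might prefer to preserve its position in the rotation.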
Snapshot data - Problem 
[Diagram: Server 1 (follower), Server 2 (leader) and Server 3 (follower) each hold a copy of the /app1 tree (/app1/p_1, /app1/p_2, /app1/p_3), kept in sync through periodic updates / snapshots. Client 1 reads its own copy of the tree.] 
CAP Theorem: 
● Consistency – all nodes see the same data at the same time 
● Availability – a guarantee that every request receives a response about whether it was successful or failed 
● Partition Tolerance – the system continues to operate despite arbitrary message loss or failure of part of the system. 
Consistency: to synchronize the data, the system may have to be unavailable for a period of time even though it is fully operational. 
Availability: if the system is always available and keeps operating in spite of message loss and component failure, then the data may be inconsistent at any given point in time. 
Partition Tolerance: if the system continues to function when parts of it fail or cannot communicate, then it can remain available, but the data within it cannot be guaranteed to be consistent. 
So if Availability and Partition Tolerance are favored, how can a client get accurate or viable data?
Pattern: Snapshot data – Quorum Management 
[Diagram: the same three-server ensemble (Server 1: follower, Server 2: leader, Server 3: follower), each holding a copy of the /app1 tree kept in sync through periodic updates / snapshots. A Quorum Manager sits between Client 1 and the servers.] 
Quorum Manager 
● A quorum manager issues a request to a number of systems, takes the results, compares the timestamps (or vector clock) and returns the most up-to-date data back to the client. 
● A quorum manager can exist in the cluster (in each component) or external to the system as a service. 
According to Wikipedia, Quorum is the minimum number of members of a deliberative body necessary to conduct the 
business of that group. Ordinarily, this is a majority of the people expected to be there, although many bodies may 
have a lower or higher quorum.
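A quorum read as described above can be sketched in a few lines. This is a simplified illustration, not the talk's implementation: the `Replica`, `Versioned` and `QuorumManager` names are hypothetical, the quorum is assumed to be a simple majority, and plain integer timestamps are used instead of vector clocks.

```python
from dataclasses import dataclass


@dataclass
class Versioned:
    value: str
    timestamp: int


class Replica:
    def __init__(self, name, data=None, up=True):
        self.name = name
        self.data = data or {}
        self.up = up

    def get(self, key):
        if not self.up:
            raise ConnectionError(self.name)
        return self.data.get(key)


class QuorumManager:
    def __init__(self, replicas):
        self.replicas = replicas
        self.quorum = len(replicas) // 2 + 1   # simple majority (an assumption)

    def read(self, key):
        answers = []
        for replica in self.replicas:
            try:
                answers.append(replica.get(key))
            except ConnectionError:
                continue                       # unreachable replicas do not count
        if len(answers) < self.quorum:
            raise RuntimeError("quorum not reached")
        versions = [a for a in answers if a is not None]
        # Return the answer with the newest timestamp, if any replica had one.
        return max(versions, key=lambda v: v.timestamp) if versions else None


servers = [
    Replica("server-1", {"/app1/p_3": Versioned("v1", timestamp=1)}),
    Replica("server-2", {"/app1/p_3": Versioned("v2", timestamp=2)}),
    Replica("server-3", up=False),
]
manager = QuorumManager(servers)
latest = manager.read("/app1/p_3")             # two of three answered: quorum met

servers[1].up = False
try:
    manager.read("/app1/p_3")
    quorum_lost = False
except RuntimeError:
    quorum_lost = True                         # only one replica answered
```

The stale copy on server-1 is still consulted, but the manager returns the newer value from server-2, which is how a client can get viable data from snapshots that are not all in sync.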
Pattern: Data Lookup and Replication: 
HDFS 
[Diagram: Client 1 asks the NameNode where to read or write a file / data. The NameNode tracks which of DataNodes 1 to 5 hold each of the replicated blocks 1 to 6.] 
NameNode metadata (file name, replication factor, block IDs): 
/user/my-company/file-part-0, r:3, 1,3 
/user/my-company/file-part-1, r:3, 2,4 
/user/my-company/file-part-2, r:3, 5,6 
Example: WebHDFS, which first contacts the NameNode to find out which DataNodes to write to, or which DataNodes to read from. 
http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html 
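The lookup-then-read flow can be illustrated with a small in-memory model. The classes and the `read_file` helper are hypothetical and are not the HDFS or WebHDFS API; the block-to-DataNode placement and the block contents are made up for the example, since the slide's exact placement is not recoverable.

```python
class NameNode:
    """Holds only metadata: file -> block ids, block id -> DataNodes."""

    def __init__(self, files, block_locations):
        self.files = files
        self.block_locations = block_locations

    def locate(self, path):
        return [(b, self.block_locations[b]) for b in self.files[path]]


class DataNode:
    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.up = True

    def read_block(self, block_id):
        if not self.up:
            raise ConnectionError(self.name)
        return self.blocks[block_id]


def read_file(name_node, data_nodes, path):
    chunks = []
    for block_id, holders in name_node.locate(path):
        for holder in holders:                 # try each replica in turn
            try:
                chunks.append(data_nodes[holder].read_block(block_id))
                break
            except ConnectionError:
                continue
        else:
            raise IOError(f"no live replica for block {block_id}")
    return b"".join(chunks)


data_nodes = {f"dn{i}": DataNode(f"dn{i}") for i in range(1, 6)}
# Hypothetical placement with replication factor 3.
block_locations = {1: ["dn1", "dn2", "dn4"], 3: ["dn2", "dn3", "dn5"]}
payload = {1: b"hello ", 3: b"world"}
for block_id, holders in block_locations.items():
    for holder in holders:
        data_nodes[holder].blocks[block_id] = payload[block_id]

name_node = NameNode({"/user/my-company/file-part-0": [1, 3]}, block_locations)
data_nodes["dn1"].up = False                   # one replica holder is down
content = read_file(name_node, data_nodes, "/user/my-company/file-part-0")
```

The NameNode never serves file data itself; it only answers the lookup, and replication lets the read succeed even though one DataNode is down.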
Consistent Hashing – Replicated Data 
Cassandra 
[Diagram: a ring of three nodes (A, B, C). A consistent hash based on the namespace and key determines a position on the ring.] 
Find the node on the ring with a range of keys into which the current key falls. Write the data to that node. 
There are two write modes: 
● Quorum write: blocks until quorum is reached 
● Async write: sends the request to any node. That node will push the data to the appropriate nodes but return to the client immediately 
If the node is down, then write to another node with a hint saying where the data should be written. A harvester goes through every 15 minutes, finds hints and moves the data to the appropriate node.
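Finding the node for a key can be sketched with a sorted ring and a binary search. This is a generic consistent-hashing illustration, not Cassandra's partitioner: the `HashRing` class is hypothetical, MD5 is used only because it gives a stable hash across runs, and the "next nodes clockwise" replica placement is an assumption.

```python
import bisect
import hashlib


def ring_position(text):
    # Stable hash so that positions are the same on every run.
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        self._ring = sorted((ring_position(n), n) for n in nodes)

    def owners(self, namespace, key):
        """Return the primary node, then the next nodes clockwise as replicas."""
        positions = [p for p, _ in self._ring]
        start = bisect.bisect_left(positions, ring_position(f"{namespace}:{key}"))
        count = min(self.replicas, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


ring = HashRing(["A", "B", "C"])
owners = ring.owners("users", "42")    # primary first, then one replica
```

A quorum write would block until enough of the returned owners acknowledge, while an async write would hand the request to any node and return immediately.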
Consistent Hashing – Replicated Data 
Cassandra 
[Diagrams: the ring of nodes A, B and C in its initial state, after the node hosting B's data fails, and after a node is added.] 
Initial state of the cluster: note that all data (A, B, C) is replicated. If it were not, then a node's failure would result in data loss. 
If the node that was hosting B's data goes down, the node next to it on the ring will take over its data from B's replicated data and become the host for B's data. 
If a node is added to a partition, it will share some of the data that exists in that partition. The data it is responsible for is based on its hashed position in the ring. This results in a division of the keys between the two nodes. Interestingly, it promotes load balancing as well, since the load is now shared between two data nodes.
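The effect of adding or removing a node can be checked with a small experiment. The sketch below is a generic consistent-hashing model under assumed names (`owner`, node `D`, keys `key-0` to `key-999`), with MD5 as a stand-in hash and no replication or virtual nodes.

```python
import bisect
import hashlib


def ring_position(text):
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)


def owner(nodes, key):
    """First node clockwise from the key's position on the ring."""
    ring = sorted((ring_position(n), n) for n in nodes)
    positions = [p for p, _ in ring]
    index = bisect.bisect_left(positions, ring_position(key)) % len(ring)
    return ring[index][1]


keys = [f"key-{i}" for i in range(1000)]
before = {k: owner(["A", "B", "C"], k) for k in keys}

# Adding node D: a key either stays where it was or moves to D.
after_add = {k: owner(["A", "B", "C", "D"], k) for k in keys}
moved_on_add = {k for k in keys if before[k] != after_add[k]}

# Removing node B: only the keys B was hosting need a new home.
after_remove = {k: owner(["A", "C"], k) for k in keys}
moved_on_remove = {k for k in keys if before[k] != after_remove[k]}
```

In both cases the keys owned by the untouched nodes stay where they are, which is the property that makes consistent hashing attractive for clusters whose membership changes.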
Conclusion 
● Vertical scaling is expensive and error prone. 
● Horizontal scaling is elastic, responsive, fault tolerant and self-healing. 
● Distributed Systems affect all aspects of software development. 
– Programming models 
– Testing 
– Deployment 
– Maintenance 
● There are best practices and patterns for designing your distributed system. 
● Many existing systems (Cassandra, Hadoop, Solr, Riak, Netflix platform) are 
implementations of these patterns. Look under the hood. Use the patterns to “roll your own”.
Questions

 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto%in Soweto+277-882-255-28 abortion pills for sale in soweto
%in Soweto+277-882-255-28 abortion pills for sale in soweto
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
Harnessing ChatGPT - Elevating Productivity in Today's Agile Environment
Harnessing ChatGPT  - Elevating Productivity in Today's Agile EnvironmentHarnessing ChatGPT  - Elevating Productivity in Today's Agile Environment
Harnessing ChatGPT - Elevating Productivity in Today's Agile Environment
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 

SVCC-2014

  • 1. Distributed Systems: Patterns and Practices John Brinnand Enterprise Architect: StubHub
  • 2. Agenda ● Introduction ● Why Distributed Systems – what problem do they solve? ● Types of Distributed Systems ● Common strategies and patterns in distributed systems ● Conclusion ● Questions
  • 3. What is a distributed system? ● A distributed system is a software system in which components located on networked computers communicate and coordinate their actions by passing messages (Wikipedia). – A distributed system is an ecosystem: a set of systems working together to provide a service, functionality or behavior for clients. ● The behavior is uniform – it appears to come from a single source, but in fact it comes from a set of systems interacting to produce that behavior. ● The components (systems) know of their peers and work together, passing messages between each other in order to: – service users' requests; – detect and respond to failures; – adapt to changing conditions.
  • 4. Vertical Scaling: Problems ● What problems do distributed systems solve; why not build bigger and bigger machines to address increasing demand? ● Single points of failure – the bigger they are, the harder they fall – When the big system goes down, everything it contains goes down. ● NOC builds disaster recovery, failover strategies, constant monitoring. ● Ops becomes failure-sensitive, vigilant and risk-averse. ● Elastic demand – How to size system resources for elastic demand? – At peak times (Thanksgiving, Christmas, Valentine's Day, etc.) demand increases. Hordes of consumers descend upon eCommerce sites simultaneously, causing system meltdown. – Off-season – usage is bursty. Sometimes steady, sometimes slow and sometimes relatively idle. ● Business Impacts – Increased expenditure – Failure results in loss of current and future business ● Loss of customer confidence ● Negative brand impact – Competitive edge: newer software features take time to be installed. Development is fast, Ops is slow.
  • 5. Solution: Horizontal Scalability – Adaptive Systems ● Big systems are made of many smaller systems working together. – An individual system has a single capability. To service a request, it delegates to one or more peers for the capabilities it does not have, then processes their responses and presents the result to the user. [Diagram: five interconnected nodes, Node 1 to Node 5] ● Horizontal scalability by itself does not make an adaptive system. – So what is an adaptive system? [Diagram labels: message based; network dependent; failure isolation; optimized deployment; elastic on demand; service addition and removal; parallel development; high failure rates]
  • 6. Solution: Horizontal Scalability – Adaptive Systems ● Embrace failure – Self-healing: make the system “self-aware”. If one component fails (which it will), “spin up” another instance. [Diagram: Nodes 1 to 5 with an Admin service and a replacement Node 2] ● Respond to demand – Increase and decrease capacity to meet changes in demand. [Diagram: the Admin service alongside Nodes 1 to 5, with additional instances of Node 1 and Node 4] ● However – the system is still not fault tolerant – The Admin service is a single point of failure.
  • 7. Solution: Zookeeper ● Clients connect to a single server. ● All client requests are served from the in-memory data store on a server. ● Servers send their data to the leader. ● The leader stores the data in a data store. ● A server responds to the client only after the leader has stored the data. ● If a leader fails, a new leader is elected. ● Clients reconnect to the next available server from their list of available Zookeeper servers. ● The data for each client is loaded into each server that services that client. [Diagram, shown twice: Server 1, Server 2 and Server 3 exchange broadcast messages around a shared data store; all writes go to the leader. In the first copy Server 2 is the leader and Servers 1 and 3 are followers; in the second copy Server 1 is also labeled Leader. Several clients connect to the servers.] [Inset "Anatomy of a Client": a ZK client holds its configuration (host: IP and port), which contains a list of the ZK servers in the cluster, plus its client data.]
  • 8. Patterns ● Leader and Followers – Continuous communication between servers (awareness of the presence or absence of a peer). ● Leader election – Dynamically elect a leader on startup and on failure conditions. – The leader manages the common data store (which is the source of truth). ● Common Data Store – A single source of data (or state), which is distributed to all servers in the cluster or ensemble. ● Expectation of Failure – The programming model, storage model and messaging model all have failure recognition and failure recovery methodologies built in. [Diagram "Initial Cluster / Ensemble": Server 1 (Follower), Server 2 (Leader) and Server 3 (Follower) exchange broadcast messages around a data store holding the tree /, /app1, /app1/p_1, /app1/p_2, /app1/p_3.] [Diagram "Leader Failure: Restructured ensemble": the same layout after the leader has failed.]
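The leader-election pattern on this slide can be sketched in a few lines. This is a minimal in-memory illustration, not ZooKeeper's internal protocol and not its client API: it assumes the commonly used recipe in which every member registers under a sequence number (loosely mirroring the `/app1/p_1`, `/app1/p_2`, `/app1/p_3` entries in the diagram) and the live member with the lowest number is the leader. The `Ensemble` class and the server names are hypothetical.

```python
class Ensemble:
    """In-memory stand-in for a coordination service (hypothetical, not a ZooKeeper API)."""

    def __init__(self):
        self._next_seq = 1
        self._members = {}  # sequence number -> server name

    def join(self, name):
        """Register a server and hand it the next sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._members[seq] = name
        return seq

    def fail(self, name):
        """Drop a failed server, as an expired session would."""
        for seq, member in list(self._members.items()):
            if member == name:
                del self._members[seq]

    def leader(self):
        """The live member with the lowest sequence number leads."""
        if not self._members:
            return None
        return self._members[min(self._members)]


ensemble = Ensemble()
for server in ("server-2", "server-1", "server-3"):
    ensemble.join(server)

first_leader = ensemble.leader()   # server-2 joined first
ensemble.fail("server-2")          # the leader goes down
second_leader = ensemble.leader()  # next-lowest sequence number takes over
```

Because leadership is derived from membership, no separate "promote" step is needed: removing the failed leader is enough for every surviving member to agree on the successor.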
  • 9. Pattern: Stateless Applications (Discovery Service, Load Balancing) ● ZK async notification: all services that are part of a “group” receive asynchronous notifications when any member of that group goes down. ● ZK leader election: when the leader of a group goes down, Zookeeper will elect a new leader. ● A discovery service built on Zookeeper notifies the client of the new cluster configuration. ● Shared data: all members of a group receive data (configuration, events) published by any other member of the group. [Diagram "Cluster configuration: initial deployment": three service groups (Blue, Light Orange, Green) registered with Zookeeper, each with Services 1, 2 and 3 and one leader. Client 1 holds the list of all services (Blue: 1, 2, 3; Light Orange: 1, 2, 3; Green: 1, 2, 3) and an internal load balancer that sends requests round robin to each service.] [Diagram "Cluster configuration after a failure condition": Client 1's list becomes Blue: 2, 3; Light Orange: 1, 3; Green: 1, 2, 3, and new leaders have been elected where needed.]
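The client side of this pattern (a service list kept current by discovery notifications, plus an internal round-robin load balancer) might look roughly like the sketch below. The `RoundRobinClient` class, its method names and the instance names are assumptions made for illustration; a real client would receive the membership change from a ZooKeeper watch or a discovery service rather than from a direct method call.

```python
class RoundRobinClient:
    """Client-side load balancer over the instances of one service group (hypothetical)."""

    def __init__(self, instances):
        self._instances = list(instances)
        self._index = 0

    def on_membership_change(self, instances):
        """Called when the discovery service reports a new cluster configuration."""
        self._instances = list(instances)
        self._index = 0

    def next_instance(self):
        """Return the next live instance in round-robin order."""
        if not self._instances:
            raise RuntimeError("no live instances for this service group")
        instance = self._instances[self._index % len(self._instances)]
        self._index += 1
        return instance


# Initial deployment: the Blue group has instances 1, 2 and 3.
client = RoundRobinClient(["blue-1", "blue-2", "blue-3"])
before = [client.next_instance() for _ in range(4)]

# After a failure the discovery service reports Blue: 2, 3.
client.on_membership_change(["blue-2", "blue-3"])
after = [client.next_instance() for _ in range(3)]
```

Since the services are stateless, the client can drop a failed instance from the rotation without any session migration.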
  • 10. Snapshot data – Problem [Diagram: Server 1 (Follower), Server 2 (Leader) and Server 3 (Follower) each hold a copy of the tree /, /app1, /app1/p_1, /app1/p_2, /app1/p_3; Client 1 receives periodic updates / snapshots of /app1 and its children.] CAP Theorem: ● Consistency – all nodes see the same data at the same time. ● Availability – a guarantee that every request receives a response about whether it was successful or failed. ● Partition Tolerance – the system continues to operate despite arbitrary message loss or failure of part of the system. Consistent: to synchronize the data, the system will have to be unavailable for a period of time even though it is fully operational. Availability: if the system is always available and keeps operating in spite of message loss and component failure, then the data can be inconsistent at any given point in time. Partition Tolerance: if the system continues to function when parts of it fail, then it can be available but the data within it cannot be guaranteed to be consistent. So if Availability and Partition Tolerance are favored, how can a client get accurate or viable data?
  • 11. Pattern: Snapshot data – Quorum Management [Diagram: Server 1 (Follower), Server 2 (Leader) and Server 3 (Follower) each hold a copy of the tree /, /app1, /app1/p_1, /app1/p_2, /app1/p_3; a Quorum Manager sits between the servers and Client 1, which receives periodic updates / snapshots.] Quorum Manager: ● A quorum manager issues a request to a number of systems, takes the results, compares the timestamps (or vector clocks) and returns the most up-to-date data to the client. ● A quorum manager can exist in the cluster – in each component – or external to the system as a service. According to Wikipedia, a quorum is the minimum number of members of a deliberative body necessary to conduct the business of that group. Ordinarily, this is a majority of the people expected to be there, although many bodies may have a lower or higher quorum.
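A quorum read as described on this slide can be sketched as follows. The sketch assumes a simple majority quorum and plain integer timestamps (the slide also mentions vector clocks, which are not modeled here); the `Versioned` type, the `quorum_read` function and the sample data are hypothetical. Each replica is modeled as a dictionary, with `None` standing in for a replica that does not respond.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # larger means newer


def quorum_read(replicas, key):
    """Ask every replica, require a majority of answers, return the newest version."""
    needed = len(replicas) // 2 + 1
    responses = 0
    newest = None
    for replica in replicas:
        if replica is None:  # unreachable replica
            continue
        responses += 1
        candidate = replica.get(key)
        if candidate is not None and (newest is None or candidate.timestamp > newest.timestamp):
            newest = candidate
    if responses < needed:
        raise RuntimeError(f"quorum not reached: {responses} of {needed} required responses")
    return newest


replicas = [
    {"/app1/p_1": Versioned("stale", timestamp=10)},
    None,  # this server is down
    {"/app1/p_1": Versioned("fresh", timestamp=12)},
]
result = quorum_read(replicas, "/app1/p_1")
```

With two of three replicas answering, the majority is met and the value carrying the higher timestamp is returned, even though one replica still holds stale data.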
  • 12. Pattern: Data Lookup and Replication: HDFS [Diagram, after http://hadoop.apache.org/docs/r1.2.1/hdfs_design.html: Client 1 sends a request to read or write a file to the NameNode, then exchanges data with DataNodes 1 to 5, which hold replicated blocks numbered 1 to 6. The NameNode keeps one entry per file part, with its replication factor and block IDs: /user/my-company/file-part-0, r:3, blocks 1, 3; /user/my-company/file-part-1, r:3, blocks 2, 4; /user/my-company/file-part-2, r:3, blocks 5, 6.] Example: WebHDFS, which first contacts the NameNode to find out which DataNodes to write to or which DataNodes to read from.
  • 13. Consistent Hashing – Replicated Data: Cassandra [Diagram: a ring divided into key ranges A, B and C] The hash is based on the namespace and key. Find the node on the ring whose range of keys contains the current key, and write the data to that node. There are two write modes: ● Quorum write: blocks until quorum is reached. ● Async write: sends the request to any node. That node will push the data to the appropriate nodes but returns to the client immediately. If the target node is down, the data is written to another node with a hint saying where it should have been written. A harvester runs every 15 minutes, finds the hints and moves the data to the appropriate node.
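The hint mechanism described on this slide can be illustrated with a small in-memory model. This is a sketch of the idea only, not Cassandra's implementation: the `Node` class, the `write` and `replay_hints` functions and the node names are hypothetical, and the periodic harvester is reduced to a function that is called by hand.

```python
class Node:
    """A storage node that can also hold hints for unavailable peers (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []  # (target node name, key, value)


def write(key, value, target, fallback):
    """Write to the target node, or leave a hint on the fallback node if the target is down."""
    if target.up:
        target.data[key] = value
    else:
        fallback.hints.append((target.name, key, value))


def replay_hints(holder, nodes_by_name):
    """Harvester pass: deliver hints whose target is back up, keep the rest."""
    remaining = []
    for target_name, key, value in holder.hints:
        target = nodes_by_name[target_name]
        if target.up:
            target.data[key] = value
        else:
            remaining.append((target_name, key, value))
    holder.hints = remaining


node_b, node_c = Node("B"), Node("C")
nodes = {"B": node_b, "C": node_c}

node_b.up = False
write("user:42", "alice", target=node_b, fallback=node_c)  # stored as a hint on C
hints_while_down = len(node_c.hints)

node_b.up = True
replay_hints(node_c, nodes)  # the harvester moves the data to B
```

The client's write succeeds while B is down, and B receives the data once it is reachable again.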
  • 14. Consistent Hashing – Replicated Data: Cassandra [Diagram: "Initial state of the cluster", a ring with key ranges A, B and C, followed by the ring after a node failure and after a node is added] Note that all data (A, B, C) is replicated. If it were not, a node's failure would result in data loss. If the node that was hosting B's data goes down, the node next to it on the ring takes over B's data from B's replicated copy and becomes the host for B's data. If a node is added to a partition, it will share some of the data that exists in that partition. The data it is responsible for is based on its hashed position in the ring. This results in a division of the keys between the two nodes. Interestingly, it promotes load balancing as well, since the load is now shared between two data nodes.
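The ring behavior on the two consistent-hashing slides (lookup by hashed position, takeover by the next node on failure, key sharing when a node is added) can be sketched as below. This is a simplified model under stated assumptions: one ring position per node, MD5 as the hash function purely for illustration (not a claim about Cassandra's partitioner), and replicas placed on the next nodes clockwise. The `HashRing` class and the node and key names are hypothetical.

```python
import bisect
import hashlib


def ring_position(name):
    """Map a node name or key to a position on the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=()):
        self._positions = []  # sorted ring positions
        self._owners = {}     # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        position = ring_position(node)
        bisect.insort(self._positions, position)
        self._owners[position] = node

    def remove(self, node):
        position = ring_position(node)
        self._positions.remove(position)
        del self._owners[position]

    def preference_list(self, key, replicas=2):
        """Primary owner first, then the next nodes clockwise, which hold the replicas."""
        if not self._positions:
            return []
        start = bisect.bisect_right(self._positions, ring_position(key))
        count = min(replicas, len(self._positions))
        return [
            self._owners[self._positions[(start + i) % len(self._positions)]]
            for i in range(count)
        ]


ring = HashRing(["node-a", "node-b", "node-c"])
primary, replica = ring.preference_list("user:42")

# If the primary fails, the next node on the ring already holds the replica
# and becomes the new host for the key.
ring.remove(primary)
new_primary = ring.preference_list("user:42")[0]
```

Adding a node only moves the keys that fall into the new node's range; every other key keeps its previous owner, which is what keeps rebalancing cheap.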
  • 15. Conclusion ● Vertical scaling is expensive and error-prone. ● Horizontal scaling is elastic, responsive, fault tolerant and self-healing. ● Distributed systems affect all aspects of software development. – Programming models – Testing – Deployment – Maintenance ● There are best practices and patterns for designing your distributed system. ● Many existing systems (Cassandra, Hadoop, Solr, Riak, Netflix platform) are implementations of these patterns. Look under the hood. Use the patterns to “roll your own”.