O slideshow foi denunciado.
Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.
© 2016 Mesosphere, Inc. All Rights Reserved.
ERLANG &
CONTAINER
NETWORKING
@SARGUN
1
Sargun Dhillon, August 2016
© 2016 Mesosphere, Inc. All Rights Reserved.
WHO AM I?
2
© 2016 Mesosphere, Inc. All Rights Reserved.
AGENDA
3
• A brief history of how we got here
• How we’re trying to make it b...
© 2016 Mesosphere, Inc. All Rights Reserved.
A BRIEF
HISTORY OF
NETWORKS
IN THE DC
4
© 2016 Mesosphere, Inc. All Rights Reserved.
ORGANIZATIONS CIRCA 2007
5
•Before DevOps was first heard
•Clear differentiati...
© 2016 Mesosphere, Inc. All Rights Reserved.
SOFTWARE CIRCA 2007
6
•Different services glued together via CORBA, XML-RPC, S...
© 2016 Mesosphere, Inc. All Rights Reserved. 7
SaaS continued to grow at
an incredible rate
© 2016 Mesosphere, Inc. All Rights Reserved. 8
There became a race to ship faster
© 2016 Mesosphere, Inc. All Rights Reserved. 9
We kept the software alive
By feeding it
With Sysadmins
© 2016 Mesosphere, Inc. All Rights Reserved. 10
We kept the machines alive
By feeding them
With Blood
© 2016 Mesosphere, Inc. All Rights Reserved. 11
This wasn’t working
© 2016 Mesosphere, Inc. All Rights Reserved.
ORGANIZATIONS 2008+
12
•We began seeing a gradual shift in the industry where...
© 2016 Mesosphere, Inc. All Rights Reserved.
SOFTWARE CIRCA 2008+
13
•Popularization of Open Source tooling to automate mu...
© 2016 Mesosphere, Inc. All Rights Reserved.
CIRCA 2011
14
•Much of what’s been happening for the past half-decade hits ne...
© 2016 Mesosphere, Inc. All Rights Reserved.
CIRCA 2013
15
• Docker becomes instant hit and brings containers to the foref...
© 2016 Mesosphere, Inc. All Rights Reserved. 16
Everything was changing
© 2016 Mesosphere, Inc. All Rights Reserved. 17
Why?
© 2016 Mesosphere, Inc. All Rights Reserved. 18
Business Value
© 2016 Mesosphere, Inc. All Rights Reserved. 19
© 2016 Mesosphere, Inc. All Rights Reserved.
BENEFITS
20
•Reduction in cost of goods sold
•Smaller engineer to server rati...
© 2016 Mesosphere, Inc. All Rights Reserved. 21
But at what cost?
© 2016 Mesosphere, Inc. All Rights Reserved. 22
Complexity
© 2016 Mesosphere, Inc. All Rights Reserved.
OLD WORLD
23
© 2016 Mesosphere, Inc. All Rights Reserved.
NEW WORLD
24
© 2016 Mesosphere, Inc. All Rights Reserved.
DIVING DEEPER
25
© 2016 Mesosphere, Inc. All Rights Reserved. 26
It was time for something completely
new
© 2016 Mesosphere, Inc. All Rights Reserved.
ENTER
DC/OS
27
© 2016 Mesosphere, Inc. All Rights Reserved.
CONTAINERS
WITH
BATTERIES
INCLUDED
28
• Includes everything you need to move ...
© 2016 Mesosphere, Inc. All Rights Reserved. 29
In Your Datacenter
© 2016 Mesosphere, Inc. All Rights Reserved. 30
As a Black Box
© 2016 Mesosphere, Inc. All Rights Reserved. 31
But, our story starts:
~November 2015
(~8 months ago)
© 2016 Mesosphere, Inc. All Rights Reserved. 32
Networking Features
© 2016 Mesosphere, Inc. All Rights Reserved. 33
Service Discovery
(Mesos-DNS)
© 2016 Mesosphere, Inc. All Rights Reserved. 34
…Ish
© 2016 Mesosphere, Inc. All Rights Reserved. 35
DNS SRV Based Service Discovery
© 2016 Mesosphere, Inc. All Rights Reserved. 36
Everyone has DNS right?
© 2016 Mesosphere, Inc. All Rights Reserved. 37
And GLibc even has a bug open for it!
© 2016 Mesosphere, Inc. All Rights Reserved. 38
…Opened in 2005
© 2016 Mesosphere, Inc. All Rights Reserved. 39
We needed a solution that actually
worked
© 2016 Mesosphere, Inc. All Rights Reserved. 40
So, we performed an OODA loop
1.Observe
2. Orient
3. Decide
4. Act
© 2016 Mesosphere, Inc. All Rights Reserved.
OBSERVE: SERVICE DISCOVERY
41
• Existing Dynamic Service Discovery Solutions:...
© 2016 Mesosphere, Inc. All Rights Reserved.
OBSERVE: NETWORKING
42
• Everybody assumes IP per application instance
• Ever...
© 2016 Mesosphere, Inc. All Rights Reserved.
ORIENT:
DC/OS
NETWORKING
CORE TENANTS
43
•DC/OS must be agnostic to the under...
© 2016 Mesosphere, Inc. All Rights Reserved.
DECIDE
44
What do we want to build?
• Load Balancing
• Seamless Service Disco...
© 2016 Mesosphere, Inc. All Rights Reserved.
DC/OS ARCHITECTURE (~NOV ’15): POLYGLOT MICROSERVICES
45
© 2016 Mesosphere, Inc. All Rights Reserved. 46
What Did We Build?
© 2016 Mesosphere, Inc. All Rights Reserved.
ACT:
DEVELOPED
NEW ERLANG
SERVICES
47
• Service Discovery
• Navstar
• Spartan...
© 2016 Mesosphere, Inc. All Rights Reserved.
DC/OS ARCHITECTURE (TODAY): POLYGLOT MICROSERVICES
48
© 2016 Mesosphere, Inc. All Rights Reserved.
CODE METRICS
49
~14,000 LoC
118 Modules
7 Repos
1.5 Engineers
0.5 Consultants
© 2016 Mesosphere, Inc. All Rights Reserved. 50
DNS:
Spartan
© 2016 Mesosphere, Inc. All Rights Reserved. 51
HA DNS
© 2016 Mesosphere, Inc. All Rights Reserved. 52
/etc/resolv.conf
nameserver 8.8.8.8
nameserver 8.8.4.4
Normal DNS
© 2016 Mesosphere, Inc. All Rights Reserved. 53
Ideal DNS Request
© 2016 Mesosphere, Inc. All Rights Reserved. 54
Ideal DNS Request
© 2016 Mesosphere, Inc. All Rights Reserved. 55
But, sometimes things go wrong
© 2016 Mesosphere, Inc. All Rights Reserved. 56
Not so Ideal DNS Request
© 2016 Mesosphere, Inc. All Rights Reserved. 57
Wait at least 1 second
© 2016 Mesosphere, Inc. All Rights Reserved. 58
Not so Ideal DNS Request
© 2016 Mesosphere, Inc. All Rights Reserved. 59
Wait at least 1 second
© 2016 Mesosphere, Inc. All Rights Reserved. 60
Not so Ideal DNS Request
© 2016 Mesosphere, Inc. All Rights Reserved. 61
Wat?
© 2016 Mesosphere, Inc. All Rights Reserved. 62
Spartan DNS
© 2016 Mesosphere, Inc. All Rights Reserved. 63
Spartan DNS
© 2016 Mesosphere, Inc. All Rights Reserved. 64
Spartan DNS
© 2016 Mesosphere, Inc. All Rights Reserved. 65
Spartan Controls Timeouts
© 2016 Mesosphere, Inc. All Rights Reserved. 66
Edge DNS
© 2016 Mesosphere, Inc. All Rights Reserved. 67
© 2016 Mesosphere, Inc. All Rights Reserved.
SPARTAN:
BENEFITS
68
•Lowers Average Latency
•Can control timeouts to sub-sec...
© 2016 Mesosphere, Inc. All Rights Reserved. 69
Load Balancing:
Minuteman
© 2016 Mesosphere, Inc. All Rights Reserved. 70
The Players
© 2016 Mesosphere, Inc. All Rights Reserved. 71
Setting it up
© 2016 Mesosphere, Inc. All Rights Reserved. 72
Intercepting the
connection
© 2016 Mesosphere, Inc. All Rights Reserved. 73
Remap
© 2016 Mesosphere, Inc. All Rights Reserved. 74
Success
© 2016 Mesosphere, Inc. All Rights Reserved.
MINUTEMAN:
BENEFITS
75
•Appearance of a fixed-load balancer
•Fully distribute...
© 2016 Mesosphere, Inc. All Rights Reserved. 76
How do we tie it all together?
© 2016 Mesosphere, Inc. All Rights Reserved.
GLOBAL STATE
77
• Load Balancer Task Mapping
• DNS Zones
• Virtual Network Ro...
© 2016 Mesosphere, Inc. All Rights Reserved. 78
Computers
© 2016 Mesosphere, Inc. All Rights Reserved. 79
Constant Churn
© 2016 Mesosphere, Inc. All Rights Reserved. 80
Sometimes
They Break
© 2016 Mesosphere, Inc. All Rights Reserved. 81
Sometimes
Many Break
© 2016 Mesosphere, Inc. All Rights Reserved. 82
Sometimes
You don’t know
© 2016 Mesosphere, Inc. All Rights Reserved.
HOW DO YOU
DO
SIGNALING?
83
© 2016 Mesosphere, Inc. All Rights Reserved. 84
Nobody ever got fired for using
Zookeeper
© 2016 Mesosphere, Inc. All Rights Reserved. 85
We know about Zookeeper
© 2016 Mesosphere, Inc. All Rights Reserved. 86
It works…usually
© 2016 Mesosphere, Inc. All Rights Reserved. 87
How else can we do this?
© 2016 Mesosphere, Inc. All Rights Reserved. 88
Naively
(See: Distributed Erlang)
© 2016 Mesosphere, Inc. All Rights Reserved. 89
Massive Amount of Information
© 2016 Mesosphere, Inc. All Rights Reserved. 90
Who else has solved this?
© 2016 Mesosphere, Inc. All Rights Reserved. 91
Academia
© 2016 Mesosphere, Inc. All Rights Reserved. 92
Control Plane:
Lashup
© 2016 Mesosphere, Inc. All Rights Reserved. 93
How do we scale the naive approach?
© 2016 Mesosphere, Inc. All Rights Reserved. 94
All we need is a sparse, connected graph
© 2016 Mesosphere, Inc. All Rights Reserved. 95
But how?
© 2016 Mesosphere, Inc. All Rights Reserved. 96
Enter: HyParView
Builds a connected graph (overlay), where the degree is <...
© 2016 Mesosphere, Inc. All Rights Reserved. 97
Boot Time
© 2016 Mesosphere, Inc. All Rights Reserved. 98
© 2016 Mesosphere, Inc. All Rights Reserved. 99
© 2016 Mesosphere, Inc. All Rights Reserved. 100
© 2016 Mesosphere, Inc. All Rights Reserved. 101
© 2016 Mesosphere, Inc. All Rights Reserved. 102
© 2016 Mesosphere, Inc. All Rights Reserved. 103
© 2016 Mesosphere, Inc. All Rights Reserved. 104
© 2016 Mesosphere, Inc. All Rights Reserved. 105
© 2016 Mesosphere, Inc. All Rights Reserved. 106
© 2016 Mesosphere, Inc. All Rights Reserved. 107
© 2016 Mesosphere, Inc. All Rights Reserved. 108
© 2016 Mesosphere, Inc. All Rights Reserved. 109
© 2016 Mesosphere, Inc. All Rights Reserved. 110
© 2016 Mesosphere, Inc. All Rights Reserved. 111
Connected Graph
Hyparview
© 2016 Mesosphere, Inc. All Rights Reserved. 112
Constant Adaptive
Health Checks*
*Borrowed from SWIM, Gossip Style Failur...
© 2016 Mesosphere, Inc. All Rights Reserved. 113
Dealing with Failure
Hyparview
© 2016 Mesosphere, Inc. All Rights Reserved. 114
Dealing with Failure
Hyparview
© 2016 Mesosphere, Inc. All Rights Reserved. 115
Dealing with Failure
Hyparview
© 2016 Mesosphere, Inc. All Rights Reserved. 116
Dealing with Failure
Hyparview
© 2016 Mesosphere, Inc. All Rights Reserved. 117
Routing*
*Borrowed from Dijkstra, OSPF, IS-IS
© 2016 Mesosphere, Inc. All Rights Reserved. 118
© 2016 Mesosphere, Inc. All Rights Reserved. 119
© 2016 Mesosphere, Inc. All Rights Reserved. 120
Just Run Dijkstra*
*BFS / DFS
© 2016 Mesosphere, Inc. All Rights Reserved. 121
Minimum Spanning Tree, From Sender A
© 2016 Mesosphere, Inc. All Rights Reserved. 122
A Wild Pokemon Appears!*
*CRDT papers
Database
© 2016 Mesosphere, Inc. All Rights Reserved. 123
CRDTS: Semilattices
“Time”
© 2016 Mesosphere, Inc. All Rights Reserved. 124
CRDT KV Store
© 2016 Mesosphere, Inc. All Rights Reserved. 125
“Demo”
© 2016 Mesosphere, Inc. All Rights Reserved. 126
“Demo”
(foo@3c075477e55e)10> nodes().
[baz@3c075477e55e,bar@3c075477e55e]...
© 2016 Mesosphere, Inc. All Rights Reserved. 127
“Demo”
(foo@3c075477e55e)9> lashup_kv:value([erlang_users]).
[]
(bar@3c07...
© 2016 Mesosphere, Inc. All Rights Reserved. 128
“Demo”
(baz@3c075477e55e)10>
net_kernel:allow([foo@3c075477e55e]).
ok
(ba...
© 2016 Mesosphere, Inc. All Rights Reserved. 129
“Demo”
© 2016 Mesosphere, Inc. All Rights Reserved. 130
“Demo”
(bar@3c075477e55e)11>
lashup_kv:request_op([erlang_users], {update...
© 2016 Mesosphere, Inc. All Rights Reserved. 131
“Demo”
(foo@3c075477e55e)12> lashup_kv:value([erlang_users]).
[{{number,r...
© 2016 Mesosphere, Inc. All Rights Reserved. 132
“Demo”
(foo@3c075477e55e)23> erlang:set_cookie(baz@3c075477e55e,
fake).
t...
© 2016 Mesosphere, Inc. All Rights Reserved. 133
“Demo”
© 2016 Mesosphere, Inc. All Rights Reserved. 134
“Demo”
(foo@3c075477e55e)29> lashup_kv:request_op([erlang_users],
{update...
© 2016 Mesosphere, Inc. All Rights Reserved. 135
“Demo”
© 2016 Mesosphere, Inc. All Rights Reserved. 136
“Demo”
(foo@3c075477e55e)1> lashup_kv:value([erlang_users]).
[{{number,ri...
© 2016 Mesosphere, Inc. All Rights Reserved.
LASHUP KV
137
•Datatypes:
•Maps
•Composable
•Sets
•Flags
•Counters
•Registers...
© 2016 Mesosphere, Inc. All Rights Reserved. 138
Does Lashup provide Business Value?
© 2016 Mesosphere, Inc. All Rights Reserved. 139
Prior to Lashup
© 2016 Mesosphere, Inc. All Rights Reserved. 140
After Lashup
© 2016 Mesosphere, Inc. All Rights Reserved.
PROJECT
LASHUP
141
•A novel distributed systems SDK that provides:
•Membershi...
© 2016 Mesosphere, Inc. All Rights Reserved. 142
So,
Why Erlang/OTP?
© 2016 Mesosphere, Inc. All Rights Reserved. 143
Why Not?
(Go)
© 2016 Mesosphere, Inc. All Rights Reserved. 144
The First Version
of Minuteman
Was In
Golang
© 2016 Mesosphere, Inc. All Rights Reserved. 145
Stand on the Shoulders of Giants
© 2016 Mesosphere, Inc. All Rights Reserved. 146
What’s a good language with a healthy
set of distributed systems, and
net...
© 2016 Mesosphere, Inc. All Rights Reserved. 147
What’s a good language with a healthy
set of distributed systems, and
net...
© 2016 Mesosphere, Inc. All Rights Reserved. 148
Erlang
© 2016 Mesosphere, Inc. All Rights Reserved.
LASHUP KV
NEEDED A
DURABLE STORE
ANSWER:
MNESIA
149
•ACID, Transactional
•20+...
© 2016 Mesosphere, Inc. All Rights Reserved.
LASHUP KV
NEEDED A
CRDT LIBRARY
ANSWER:
RIAK_DT
150
•Riak’s CRDT Library
•Use...
© 2016 Mesosphere, Inc. All Rights Reserved.
LASHUP NEEDED
SECURE
DISTRIBUTION
ANSWER:
DISTERL
151
•SSL with mutual authen...
© 2016 Mesosphere, Inc. All Rights Reserved.
SPARTAN
NEEDED
A DNS SERVER
ANSWER:
ERL-DNS
152
•Runs DNSimple
•Used in produ...
© 2016 Mesosphere, Inc. All Rights Reserved.
MINUTEMAN
NEEDED
A KERNEL API
LIBRARY
ANSWER:
GEN_NETLINK
153
•Liberal licens...
© 2016 Mesosphere, Inc. All Rights Reserved.
LASHUP
NEEDED A
GRAPH LIBRARY
ANSWER:
DIGRAPH
154
•Built-in Library to Erlang...
© 2016 Mesosphere, Inc. All Rights Reserved. 155
So, we built our own.
© 2016 Mesosphere, Inc. All Rights Reserved. 156
On Testing…
© 2016 Mesosphere, Inc. All Rights Reserved. 157
We verified lashup_gm_route against
digraph
© 2016 Mesosphere, Inc. All Rights Reserved. 158
Property-Based Testing:
PropEr
© 2016 Mesosphere, Inc. All Rights Reserved. 159
Automatically Generate Command Set
CSet ={add_node(e), add_node(b),
add_e...
© 2016 Mesosphere, Inc. All Rights Reserved. 160
Verify Equality
Digraph lashup_gm
add_node(e) add_node(e)
Check Equivalen...
© 2016 Mesosphere, Inc. All Rights Reserved. 161
…Tens of Thousands of Times
© 2016 Mesosphere, Inc. All Rights Reserved. 162
Common Test Makes Integration Testing
A Breeze
© 2016 Mesosphere, Inc. All Rights Reserved.
COMMON TEST:
ON EACH
CHECK-IN
(LASHUP)
163
•Spins up 25 virtual nodes
•Tortur...
© 2016 Mesosphere, Inc. All Rights Reserved.
TAKE-AWAYS
164
•The container ecosystem is in is early stages
•Erlang is suit...
© 2016 Mesosphere, Inc. All Rights Reserved.
GET IN TOUCH
IF YOU’RE
INTERESTED
(P.S. WE’RE HIRING)
165
© 2016 Mesosphere, Inc. All Rights Reserved.
ERLANG &
CONTAINER
NETWORKING
@SARGUN
166
Próximos SlideShares
Carregando em…5
×

Erlang containers

1.561 visualizações

Publicada em

A talk on Container networking, and Distributed Systems in Erlang

Publicada em: Software
  • Seja o primeiro a comentar

Erlang containers

  1. 1. © 2016 Mesosphere, Inc. All Rights Reserved. ERLANG & CONTAINER NETWORKING @SARGUN 1 Sargun Dhillon, August 2016
  2. 2. © 2016 Mesosphere, Inc. All Rights Reserved. WHO AM I? 2
  3. 3. © 2016 Mesosphere, Inc. All Rights Reserved. AGENDA 3 • A brief history of how we got here • How we’re trying to make it better • Some systems we’ve built to enable us to better ship • Why Erlang helped us get there
  4. 4. © 2016 Mesosphere, Inc. All Rights Reserved. A BRIEF HISTORY OF NETWORKS IN THE DC 4
  5. 5. © 2016 Mesosphere, Inc. All Rights Reserved. ORGANIZATIONS CIRCA 2007 5 •Before DevOps was first heard •Clear differentiation of ownership •The datacenter was owned by a the NOC •Deployment of services was done by sysadmins in the operations group •Developers operated without access to production •Production deployments gated by QA, Operations
  6. 6. © 2016 Mesosphere, Inc. All Rights Reserved. SOFTWARE CIRCA 2007 6 •Different services glued together via CORBA, XML-RPC, SOAP •No one was really consciously doing microservices •Networks were static, giant layer 2 domains •Load Balancing provided by hardware •Firewall provided by hardware •Everyone ran their own datacenter •EC2 in its infancy, only a year prior has the term “Cloud” began to become popular •Systems statically partitioned
  7. 7. © 2016 Mesosphere, Inc. All Rights Reserved. 7 SaaS continued to grow at an incredible rate
  8. 8. © 2016 Mesosphere, Inc. All Rights Reserved. 8 There became a race to ship faster
  9. 9. © 2016 Mesosphere, Inc. All Rights Reserved. 9 We kept the software alive By feeding it With Sysadmins
  10. 10. © 2016 Mesosphere, Inc. All Rights Reserved. 10 We kept the machines alive By feeding them With Blood
  11. 11. © 2016 Mesosphere, Inc. All Rights Reserved. 11 This wasn’t working
  12. 12. © 2016 Mesosphere, Inc. All Rights Reserved. ORGANIZATIONS 2008+ 12 •We began seeing a gradual shift in the industry where lines between QA, Dev, and Ops were blurring •Devops term coined in 2008, first DevOpsDay in 2009 •Gradual adoption of the cloud, fewer organizations owning their own datacenters •Either networking was outsourced to the cloud, or typically remained in a small internal organization •Needed to reduce ratio of operators to servers
  13. 13. © 2016 Mesosphere, Inc. All Rights Reserved. SOFTWARE CIRCA 2008+ 13 •Popularization of Open Source tooling to automate much of traditional operations, and QA •Jenkins / Hudson •Puppet / Chef •Capistrano •Popularization of stacks requiring with more complex operational requirements •Nutch / Hadoop •NoSQLs •Still statically partitioned machines •Networks still sacred territory
  14. 14. © 2016 Mesosphere, Inc. All Rights Reserved. CIRCA 2011 14 •Much of what’s been happening for the past half-decade hits networking •Much of this falls under the term “SDN” (Software Defined Networking) or “NFV” (Network Function Virtualization) •Hastened by the adoption of VMs in the enterprise in the hype cycle •Openflow promises to fix everything •Major adoptions of the cloud by startups as well as enterprise •Virtualization begins to become mainstream as a mechanism of consolidating workloads •The invention of the “private cloud” •DotCloud / Docker funded by Y-combinator a year earlier •Term “Microservice” coined
  15. 15. © 2016 Mesosphere, Inc. All Rights Reserved. CIRCA 2013 15 • Docker becomes instant hit and brings containers to the forefront • Dynamic partitioning begins to make in-roads • Google releases Omega paper • Apache Aurora open sourced • Microservice counts explode, demanding collocation of workloads for efficiency • Mesosphere Founded • Site Reliability Engineering begins to popularize and further blur the lines between Dev, Ops, and QA
  16. 16. © 2016 Mesosphere, Inc. All Rights Reserved. 16 Everything was changing
  17. 17. © 2016 Mesosphere, Inc. All Rights Reserved. 17 Why?
  18. 18. © 2016 Mesosphere, Inc. All Rights Reserved. 18 Business Value
  19. 19. © 2016 Mesosphere, Inc. All Rights Reserved. 19
  20. 20. © 2016 Mesosphere, Inc. All Rights Reserved. BENEFITS 20 •Reduction in cost of goods sold •Smaller engineer to server ratio •Linear, or super linear growth rate of engineering team to servers is unsustainable •Smaller engineer to capability ratio, where capability includes: •Features •Throughput •Better User Experience •Better availability •Quicker release to features
  21. 21. © 2016 Mesosphere, Inc. All Rights Reserved. 21 But at what cost?
  22. 22. © 2016 Mesosphere, Inc. All Rights Reserved. 22 Complexity
  23. 23. © 2016 Mesosphere, Inc. All Rights Reserved. OLD WORLD 23
  24. 24. © 2016 Mesosphere, Inc. All Rights Reserved. NEW WORLD 24
  25. 25. © 2016 Mesosphere, Inc. All Rights Reserved. DIVING DEEPER 25
  26. 26. © 2016 Mesosphere, Inc. All Rights Reserved. 26 It was time for something completely new
  27. 27. © 2016 Mesosphere, Inc. All Rights Reserved. ENTER DC/OS 27
  28. 28. © 2016 Mesosphere, Inc. All Rights Reserved. CONTAINERS WITH BATTERIES INCLUDED 28 • Includes everything you need to move away from 2007 • Networking • Security • Application deployment, and orchestration • Stateful services
  29. 29. © 2016 Mesosphere, Inc. All Rights Reserved. 29 In Your Datacenter
  30. 30. © 2016 Mesosphere, Inc. All Rights Reserved. 30 As a Black Box
  31. 31. © 2016 Mesosphere, Inc. All Rights Reserved. 31 But, our story starts: ~November 2015 (~8 months ago)
  32. 32. © 2016 Mesosphere, Inc. All Rights Reserved. 32 Networking Features
  33. 33. © 2016 Mesosphere, Inc. All Rights Reserved. 33 Service Discovery (Mesos-DNS)
  34. 34. © 2016 Mesosphere, Inc. All Rights Reserved. 34 …Ish
  35. 35. © 2016 Mesosphere, Inc. All Rights Reserved. 35 DNS SRV Based Service Discovery
  36. 36. © 2016 Mesosphere, Inc. All Rights Reserved. 36 Everyone has DNS right?
  37. 37. © 2016 Mesosphere, Inc. All Rights Reserved. 37 And GLibc even has a bug open for it!
  38. 38. © 2016 Mesosphere, Inc. All Rights Reserved. 38 …Opened in 2005
  39. 39. © 2016 Mesosphere, Inc. All Rights Reserved. 39 We needed a solution that actually worked
  40. 40. © 2016 Mesosphere, Inc. All Rights Reserved. 40 So, we performed an OODA loop 1.Observe 2. Orient 3. Decide 4. Act
  41. 41. © 2016 Mesosphere, Inc. All Rights Reserved. OBSERVE: SERVICE DISCOVERY 41 • Existing Dynamic Service Discovery Solutions: • Etcd • Finagle + Zookeeper • Consul • Existing Static Service Discovery Solutions: • Amazon ELB • Hardware Load Balancers • Service Discovery is an afterthought Gathering our data about the field
  42. 42. © 2016 Mesosphere, Inc. All Rights Reserved. OBSERVE: NETWORKING 42 • Everybody assumes IP per application instance • Everybody assumes reliable DNS • Some people want to be fast • Some people want security • Nobody wants to edit application code • Nobody wants to talk to their network engineer Gathering our data about the field
  43. 43. © 2016 Mesosphere, Inc. All Rights Reserved. ORIENT: DC/OS NETWORKING CORE TENANTS 43 •DC/OS must be agnostic to the underlying environment •AWS / Azure / GCE / Softlayer as the lowest common denominators •DC/OS should require no to minimal changes to the code in order to work •DC/OS should provide similar services to existing environments •Fixed load balancers •Security •IP/Container •We do not want to require a change in organization procedures •We want to be secure
  44. 44. © 2016 Mesosphere, Inc. All Rights Reserved. DECIDE 44 What do we want to build? • Load Balancing • Seamless Service Discovery • Reliable DNS • External cluster Access • Metrics and Discoverability • IP Per Container
  45. 45. © 2016 Mesosphere, Inc. All Rights Reserved. DC/OS ARCHITECTURE (~NOV ’15): POLYGLOT MICROSERVICES 45
  46. 46. © 2016 Mesosphere, Inc. All Rights Reserved. 46 What Did We Build?
  47. 47. © 2016 Mesosphere, Inc. All Rights Reserved. ACT: DEVELOPED NEW ERLANG SERVICES 47 • Service Discovery • Navstar • Spartan • Load Balancing • Minuteman • Networking API • Fishladder • Control Plane • Lashup
  48. 48. © 2016 Mesosphere, Inc. All Rights Reserved. DC/OS ARCHITECTURE (TODAY): POLYGLOT MICROSERVICES 48
  49. 49. © 2016 Mesosphere, Inc. All Rights Reserved. CODE METRICS 49 ~14,000 LoC 118 Modules 7 Repos 1.5 Engineers 0.5 Consultants
  50. 50. © 2016 Mesosphere, Inc. All Rights Reserved. 50 DNS: Spartan
  51. 51. © 2016 Mesosphere, Inc. All Rights Reserved. 51 HA DNS
  52. 52. © 2016 Mesosphere, Inc. All Rights Reserved. 52 /etc/resolv.conf nameserver 8.8.8.8 nameserver 8.8.4.4 Normal DNS
  53. 53. © 2016 Mesosphere, Inc. All Rights Reserved. 53 Ideal DNS Request
  54. 54. © 2016 Mesosphere, Inc. All Rights Reserved. 54 Ideal DNS Request
  55. 55. © 2016 Mesosphere, Inc. All Rights Reserved. 55 But, sometimes things go wrong
  56. 56. © 2016 Mesosphere, Inc. All Rights Reserved. 56 Not so Ideal DNS Request
  57. 57. © 2016 Mesosphere, Inc. All Rights Reserved. 57 Wait at least 1 second
  58. 58. © 2016 Mesosphere, Inc. All Rights Reserved. 58 Not so Ideal DNS Request
  59. 59. © 2016 Mesosphere, Inc. All Rights Reserved. 59 Wait at least 1 second
  60. 60. © 2016 Mesosphere, Inc. All Rights Reserved. 60 Not so Ideal DNS Request
  61. 61. © 2016 Mesosphere, Inc. All Rights Reserved. 61 Wat?
  62. 62. © 2016 Mesosphere, Inc. All Rights Reserved. 62 Spartan DNS
  63. 63. © 2016 Mesosphere, Inc. All Rights Reserved. 63 Spartan DNS
  64. 64. © 2016 Mesosphere, Inc. All Rights Reserved. 64 Spartan DNS
  65. 65. © 2016 Mesosphere, Inc. All Rights Reserved. 65 Spartan Controls Timeouts
  66. 66. © 2016 Mesosphere, Inc. All Rights Reserved. 66 Edge DNS
  67. 67. © 2016 Mesosphere, Inc. All Rights Reserved. 67
  68. 68. © 2016 Mesosphere, Inc. All Rights Reserved. SPARTAN: BENEFITS 68 •Lowers Average Latency •Can control timeouts to sub-second •Many DNS queries don’t need to go off-node
  69. 69. © 2016 Mesosphere, Inc. All Rights Reserved. 69 Load Balancing: Minuteman
  70. 70. © 2016 Mesosphere, Inc. All Rights Reserved. 70 The Players
  71. 71. © 2016 Mesosphere, Inc. All Rights Reserved. 71 Setting it up
  72. 72. © 2016 Mesosphere, Inc. All Rights Reserved. 72 Intercepting the connection
  73. 73. © 2016 Mesosphere, Inc. All Rights Reserved. 73 Remap
  74. 74. © 2016 Mesosphere, Inc. All Rights Reserved. 74 Success
  75. 75. © 2016 Mesosphere, Inc. All Rights Reserved. MINUTEMAN: BENEFITS 75 •Appearance of a fixed-load balancer •Fully distributed •Other than first packet, the entire lifetime is handled in kernel space
  76. 76. © 2016 Mesosphere, Inc. All Rights Reserved. 76 How do we tie it all together?
  77. 77. © 2016 Mesosphere, Inc. All Rights Reserved. GLOBAL STATE 77 • Load Balancer Task Mapping • DNS Zones • Virtual Network Routing Tables • Reachability • Security ACLs
  78. 78. © 2016 Mesosphere, Inc. All Rights Reserved. 78 Computers
  79. 79. © 2016 Mesosphere, Inc. All Rights Reserved. 79 Constant Churn
  80. 80. © 2016 Mesosphere, Inc. All Rights Reserved. 80 Sometimes They Break
  81. 81. © 2016 Mesosphere, Inc. All Rights Reserved. 81 Sometimes Many Break
  82. 82. © 2016 Mesosphere, Inc. All Rights Reserved. 82 Sometimes You don’t know
  83. 83. © 2016 Mesosphere, Inc. All Rights Reserved. HOW DO YOU DO SIGNALING? 83
  84. 84. © 2016 Mesosphere, Inc. All Rights Reserved. 84 Nobody ever got fired for using Zookeeper
  85. 85. © 2016 Mesosphere, Inc. All Rights Reserved. 85 We know about Zookeeper
  86. 86. © 2016 Mesosphere, Inc. All Rights Reserved. 86 It works…usually
  87. 87. © 2016 Mesosphere, Inc. All Rights Reserved. 87 How else can we do this?
  88. 88. © 2016 Mesosphere, Inc. All Rights Reserved. 88 Naively (See: Distributed Erlang)
  89. 89. © 2016 Mesosphere, Inc. All Rights Reserved. 89 Massive Amount of Information
  90. 90. © 2016 Mesosphere, Inc. All Rights Reserved. 90 Who else has solved this?
  91. 91. © 2016 Mesosphere, Inc. All Rights Reserved. 91 Academia
  92. 92. © 2016 Mesosphere, Inc. All Rights Reserved. 92 Control Plane: Lashup
  93. 93. © 2016 Mesosphere, Inc. All Rights Reserved. 93 How do we scale the naive approach?
  94. 94. © 2016 Mesosphere, Inc. All Rights Reserved. 94 All we need is a sparse, connected graph
  95. 95. © 2016 Mesosphere, Inc. All Rights Reserved. 95 But how?
  96. 96. © 2016 Mesosphere, Inc. All Rights Reserved. 96 Enter: HyParView Builds a connected graph (overlay), where the degree is <=K
  97. 97. © 2016 Mesosphere, Inc. All Rights Reserved. 97 Boot Time
  98. 98. © 2016 Mesosphere, Inc. All Rights Reserved. 98
  99. 99. © 2016 Mesosphere, Inc. All Rights Reserved. 99
  100. 100. © 2016 Mesosphere, Inc. All Rights Reserved. 100
  101. 101. © 2016 Mesosphere, Inc. All Rights Reserved. 101
  102. 102. © 2016 Mesosphere, Inc. All Rights Reserved. 102
  103. 103. © 2016 Mesosphere, Inc. All Rights Reserved. 103
  104. 104. © 2016 Mesosphere, Inc. All Rights Reserved. 104
  105. 105. © 2016 Mesosphere, Inc. All Rights Reserved. 105
  106. 106. © 2016 Mesosphere, Inc. All Rights Reserved. 106
  107. 107. © 2016 Mesosphere, Inc. All Rights Reserved. 107
  108. 108. © 2016 Mesosphere, Inc. All Rights Reserved. 108
  109. 109. © 2016 Mesosphere, Inc. All Rights Reserved. 109
  110. 110. © 2016 Mesosphere, Inc. All Rights Reserved. 110
  111. 111. © 2016 Mesosphere, Inc. All Rights Reserved. 111 Connected Graph Hyparview
  112. 112. © 2016 Mesosphere, Inc. All Rights Reserved. 112 Constant Adaptive Health Checks* *Borrowed from SWIM, Gossip Style Failure Detector
  113. 113. © 2016 Mesosphere, Inc. All Rights Reserved. 113 Dealing with Failure Hyparview
  114. 114. © 2016 Mesosphere, Inc. All Rights Reserved. 114 Dealing with Failure Hyparview
  115. 115. © 2016 Mesosphere, Inc. All Rights Reserved. 115 Dealing with Failure Hyparview
  116. 116. © 2016 Mesosphere, Inc. All Rights Reserved. 116 Dealing with Failure Hyparview
  117. 117. © 2016 Mesosphere, Inc. All Rights Reserved. 117 Routing* *Borrowed from Dijkstra, OSPF, IS-IS
  118. 118. © 2016 Mesosphere, Inc. All Rights Reserved. 118
  119. 119. © 2016 Mesosphere, Inc. All Rights Reserved. 119
  120. 120. © 2016 Mesosphere, Inc. All Rights Reserved. 120 Just Run Dijkstra* *BFS / DFS
  121. 121. © 2016 Mesosphere, Inc. All Rights Reserved. 121 Minimum Spanning Tree, From Sender A
  122. 122. © 2016 Mesosphere, Inc. All Rights Reserved. 122 A Wild Pokemon Appears!* *CRDT papers Database
  123. 123. © 2016 Mesosphere, Inc. All Rights Reserved. 123 CRDTS: Semilattices “Time”
  124. 124. © 2016 Mesosphere, Inc. All Rights Reserved. 124 CRDT KV Store
  125. 125. © 2016 Mesosphere, Inc. All Rights Reserved. 125 “Demo”
  126. 126. © 2016 Mesosphere, Inc. All Rights Reserved. 126 “Demo” (foo@3c075477e55e)10> nodes(). [baz@3c075477e55e,bar@3c075477e55e] (bar@3c075477e55e)8> nodes(). [foo@3c075477e55e,baz@3c075477e55e] (baz@3c075477e55e)7> nodes(). [foo@3c075477e55e,bar@3c075477e55e]
  127. 127. © 2016 Mesosphere, Inc. All Rights Reserved. 127 “Demo” (foo@3c075477e55e)9> lashup_kv:value([erlang_users]). [] (bar@3c075477e55e)9> lashup_kv:value([erlang_users]). [] (baz@3c075477e55e)9> lashup_kv:value([erlang_users]). []
  128. 128. © 2016 Mesosphere, Inc. All Rights Reserved. 128 “Demo” (baz@3c075477e55e)10> net_kernel:allow([foo@3c075477e55e]). ok (baz@3c075477e55e)10> net_kernel:disconnect('bar@3c075477e55e'). ok.
  129. 129. © 2016 Mesosphere, Inc. All Rights Reserved. 129 “Demo”
  130. 130. © 2016 Mesosphere, Inc. All Rights Reserved. 130 “Demo” (bar@3c075477e55e)11> lashup_kv:request_op([erlang_users], {update, [{update, {number, riak_dt_pncounter}, {increment, 5}}]}). {ok,[{{number,riak_dt_pncounter},5}]}
  131. 131. © 2016 Mesosphere, Inc. All Rights Reserved. 131 “Demo” (foo@3c075477e55e)12> lashup_kv:value([erlang_users]). [{{number,riak_dt_pncounter},5}] (bar@3c075477e55e)13> lashup_kv:value([erlang_users]). [{{number,riak_dt_pncounter},5}] (baz@3c075477e55e)15> lashup_kv:value([erlang_users]). [{{number,riak_dt_pncounter},5}]
  132. 132. © 2016 Mesosphere, Inc. All Rights Reserved. 132 “Demo” (foo@3c075477e55e)23> erlang:set_cookie(baz@3c075477e55e, fake). true (foo@3c075477e55e)24> erlang:set_cookie(bar@3c075477e55e, fake). true
  133. 133. © 2016 Mesosphere, Inc. All Rights Reserved. 133 “Demo”
  134. 134. © 2016 Mesosphere, Inc. All Rights Reserved. 134 “Demo” (foo@3c075477e55e)29> lashup_kv:request_op([erlang_users], {update, [{update, {number, riak_dt_pncounter}, {increment, 3}}]}). {ok,[{{number,riak_dt_pncounter},8}]} (bar@3c075477e55e)17> lashup_kv:request_op([erlang_users], {update, [{update, {number, riak_dt_pncounter}, {increment, 7}}]}). {ok,[{{number,riak_dt_pncounter},12}]} (baz@3c075477e55e)19> lashup_kv:request_op([erlang_users], {update, [{update, {number, riak_dt_pncounter}, {increment, 11}}]}). {ok,[{{number,riak_dt_pncounter},16}]}
  135. 135. © 2016 Mesosphere, Inc. All Rights Reserved. 135 “Demo”
  136. 136. © 2016 Mesosphere, Inc. All Rights Reserved. 136 “Demo” (foo@3c075477e55e)1> lashup_kv:value([erlang_users]). [{{number,riak_dt_pncounter},26}] (bar@3c075477e55e)25> lashup_kv:value([erlang_users]). [{{number,riak_dt_pncounter},26}] (baz@3c075477e55e)23> lashup_kv:value([erlang_users]). [{{number,riak_dt_pncounter},26}]
  137. 137. © 2016 Mesosphere, Inc. All Rights Reserved. LASHUP KV 137 •Datatypes: •Maps •Composable •Sets •Flags •Counters •Registers •Last-write-wins
  138. 138. © 2016 Mesosphere, Inc. All Rights Reserved. 138 Does Lashup provide Business Value?
  139. 139. © 2016 Mesosphere, Inc. All Rights Reserved. 139 Prior to Lashup
  140. 140. © 2016 Mesosphere, Inc. All Rights Reserved. 140 After Lashup
  141. 141. © 2016 Mesosphere, Inc. All Rights Reserved. PROJECT LASHUP 141 •A novel distributed systems SDK that provides: •Membership •Multicast Delivery •Strongly-eventually consistent data storage •Powers: •Minuteman VIP dissemination •Minuteman node liveness checks •Overlay routing •DNS Synchronization •Open source: github.com/dcos/lashup
  142. 142. © 2016 Mesosphere, Inc. All Rights Reserved. 142 So, Why Erlang/OTP?
  143. 143. © 2016 Mesosphere, Inc. All Rights Reserved. 143 Why Not? (Go)
  144. 144. © 2016 Mesosphere, Inc. All Rights Reserved. 144 The First Version of Minuteman Was In Golang
  145. 145. © 2016 Mesosphere, Inc. All Rights Reserved. 145 Stand on the Shoulders of Giants
  146. 146. © 2016 Mesosphere, Inc. All Rights Reserved. 146 What’s a good language with a healthy set of distributed systems, and networking libraries?
  147. 147. © 2016 Mesosphere, Inc. All Rights Reserved. 147 What’s a good language with a healthy set of distributed systems, and networking libraries? With an ecosystem
  148. 148. © 2016 Mesosphere, Inc. All Rights Reserved. 148 Erlang
  149. 149. © 2016 Mesosphere, Inc. All Rights Reserved. LASHUP KV NEEDED A DURABLE STORE ANSWER: MNESIA 149 •ACID, Transactional •20+ years old •Battle Tested •Comes built-in •Software Transaction Memory
  150. 150. © 2016 Mesosphere, Inc. All Rights Reserved. LASHUP KV NEEDED A CRDT LIBRARY ANSWER: RIAK_DT 150 •Riak’s CRDT Library •Used in production by others •Battle Tested •Property tested
  151. 151. © 2016 Mesosphere, Inc. All Rights Reserved. LASHUP NEEDED SECURE DISTRIBUTION ANSWER: DISTERL 151 •SSL with mutual authentication is a matter of configuration •Used in production by others •Battle Tested •Comes Built-in
  152. 152. © 2016 Mesosphere, Inc. All Rights Reserved. SPARTAN NEEDED A DNS SERVER ANSWER: ERL-DNS 152 •Runs DNSimple •Used in production •Battle Tested •Some Statistics from DNSimple •Serves 20K+ domains •Does 20k queries/sec+
  153. 153. © 2016 Mesosphere, Inc. All Rights Reserved. MINUTEMAN NEEDED A KERNEL API LIBRARY ANSWER: GEN_NETLINK 153 •Liberal license •Deployed by TravelPing •Used in production •Battle Tested
  154. 154. © 2016 Mesosphere, Inc. All Rights Reserved. LASHUP NEEDED A GRAPH LIBRARY ANSWER: DIGRAPH 154 •Built-in Library to Erlang/OTP •Correctness verified •Simple API •Downsides: •Slow •Heavy
  155. 155. © 2016 Mesosphere, Inc. All Rights Reserved. 155 So, we built our own.
  156. 156. © 2016 Mesosphere, Inc. All Rights Reserved. 156 On Testing…
  157. 157. © 2016 Mesosphere, Inc. All Rights Reserved. 157 We verified lashup_gm_route against digraph
  158. 158. © 2016 Mesosphere, Inc. All Rights Reserved. 158 Property-Based Testing: PropEr
  159. 159. © 2016 Mesosphere, Inc. All Rights Reserved. 159 Automatically Generate Command Set CSet ={add_node(e), add_node(b), add_edge(e,b), …}
  160. 160. © 2016 Mesosphere, Inc. All Rights Reserved. 160 Verify Equality Digraph lashup_gm add_node(e) add_node(e) Check Equivalency add_node(b) add_node(b) Check Equivalency add_edge(e, b) add_edge(e, b) Check Equivalency
  161. 161. © 2016 Mesosphere, Inc. All Rights Reserved. 161 …Tens of Thousands of Times
  162. 162. © 2016 Mesosphere, Inc. All Rights Reserved. 162 Common Test Makes Integration Testing A Breeze
  163. 163. © 2016 Mesosphere, Inc. All Rights Reserved. COMMON TEST: ON EACH CHECK-IN (LASHUP) 163 •Spins up 25 virtual nodes •Torture tests: •Partitions nodes randomly for 5 minutes to see if it can generate a permanent split •Kills nodes randomly to verify global health check convergence •Verifies multicast works even in poor network conditions •Writes divergent keys to all nodes, and reconnects them •~500 lines of tests, 80%+ code coverage
  164. 164. © 2016 Mesosphere, Inc. All Rights Reserved. TAKE-AWAYS 164 •The container ecosystem is in is early stages •Erlang is suitable for building modern, reliable distributed systems •These systems are tricky, so testing is key •We have a healthy ecosystem •There need to be alternatives to consensus
  165. 165. © 2016 Mesosphere, Inc. All Rights Reserved. GET IN TOUCH IF YOU’RE INTERESTED (P.S. WE’RE HIRING) 165
  166. 166. © 2016 Mesosphere, Inc. All Rights Reserved. ERLANG & CONTAINER NETWORKING @SARGUN 166

×