CONFIGURATION MANAGEMENT IN THE CLOUD NATIVE ERA, SHAHAR MINTZ, EggPack

Configuration Management
In the cloud-native era

Shahar Mintz
Married to Rachel
Father of Ariel and Yama
DevOps Consultant/Freelance
Previously @ Wix, Packet, Facebook, Onavo, Red Hat, Qumranet

2013 -> 2018
2013 - Onavo
~400 physical servers
Puppet in charge of:
- OS configuration (sysctl, services, files, etc.)
- Monitoring
- Machine Access
- Firewall configuration
- Service Discovery
2018 - Packet
~30 installations
- OS config: Salt
- Provisioning: Terraform
- Monitoring: Prometheus & Grafana (manual)
- Service discovery: Consul
- Workload management: Nomad

Why is configuration management important?

A brief history of configuration management

1st gen: CFEngine - 1993
State:
Server management done by scripts.
Challenges:
Different OS flavours
Machines in different states

CFEngine
Solution:
Declarative DSL.
Focus on promises rather than obligations.
    bundle agent example {
      files:
        "/tmp/testfile"
          create => "true",
          edit_line => proper_greetings;
    }

    bundle edit_line proper_greetings {
      delete_lines:
        ".*";
      insert_lines:
        "Hello World!";
    }
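
The declarative idea behind the policy above is that you describe the desired end state and the agent converges the machine to it, doing nothing when the state already holds. A minimal Python sketch of that behaviour, not CFEngine's implementation (the `ensure_file_content` helper is a hypothetical name used only for illustration):

```python
import os
import tempfile


def ensure_file_content(path: str, content: str) -> bool:
    """Converge `path` to `content`; return True only if a change was made."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            if fh.read() == content:
                return False  # already in the desired state, nothing to do
    except FileNotFoundError:
        pass  # the file is missing, so it has to be created
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(content)
    return True


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "testfile")
    print(ensure_file_content(target, "Hello World!\n"))  # first run writes the file
    print(ensure_file_content(target, "Hello World!\n"))  # second run is a no-op
```

Running the same function twice is safe, which is the property that lets one policy handle machines that start out in different states.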

CFEngine - The bad things
- Written in C: hard to extend.
- Limited file content operations.

2nd Generation
Puppet
- 1st Release: 2004
- DSL: Custom (Puppet's own language)
- File content: ERB
Chef
- 1st Release: 2009 (in use before then)
- DSL: Ruby
- File content: ERB

Model -> Controller -> View
          Model        Controller       View
Chef      attributes   recipes (Ruby)   ERB
Puppet    variables    modules          ERB
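
The split in the table can be sketched in a few lines of Python. This is an analogy only: `string.Template` stands in for ERB, and the attribute names and values are hypothetical.

```python
from string import Template

# Model: per-machine attributes (hypothetical values).
attributes = {"hostname": "web01", "max_clients": 150}

# View: a file template; string.Template stands in for ERB here.
view = Template("ServerName $hostname\nMaxClients $max_clients\n")


def recipe(attrs: dict) -> str:
    """Controller: decide what the rendered file should contain."""
    return view.substitute(attrs)


if __name__ == "__main__":
    print(recipe(attributes))
```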

2nd Generation
- Better server architecture
- Database backend
- Machine attributes/variables (Facter/Ohai)

Chef & Puppet disadvantages
- No parallelism
- Long execution times

3rd Gen
- Push based execution
- Focus on remote parallel execution
- Asset management
Salt:
- First release: 2011
Ansible:
- First release: 2012
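
Push-based parallel execution can be sketched as a controller fanning one task out to many hosts at once. This is a simplified illustration, not how Salt or Ansible are implemented; `apply_state` is a hypothetical stand-in for real remote execution.

```python
from concurrent.futures import ThreadPoolExecutor


def apply_state(host: str) -> str:
    """Hypothetical stand-in for running a command on a remote host."""
    return f"{host}: ok"


def push(hosts: list[str], max_workers: int = 10) -> dict[str, str]:
    """Run apply_state on every host concurrently and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(apply_state, hosts)  # results keep the input order
        return dict(zip(hosts, results))


if __name__ == "__main__":
    print(push(["web01", "web02", "db01"]))
```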

Containers & Cloud Native applications

Cloud Native Infrastructure
Provisioning and SaaS/IaaS:
- Terraform
- Pulumi
- CloudFormation
Workload Management
- Kubernetes
- Helm
- Kustomize
- Nomad
Monitoring:
- Prometheus
- InfluxDB
- Grafana
Networking/Routing:
- Envoy
- Linkerd

Configuration Management challenges
1. Configuration evaluated on the production machines
2. Hard to test (result of problem #1)
3. Too many configuration formats
4. YAML+Templating = 💔
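
Point 4 can be illustrated with a small Python sketch. It uses JSON instead of YAML only because the standard library ships a JSON parser; the failure mode is the same for any text-templated structured format. The variable names are illustrative.

```python
import json
from string import Template

greeting = 'Say "hi" to Bob'  # a value that contains quotes

# Text templating knows nothing about the target format's quoting rules.
templated = Template('{"welcome": "$greeting"}').substitute(greeting=greeting)

try:
    json.loads(templated)
    templated_is_valid = True
except json.JSONDecodeError:
    templated_is_valid = False

# Building the data structure first and serializing it escapes correctly.
serialized = json.dumps({"welcome": greeting})

if __name__ == "__main__":
    print(templated_is_valid)
    print(serialized)
```

The templated string only breaks once a value happens to contain a special character, which is why this class of bug tends to surface in production rather than in review.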

Software vs. Human
Software wants:
Static data (e.g. JSON)
    {
      "person1": {
        "name": "Alice",
        "welcome": "Hello Alice!"
      },
      "person2": {
        "name": "Bob",
        "welcome": "Hello Bob!"
      }
    }
Human writes:
Python
    def person(name='Alice'):
        return {
            "name": name,
            "welcome": 'Hello %s!' % name,
        }

    def main():
        return {
            "person1": person(),
            "person2": person('Bob'),
        }
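
The two views can be connected by letting the code produce the static data. A minimal runnable sketch using the standard `json` module (the `materialized` variable name is illustrative only):

```python
import json


def person(name="Alice"):
    # One place defines how a person entry is built.
    return {"name": name, "welcome": "Hello %s!" % name}


def main():
    return {"person1": person(), "person2": person("Bob")}


# Materialize the coded config into the static JSON that software consumes.
materialized = json.dumps(main(), indent=2)

if __name__ == "__main__":
    print(materialized)
```

The human edits the short function; the software only ever sees the generated JSON.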

“Because our systems are ultimately managed by humans, humans are responsible for configuration. The quality of the human-computer interface of a system’s configuration impacts an organization’s ability to run that system reliably.”
- Štěpán Davidovič, Google SRE Workbook
https://sre.google/workbook/configuration-design/

Protoconf Goals
- Deliver configs to all clusters in seconds, not minutes
- Configs should have schemas with type safety
- Configs should be coded, then materialized
- Config changes should be reviewed
- Configs should be easy to test and validate
- Configs should be consumable from all popular languages
- Both humans and machines should be able to change configs

Define the config schema
The developer will define the config struct in protobuf.
    // file: ./src/myproject/myconfig.proto
    syntax = "proto3";

    message MyConfig {
      uint32 connection_timeout = 1;
      uint32 max_retries = 2;
      NestedStruct another_struct = 3;
    }

    message NestedStruct {
      string hello_world = 1;
    }
https://developers.google.com/protocol-buffers
https://docs.protoconf.sh/getting-started/

Add validations
"""
file: ./src/myproject/myconfig.proto-validator
"""
load("myconfig.proto", "MyConfig")

def validate_connection_timeout(config):
    # Reject timeouts below the minimum of 3.
    if config.connection_timeout < 3:
        fail("connection_timeout must be 3 or higher, got: %d" %
             config.connection_timeout)

add_validator(MyConfig, validate_connection_timeout)
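
The validator idea can be tried out in plain Python. The sketch below is an analogy under stated assumptions, not protoconf's implementation: `register_validator`, `validate`, and the dataclass version of `MyConfig` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

_validators = {}  # maps a config type to its list of validator functions


def register_validator(config_type, func):
    """Attach a validator function to a config type."""
    _validators.setdefault(config_type, []).append(func)


def validate(config):
    """Run every validator registered for the config's type."""
    for func in _validators.get(type(config), []):
        func(config)


@dataclass
class MyConfig:
    connection_timeout: int
    max_retries: int


def validate_connection_timeout(config):
    if config.connection_timeout < 3:
        raise ValueError(
            "connection_timeout must be 3 or higher, got: %d"
            % config.connection_timeout
        )


register_validator(MyConfig, validate_connection_timeout)

if __name__ == "__main__":
    validate(MyConfig(connection_timeout=5, max_retries=5))  # passes silently
```

Because validation runs when the config is built rather than on the production machine, a bad value is rejected before it is delivered.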

Code your config
The developer will then create a `.pconf` file to populate the config struct with the required values.
"""
file: ./src/myproject/myconfig.pconf
"""
load("myconfig.proto", "MyConfig", "NestedStruct")

def main():
    return MyConfig(
        connection_timeout=5,
        max_retries=5,
        another_struct=NestedStruct(
            hello_world="Hello World!"
        )
    )

Compile
$ protoconf compile .
    {
      "protoFile": "myproject/myconfig.proto",
      "value": {
        "@type": "type.googleapis.com/MyConfig",
        "connectionTimeout": 5,
        "maxRetries": 5,
        "anotherStruct": {
          "helloWorld": "Hello World!"
        }
      }
    }

Consume
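    import grpc
    # ProtoconfServiceStub, ConfigSubscriptionRequest and MyConfig come from the
    # generated protobuf/gRPC Python modules; their import paths depend on how
    # the stubs were generated, so they are not shown here.

    channel = grpc.insecure_channel("localhost:4300")
    stub = ProtoconfServiceStub(channel)
    config = MyConfig()
    # Iterate over the stream of config updates.
    for update in stub.SubscribeForConfig(
        ConfigSubscriptionRequest(path="myproject/myconfig")
    ):
        update.value.Unpack(config)  # unpack the payload into MyConfig
        print(config)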
https://www.grpc.io

Mutation RPC
https://docs.protoconf.sh/mutation-rpc/

Learn More & Contribute
- Docs site: https://docs.protoconf.sh
- Star us on GitHub: protoconf/protoconf
- Join us on Discord: https://discord.protoconf.sh
- Follow us on Twitter: @protoconfdev
