4. @tmclaughbos
Actually my job is to…
• Understand problems and explain how to solve them
• Experiment with tools and demonstrate how to use them
• Freedom to fail
• I maintain AWS infrastructure for Threat Stack Sales and Marketing
• Keeps me sharp
• Gives me ideas
Here to talk about incorporating security best practices into Terraform config
Security isn’t something you just sprinkle on after the fact
It’s easier to start thinking of this in the beginning
Doing it afterwards, you’re going to find it becomes much harder.
You’ve made decisions that maybe paint yourself into a corner.
Think VPCs, subnets, etc. which we’ll see.
Who is this sewer clown?
I happen to love this meme so I needed a way to work it in
I’ve spent most of my career in ops
It grew a little stale for me
I wanted a new thing to solve
Solving people problems is hard
So now I’m an engineering advocate!
<tweet>
I, and other folks like me, like to joke that this is my job.
<slide>
* Solving people problems is much harder but more rewarding to me than scaling a database service.
Most technical arguments are subjective
Two opposing ideas can both be correct
They usually reflect a difference in values, assumptions, and goals.
What I say may not be for you.
Let’s dive in
* I do not solve Facebook, Google, or Netflix scale problems
* Because I do not work for companies of those scales
* I solve for not dying
I embrace tech debt
* I hope my tech provides a solid basis for the future
* I accept my work will be thrown out
The black one is AMAZING!
* I drove it maybe 5 times last summer.
* It has spent a lot of time at the mechanic being perfected.
* It’s only usable 5-6 months of the year
I see the job of ops as to provide guardrails
Not gates to slow people down and exact a toll
I’m here to keep you from flying off the cliff
I build tools and processes to do that
You operating your own systems is what enables me to build these guardrails.
I want to turn my domain knowledge into code.
* I just like this meme and I think it’s true.
We’re still in that phase of Terraform’s lifetime where we’re figuring out how to structure and use this tool correctly.
Think of maybe the early days of Puppet before “roles and profiles”
Today that’s the broadly accepted right way to do things.
Most of us using Terraform have already dealt with “How do I keep my different environments together but only affect one when I run Terraform.”
Per-environment directories plus symlinks, usually.
I experiment with ideas
Sometimes I fail
I have only gotten so far
No existing ideas for structuring Terraform really compelled me.
Our config between environments should match.
Otherwise we’re introducing variability where none should exist.
“It works fine in dev. Oh, prod is just ‘different’”.
We boil down differences to data.
Sizes, e.g. of an ASG
IP address ranges for subnets
Pros
reduced variability
Cons
time consuming
My IAM management is awful.
Mixed-type values, such as arrays of maps or maps with array values, are not possible
Everything gets its own repo.
Not only do we only manage one environment at a time, we manage one service at a time
This would let me restrict what parts of the environment people have access to.
Every service has its own state file
I am not sure how well this idea will scale in reality
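As a minimal sketch of the per-service state idea, each service’s repo could carry its own backend config pointing at its own key (bucket name, key, and region here are all hypothetical):

```hcl
# Hypothetical per-service backend config: each service stores its own
# state file under its own key in a shared state bucket.
terraform {
  backend "s3" {
    bucket = "example-terraform-state"            # assumed bucket name
    key    = "services/billing/terraform.tfstate" # one key per service
    region = "us-east-1"
  }
}
```

Because every service has its own state file, a `terraform apply` in one service’s repo can never touch another service’s resources.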
I write a lot of modules
this is just a subset; some internal modules unreleased
pros
I provide modules that do the heavy lifting for people
I want people to need as little domain knowledge as possible.
You don’t need to know how to setup S3 bucket access logging.
cons
time consuming
This REALLY slows down initial gains
I hope the module community becomes stronger
I wish a module could automatically output all of its resources’ outputs.
So let’s dive in now
Single environment per AWS account
Prod in one account
Dev in another
Etc
“How do I run Terraform and NOT apply changes to all environments?”
This is one of the most common issues I found new people look into once they decide to really start managing their infrastructure with Terraform.
I don’t want my dev environment changes immediately applied to prod!
<slide>
* I find this pattern very common
* Point out the company-* folders
* Hated everything I saw
but I needed to solve it immediately
why?
<slide>
We separate environments by AWS account
Users do not cross accounts
I cannot alter more than one environment at once
Unless I put credentials in my TF config
I only keep one environment’s credentials in my shell environment at a time.
I don’t have Jenkins doing this work yet which might alter some assumptions
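A sketch of what that looks like in the config: credentials come from a named AWS profile held in the shell environment, never hard-coded. The variable and profile names here are assumptions:

```hcl
# Sketch: the provider reads a named profile; whichever single
# environment's profile is active in the shell is all Terraform can touch.
variable "aws_profile" {}

provider "aws" {
  profile = "${var.aws_profile}"   # e.g. "example-dev" or "example-prod"
  region  = "us-east-1"
}
```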
Why do I do this?
why do I go through all this trouble?
lateral movement
you've got sales engineers building and deploying into AWS because they need to demonstrate product features.
I have yet to see a demo environment that’s as well run as a product environment
Demo environments don’t generate as much money as the product environment
Attackers often don’t initially breach the most valuable target.
They start with what’s available, and once inside, lateral movement is easier; they hop around where possible.
So here’s how I approach this
See the .tfvars files. They reflect different AWS accounts.
Our environments are just data
the configuration is the same across environments
all that differs is the data that drives the configuration
What differs between a production and dev autoscaling group? Size?
If you look at this part of our infrastructure it probably doesn’t differ much beyond
AWS profile and account name
S3 bucket name for remote state
Creating a new environment is just creating a new .tfvars file
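As a sketch, a hypothetical `dev.tfvars` might hold everything that distinguishes dev from prod (all names and values below are illustrative):

```hcl
# dev.tfvars: the only thing that changes per environment is data
aws_profile      = "example-dev"
aws_account_name = "example-dev"
state_bucket     = "example-dev-terraform-state"
asg_min_size     = 1
asg_max_size     = 2
```

Then `terraform apply -var-file=dev.tfvars` targets dev, a `prod.tfvars` targets prod, and the `.tf` files themselves never change.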
variability across environments is bad IMHO.
It makes it more likely that you’ll make a mistake
“Why didn’t this work in dev? It’s just different.”
<slide>
I’m used to a single VPC and everybody talks to everybody.
Who else is used to that?
This is something I did not want to repeat.
I wasn’t fully ready to permit nothing but explicitly allowed movement.
Networking is a lost skill for me and I know others in the same boat who are in the cloud world
This is a rough diagram of what my environment looks like
Each AZ has two subnets, a public and a private.
You need to be internet accessible?
Great!…
But you can’t talk by default to anything that doesn’t need internet access.
I wasn’t yet ready to say all traffic needs to be completely controlled
I’m just not sure how that idea would scale and it wasn’t necessary just yet for me.
This is better than previous environments I’d been in
This is what my VPC looks like at a high level
feed it a list of subnet ranges for public and private subnets.
Should this VPC have an internet gateway?
Should the private hosts be NAT enabled so they can communicate out?
I do have a bool to allow or deny intra-{public,private} traffic
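An illustrative call to a VPC module like the one described; the module source and variable names are my assumptions, not the actual module:

```hcl
# Hypothetical VPC module invocation: feed it subnet ranges and answer
# a few explicit yes/no questions about connectivity.
module "vpc" {
  source = "./modules/vpc"

  cidr_block      = "10.0.0.0/16"
  public_subnets  = ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]
  private_subnets = ["10.0.10.0/24", "10.0.11.0/24", "10.0.12.0/24"]

  enable_internet_gateway    = true    # is anything here internet facing?
  enable_nat_gateway         = true    # may private hosts reach outbound?
  allow_intra_subnet_traffic = false   # deny intra-{public,private} by default
}
```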
I want a service owner to think about their service
<point to subnet type>
I want them to say “I know this service is public facing!”
I want them to acknowledge that and hopefully think about the responsibility
If we go into my service module…
Look at vpc_zone_identifier
we do a lookup to the remote state and say
Give me the subnets that are private
or give me the subnets that are public.
and that’s where the service will live
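The lookup described above might look roughly like this in Terraform 0.11-era syntax (the state bucket, key, and output names are all assumptions):

```hcl
# Pull the VPC's outputs from its remote state...
data "terraform_remote_state" "vpc" {
  backend = "s3"
  config {
    bucket = "example-dev-terraform-state"
    key    = "vpc/terraform.tfstate"
  }
}

# ...and place the service by declaring its subnet type explicitly.
resource "aws_autoscaling_group" "service" {
  name                 = "example-service"
  min_size             = "${var.asg_min_size}"
  max_size             = "${var.asg_max_size}"
  launch_configuration = "${aws_launch_configuration.service.name}"
  vpc_zone_identifier  = ["${data.terraform_remote_state.vpc.private_subnet_ids}"]
}
```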
Things like this make me LOVE remote state.
Bastion hosts!
* This is the host where all login traffic must start
Not everything needs SSH exposed.
You stream data from customer hosts up to your platform?
No one needs to SSH in directly to that host.
They should be going through your bastion host
That’s your front door
harden it.
Make sure it has 2FA enabled
Maybe put that host on a faster patching cycle
<slide>
Another reason I love remote state!
I create a security group in my AWS VPC module.
Then when I deploy my bastion host it adds a rule allowing traffic from my bastion to the rest of my environment
THAT IS AWESOME!
Let’s apply this further!
I deploy an app that needs to talk to a DB…
My service will query its dependency service for the SG ID
and then my service will add a rule granting access
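A hedged sketch of that pattern, where state keys, output names, and the port are all hypothetical:

```hcl
# Look up the database service's security group ID from its remote state.
data "terraform_remote_state" "db" {
  backend = "s3"
  config {
    bucket = "example-dev-terraform-state"
    key    = "services/db/terraform.tfstate"
  }
}

# The app grants itself access to the DB; no one edits the DB's config.
resource "aws_security_group_rule" "app_to_db" {
  type                     = "ingress"
  from_port                = 5432
  to_port                  = 5432
  protocol                 = "tcp"
  security_group_id        = "${data.terraform_remote_state.db.sg_id}"
  source_security_group_id = "${aws_security_group.app.id}"
}
```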
I see myself heading toward only explicitly allowed service-to-service communication
That was awful to manage before
I spend much of my time writing modules.
Back to my puppet days
You didn’t use the nginx module directly
You used my classes that set sane defaults for the hard parts
you just included `company::nginx`
<slide>
I see this as my job, more than operating systems at scale
You need a bucket to store some data in.
I’ll take care of the rest
<slide>
* this is where my expertise comes in
Buckets default to private
People make this mistake
And they suffer data loss, breaches, etc
module defaults to access logging
We expose access logging as a simple bool in S3 module
versioning defaults to false… But it’s there.
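Putting that together, a hypothetical call to such an S3 module (the module path and variable names are mine, not the speaker’s):

```hcl
# Hypothetical S3 module call matching the defaults described:
# private ACL, access logging on, versioning available but off.
module "data_bucket" {
  source = "./modules/s3_bucket"

  bucket_name    = "example-customer-data"
  access_logging = true    # exposed as a simple bool; defaults to on
  versioning     = false   # off by default, but there when you need it
}
```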