DockIR: Incident Response in a Containerized, Immutable, Continually Deployed Environment

Incident response is generally predicated on the ability to examine a system post-breach: pull memory dumps, file system artifacts, system logs, etc. But what happens when that system was part of a fleet of containers? How do you pull a memory dump from an ephemeral container? How do you do forensics when the container and the host that ran the container have been gone for days? Even assuming you catch an intrusion while it's ongoing, how do you respond effectively if you can't access the systems in question because they are read-only, with no SSH access? Coinbase has spent the last year attacking these challenges in an AWS-based, immutable and fully containerized infrastructure that stores over a billion dollars of digital currency. Come see how we do it.

  1. DockIR: Incident Response in a Containerized, Immutable, Continually Deployed Environment
  2. DockIR: Incident Response in a Containerized, Immutable, Continually Deployed Environment
  3. Who Am I
     • Security guy at Coinbase
     • We protect $5+ billion in digital assets (bitcoin, ethereum, litecoin) on a 100% container-based, AWS, continually deployed infra
     • Long-time container enthusiast
     • Occasional open source software dev
     • https://github.com/Phillipmartin
  4. What we're not covering
     • Secure configuration of Docker: this is a deep, complex topic, and there are some great resources online, including a CIS benchmark (https://www.cisecurity.org/cis-benchmarks/)
     • AWS security: lots of great talks about this
     • SDLC impact of containers and CD: this deserves its own talk (and there are several good ones)
     • Secure deployment concepts
  5. What we are covering
     • Some Coinbase context
     • Preparation: things that we do because containers make it necessary or because containers make it easier/possible
     • Detection: the different things we can or must do to detect evil in our environment
     • Response: things that we can do once we do detect evil
  6. Why should we care? Docker CI/CD
  7. Context
  8. Preparation
     Challenge: System and software inventories are on essentially every list of critical security controls. How do we gain visibility into the container OS and software, and how can we reconstruct historical inventory when it changes so rapidly?
     Response:
     • Establish managed base container OSs
     • Only allow deployment of whitelisted containers
     • Use Clair to scan *and log* container packages in the CD pipeline
     • Controls on Dockerfile contents (e.g. don’t allow RUN curl > filesystem)
  9. Preparation
     Opportunity: Containers should be single purpose and don’t generally need a full userspace environment or the ability to call all syscalls.
     Response:
     • Make the managed base OSs as minimal as possible (Alpine is great for this)
     • If you can, abstract this away from developers entirely
     • We actually use scratch containers for some Go services
     • Docker already has sane defaults for capabilities and seccomp; don’t turn them off
     • Because we follow an immutable deployment concept, we can actually deploy many containers fully read-only
  10. Detection
     Challenge: Containers, by design, should be a single purpose environment. That means no room for agent-based security solutions. How do we get telemetry?
     Response:
     • There are a number of vendors out there that answer this question in a bunch of different ways. I’m not going to talk about any of them.
     • We answer this question using a combo of surveillance from the host OS based on auditd (and one day soon eBPF!), sidecar containers that inspect specific things (e.g. DNS logging, rolling pcap, etc.), and very verbose application logs
     • All of this is shoved into a Kinesis stream and sucked into our log pipeline
  11. Detection
     Opportunity: Containers, by design, should be a single purpose environment. Does that mean we can do real whitelisting?
     Response: Mostly.
     • Whitelisting exec calls per process/container
     • Whitelisting connect calls per process/container
  12. Response
     Challenge: If attacker dwell time is measured in weeks or months and container lifetimes are measured in hours or days, how do you effectively investigate the full scope of a breach?
     Response: This is one of the core problems, to me, in an environment with broad adoption of CD.
     • Log everything (audit, docker logs, system logs, etc.)
     • Enrich log lines when they are logged (e.g. IP 10.2.3.4 may be hosting app A today, but app B by the time a breach is detected)
     • Keep logs for years (even if they fall into cold storage after a while)
     • For highly vulnerable or critical services, consider saving some filesystem or memory artifacts
  13. Response
     Challenge: How do I isolate a container for forensics?
     Response: At the AWS level this is fairly well understood. Changing the security group for an instance can be done on the fly and is an effective way to ensure that the potentially compromised host can only talk to IR tooling. At the host level there are a few options:
     • docker pause will pause all processes in the container. This can be useful if you need to mitigate an incident but can’t investigate this host right now
     • Network isolation using docker network disconnect (unless you are using --network=host)
     • Network isolation using iptables and the DOCKER-USER chain (docker 17+)
  14. Response
     Challenge: How can I do live response on a container?
     Response:
     • Standard docker commands provide a lot of insight (more in a sec): inspect, diff, cp, export, pause, exec
     • But fundamentally, docker containers are reflected on the host as some processes with specific restrictions, so most of your normal tools will mostly work (e.g. strace, gdb, etc.)
     • You can even grab process memory directly using /proc/$pid/mem and /proc/$pid/maps or something like gcore from the host
     • A full memory image of the host will also capture the running processes, but a bunch of details will be wacky because of namespaces (e.g. paths, PIDs, UIDs, maybe IPs, hostnames, etc.)
  15. Response
     Challenge: How can I do live response on a container?
     Response: Standard docker commands provide a lot of insight
     • docker inspect - dump container metadata
     • docker diff - diff a running container against the base image
     • docker cp - move files in and out of a running container from the host
     • docker export - create a tar of the current state of a running instance’s filesystem
     • docker pause - pause the target instance (using the cgroup freezer)
  16. Response
     Challenge: How do I respond if I have no access?
     Response:
     • Automate your response actions; make it a single script that auto-fetches dependencies
     • It’s actually OK if this is loud in your logging/monitoring. You probably should alert when someone loads a new kernel module for memory dumping
     • Have a process poll SQS (or whatever) for a signal that it should kick off a response
     • I strongly suggest you follow the GRR model and make sure the commands in that channel are signed and authorized keys are hardcoded in the response script
     • Once your response runs, tar it up and encrypt it back to the key that signed the command (or some central key or whatever works for your setup) and upload it to an S3 bucket
  17. Questions?
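
The per-container exec whitelisting mentioned on slide 11 can be sketched roughly as follows. This is only an illustration, not the implementation described in the talk: the event shape (a dict with `container` and `exe` keys), the whitelist contents, and the name `find_violations` are all assumptions made for the example. In practice the events would come from something like the auditd telemetry described on slide 10.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical whitelist: container (service) name -> executables it may exec.
EXEC_WHITELIST: Dict[str, Set[str]] = {
    "web": {"/usr/local/bin/app"},
    "worker": {"/usr/local/bin/worker", "/bin/sh"},
}


def find_violations(events: Iterable[dict],
                    whitelist: Dict[str, Set[str]] = EXEC_WHITELIST) -> List[dict]:
    """Return exec events whose executable is not whitelisted for its container.

    Containers missing from the whitelist are treated as allowing nothing,
    so every exec they perform is reported.
    """
    violations = []
    for event in events:
        allowed = whitelist.get(event["container"], set())
        if event["exe"] not in allowed:
            violations.append(event)
    return violations


# Hypothetical exec events, e.g. parsed from host audit logs.
events = [
    {"container": "web", "exe": "/usr/local/bin/app"},
    {"container": "web", "exe": "/bin/sh"},       # shell spawned in a web container
    {"container": "unknown", "exe": "/bin/ls"},   # container with no whitelist entry
]
for v in find_violations(events):
    print(f"ALERT: {v['container']} ran unexpected binary {v['exe']}")
```

The same lookup shape would work for connect calls by keying on destination address and port instead of executable path.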
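
Slide 12 suggests enriching log lines at the moment they are logged, because an IP may belong to a different app by the time a breach is detected. A minimal sketch of that idea, assuming a hypothetical in-memory `CURRENT_OWNERS` mapping and helper names (`enrich`, `lookup_owner`) that are not from the talk; a real lookup would presumably query the scheduler or cloud API:

```python
import json
import time
from typing import Callable, Dict


def enrich(record: dict, lookup: Callable[[str], Dict[str, str]], now: float) -> dict:
    """Return a copy of record with the current owner of its IP attached.

    The owner is resolved at log time and stored in the record, so later
    changes to the IP-to-app mapping do not affect what was logged.
    """
    enriched = dict(record)
    enriched["logged_at"] = now
    enriched.update(lookup(record["ip"]))
    return enriched


# Hypothetical point-in-time mapping of IP -> workload metadata.
CURRENT_OWNERS = {"10.2.3.4": {"app": "app-a", "container_id": "abc123"}}


def lookup_owner(ip: str) -> Dict[str, str]:
    # Unknown IPs are labeled explicitly rather than silently dropped.
    return CURRENT_OWNERS.get(ip, {"app": "unknown", "container_id": "unknown"})


line = enrich({"ip": "10.2.3.4", "msg": "outbound connect"}, lookup_owner, time.time())
print(json.dumps(line, sort_keys=True))
```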
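
The host-level isolation options from slide 13 (docker pause, docker network disconnect, and a DOCKER-USER iptables rule) could be wrapped in a small helper like the one below. This is a sketch under assumptions: the function names, the ordering of steps, and the example container name, network, and IP are illustrative. It defaults to a dry run that only prints the commands.

```python
import subprocess
from typing import List


def pause_cmd(container: str) -> List[str]:
    # Freeze every process in the container without killing it.
    return ["docker", "pause", container]


def disconnect_cmd(network: str, container: str) -> List[str]:
    # Detach the container from a Docker network (not applicable with --network=host).
    return ["docker", "network", "disconnect", network, container]


def drop_cmd(container_ip: str) -> List[str]:
    # Insert a rule at the top of DOCKER-USER dropping forwarded traffic
    # that originates from the container's IP. Traffic destined for the
    # container would need a matching rule using -d instead of -s.
    return ["iptables", "-I", "DOCKER-USER", "-s", container_ip, "-j", "DROP"]


def isolate(container: str, network: str, container_ip: str,
            dry_run: bool = True) -> List[List[str]]:
    """Build the isolation commands and print them (dry run) or execute them."""
    plan = [
        drop_cmd(container_ip),
        disconnect_cmd(network, container),
        pause_cmd(container),
    ]
    for argv in plan:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)
    return plan


# Hypothetical target; dry run only prints the commands.
isolate("suspect_container", "bridge", "172.17.0.2")
```

In a real incident you would probably pick one of these options rather than all three; they are combined here only to show the command shapes.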
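
Slide 14 notes that process memory can be grabbed from the host via /proc/$pid/maps and /proc/$pid/mem. A rough sketch of that approach follows, assuming a Linux host and sufficient privileges (typically root) to read the target process. The function names are made up for this example, and the output is a simple concatenation of regions; a real tool (or gcore) would also record where each region came from.

```python
from typing import List, Tuple


def readable_regions(maps_text: str) -> List[Tuple[int, int]]:
    """Parse /proc/<pid>/maps text into (start, end) pairs for readable mappings."""
    regions = []
    for line in maps_text.splitlines():
        fields = line.split()
        # fields[0] is "start-end" in hex, fields[1] is permissions like "r-xp".
        if len(fields) < 2 or not fields[1].startswith("r"):
            continue
        start, end = (int(x, 16) for x in fields[0].split("-"))
        regions.append((start, end))
    return regions


def dump_memory(pid: int, out_path: str) -> int:
    """Copy the readable regions of a process into out_path; return bytes written."""
    with open(f"/proc/{pid}/maps") as f:
        regions = readable_regions(f.read())
    written = 0
    with open(f"/proc/{pid}/mem", "rb", buffering=0) as mem, open(out_path, "wb") as out:
        for start, end in regions:
            try:
                mem.seek(start)
                data = mem.read(end - start)
            except (OSError, ValueError, OverflowError):
                continue  # some special regions cannot be read this way
            out.write(data)
            written += len(data)
    return written


# Example maps content (illustrative addresses), parsed without touching /proc.
sample = (
    "00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/example\n"
    "00651000-00652000 ---p 00051000 08:02 173521 /usr/bin/example\n"
    "7ffc0000-7ffe1000 rw-p 00000000 00:00 0 [stack]\n"
)
print([(hex(s), hex(e)) for s, e in readable_regions(sample)])
```

The PID passed to `dump_memory` is the host-side PID of the container process, since the host sees container processes outside their PID namespace.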
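
Slide 16 recommends that commands arriving over the response channel be signed, with authorized keys hardcoded in the response script. The sketch below shows the verify-before-act shape only. Note the substitution: it uses an HMAC with a shared secret from the Python standard library as a stand-in, whereas a real deployment following the GRR model would more likely use public-key signatures so the host holds no signing secret. The key, action names, and message envelope are all hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical hardcoded key material. HMAC is a stand-in for the
# public-key signature check a real deployment would likely use.
AUTHORIZED_KEY = b"replace-with-real-key-material"
ALLOWED_ACTIONS = {"collect_memory", "collect_filesystem", "pause_container"}


def sign(command: dict, key: bytes = AUTHORIZED_KEY) -> str:
    """Compute a hex MAC over a canonical JSON encoding of the command."""
    payload = json.dumps(command, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_and_parse(message: str, key: bytes = AUTHORIZED_KEY) -> dict:
    """Return the command from a queue message, or raise ValueError."""
    envelope = json.loads(message)
    command, signature = envelope["command"], envelope["signature"]
    expected = sign(command, key).encode()
    # Constant-time comparison of the expected and supplied MACs.
    if not hmac.compare_digest(expected, str(signature).encode()):
        raise ValueError("bad signature")
    if command.get("action") not in ALLOWED_ACTIONS:
        raise ValueError("unknown action")
    return command


# Hypothetical message as it might be received from the queue.
cmd = {"action": "collect_memory", "target": "abc123"}
msg = json.dumps({"command": cmd, "signature": sign(cmd)})
print(verify_and_parse(msg))
```

Restricting the accepted actions to a fixed set means that even a validly signed message cannot ask the responder to run arbitrary commands.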

Editor's Notes

  • 3

    Define containerized, immutable, CD

    Take a quick poll:

    Who has some experience with containers?
    Who has used or dealt with containers in production at scale?

    Immutable

    CD

  • 5

    We’re talking about what we do in the context of Coinbase, so a brief intro to the environment is in order. (If you want a deeper look, our infra team talks and writes about this stuff a lot.)

    Codeflow
    Geoengineer
    Deployment concepts
    Developer contract



  • 5

    Alpha support in Kubernetes for whitelisted containers/sources
