The maturation of containerization platforms has changed how people think about creating development environments and has eliminated many inefficiencies for deploying applications. These concept and technologies have made its way into the PostgreSQL ecosystem as well, and tools such as Docker and Kubernetes have enabled teams to run their own “database-as-a-service” on the infrastructure of their choosing.
In this talk, we will cover the following:
- Why containers are important and what they mean for PostgreSQL
- Setting up and managing a PostgreSQL along with pgadmin4 and monitoring
- Running PostgreSQL on Kubernetes with a Demo
- Trends in the container world and how it will affect PostgreSQL
Dev Dives: Streamline document processing with UiPath Studio Web
Using PostgreSQL With Docker & Kubernetes - July 2018
1. An Introduction to Using PostgreSQL
with Docker & Kubernetes
JONATHAN S. KATZ
JULY 19, 2018
LOS ANGELES POSTGRESQL USER GROUP
2. About Crunchy Data
2
• Leading provider of trusted open source PostgreSQL and
PostgreSQL related technologies, support, and training
to enterprises
• We're hiring!
• crunchydata.com
• @crunchydb
3. • Director of Communications, Crunchy Data
• Previously: Engineering leadership in startups
• Longtime PostgreSQL community contributor
• Advocacy & various committees for PGDG
• @postgresql + .org content
• Director, PgUS
• Co-Organizer, NYCPUG
• Conference organization + speaking
• @jkatz05
About Me
3
4. • Containers: A Brief History
• Containers + PostgreSQL
• Setting up PostgreSQL with Containers
• Deploying! - Container Orchestration
• Look Ahead: Trends in the Container World
Outline
4
5. • Containers are processes that
encapsulate all the requirements to
execute an application
• Similar to virtual machines,
Sandbox for applications similar to
a virtual machine but with
increased density on a single host
What Are Containers?
5
Source: Docker
6. • Container Image - the file that describes how to build a container
• Container Engine - prepares for container to be executed by container
runtime by collecting container images, accepting user input, preparing
mount points, etc. Examples: docker, CRI-O, RKT, LXD
• Container Runtime - Takes information passed from container engine and
sets up containerized process. Open Containers Initiative (OCI) helping to
standardize on runc
• Container - The runtime instantiation of a Container Image, i.e. a process!
Container Glossary
6 Source: https://developers.redhat.com/blog/2018/02/22/container-terminology-practical-introduction/
7. • Lightweight
• compared to virtual machines, use less disk, RAM, CPU
• Sandboxed
• Container runtime is isolated from other processes
• Portability
• Containers can be run on different platforms as long as container engine is available
• Convenience
• Requirements for running applications bundled together
• Prevents messy dependency overlaps
Why Containers?
7
11. • Containers provide several advantages to running PostgreSQL:
• Setup & distribution for developer environments
• Ease of packaging extensions & minor upgrades
• Separate out secondary applications (monitoring, administration)
• Automation and scale for provisioning and creating replicas, backups
Containers & PostgreSQL
11
12. • Containers also introduce several challenges:
• Administrator needs to understand and select appropriate storage
options
• Configuration for individual database specifications and user access
• Managing 100s - 1000s of containers requires appropriate
orchestration (more on that later)
• Still a database within the container; standard DBA tuning applies
• However, these are challenges you will find in most database environments
Containers & PostgreSQL
12
13. • We will use the Crunchy Container Suite
• PostgreSQL (+ PostGIS): our favorite database; option to add our favorite geospatial extension
• pgpool + pgbouncer: connection pooling, load balancing
• pgbackrest: terabyte-scale backup management
• Monitoring: Prometheus + export
• Scheduling: "crunchy-dba"
• pgadmin4: UX-driven management
• Open source!
• Apache 2.0 license
• Support for Docker 1.12+, Kubernetes 1.5+
• Actively maintained and updated
Getting Started With Containers & PostgreSQL
13
https://github.com/CrunchyData/crunchy-containers
17. Demo: Adding Monitoring
17
cat << EOF > collect-env.list
DATA_SOURCE_NAME=postgresql://postgres:password@postgres:5432/postgres?sslmode=disable
EOF
docker run
--env-file=collect-env.list
--network=pgnetwork
--name=collect
--hostname=collect
--detach crunchydata/crunchy-collect:centos7-10.4-2.0.0
docker volume create --driver local --name=prometheus
cat << EOF > prometheus-env.list
COLLECT_HOST=collect
SCRAPE_INTERVAL=5s
SCRAPE_TIMEOUT=5s
EOF
docker run
--publish 9090:9090
--env-file=prometheus-env.list
--volume prometheus:/data
--network=pgnetwork
--name=prometheus
--hostname=prometheus
--detach crunchydata/crunchy-prometheus:centos7-10.4-2.0.0
docker volume create --driver local --name=grafana
cat << EOF > grafana-env.list
ADMIN_USER=jkatz
ADMIN_PASS=password
PROM_HOST=prometheus
PROM_PORT=9090
EOF
docker run
--publish 3000:3000
--env-file=grafana-env.list
--volume grafana:/data
--network=pgnetwork
--name=grafana
--hostname=grafana
--detach crunchydata/crunchy-grafana:centos7-10.4-2.0.0
1. Set up the metric collector
2. Set up prometheus to store metrics 3. Set up grafana to visualize
18. • Explored what / why / how of containers
• Set up a PostgreSQL 10 instance
• Set up pgadmin4 to manage our PostgreSQL instance
• Set up monitoring to analyze performance of our system
• Of course, the next question naturally is:
Recap
18
20. • "Open-source system for
automating deployment, scaling,
and management of containerized
applications."
• Manage the full lifecycle of a
container
• Assists with scheduling, scaling,
failover, high-availability, and more
Kubernetes: Container Orchestration
20 Source: https://kubernetes.io
21. • Value of Kubernetes increases
exponentially as number of containers
increases
• Due to statefulness of databases,
Kubernetes requires more knowledge to
successfully operate a standard database
workload:
• Avoid scheduling and availability issues
for longer-running database containers
• Data continues to exist even if
container does not
When to Use Kubernetes
21
22. • Node: A Kubernetes "worker" machine that is able to run pods
• Pod: One or more running containers; the "atomic" unit of Kubernetes
• Service: The access point to a set of Pods
• ReplicaSet: Ensures that a specified number of replica Pods are running at a given time
• Deployment: A controller that ensures all running Pods / ReplicaSets match the desired
state of the execution environment (total number of pods, resources, etc.)
• Persistent Volume (PV): A storage API that enables information to persist after a Pod has
terminated
• Persistent Volume Claim (PVC): Enables a PV to be mounted to a container, includes
information such as amount of storage. Used for dynamic provisioning
Kubernetes Glossary Important for PostgreSQL
22 Source: https://kubernetes.io/docs/reference/glossary/?fundamental=true&storage=true
23. • Kubernetes provide the gateway to run your own "database-as-a-service:"
• Mass apply databases commands:
• Updates
• Backups / Restores
• ACL rule changes
• Scale up / down replicas
• Failover
23
PostgreSQL in a Kubernetes World
24. • Kubernetes is "turnkey" for stateless applications
• e.g. web servers
• Databases do maintain state: permanent storage
• Persistent Volumes (PV)
• Persistent Volume Claims (PVC)
PostgreSQL in a Kubernetes World
24
25. • Utilizes Operator framework initially launched by CoreOS to
help capture nuances of managing complex applications that
maintain state, e.g. databases
• Allows an administrator to run PostgreSQL-specific
commands to manage database clusters, including:
• Creating / Deleting a cluster (your own DBaaS)
• Scaling up / down replicas
• Failover
• Apply user policies to PostgreSQL instances
• Define what container resources to use (RAM, CPU, etc.)
• Smart pod deployments to nodes
• REST API
Crunchy PostgreSQL Operator
25
https://github.com/CrunchyData/postgres-operator
26. • Automation: Complex, multi-step DBA tasks
reduced to one-line commands
• Standardization: Many customizations, same
workflow
• Ease-of-Use: Simple CLI; UI in beta
• Scale
• Provision & manage clusters quickly
amongst thousands of instances
• Load balancing, disaster recovery, security
policies, deployment specifications
• Security: Sandboxed environments, RBAC,
mass grant/revoke policies
Why Use An Operator With PostgreSQL?
26
30. • Containers are no longer "new" - orchestration technologies have matured
• Debate with containers + databases: storage & management
• No different than virtual machines + databases
• Databases are still databases: need expertise to manage
• Stateful Sets vs. Deployments
• Database deployment automation flexibility
• Deploy your architecture to any number of clouds
• Monitoring: A new frontier
Containerized PostgreSQL: Looking Ahead
30
31. • Containers + PostgreSQL gives you:
• Easy-to-setup development environments
• Your own production database-as-a-service
• Tools to automate management of over 1000s of instances in short-
order
Conclusion
31