Sane Schema Management with
Alembic and SQLAlchemy
Selena Deckelmann
Mozilla
@selenamarie
chesnok.com
I work on Socorro.
http://github.com/mozilla/socorro
http://crash-stats.mozilla.com
Thanks and apologies to Mike Bayer
What's sane schema management?
Executing schema change in a controlled,
repeatable way while working with
developers and operations.
What's alembic?
Alembic is a schema migration tool that
integrates with SQLAlchemy.
My assumptions:
● Schema migrations are frequent.
● Automated schema migration is a goal.
● Stage environment is enough like
production for testing.
● Writing a small amount of code is ok.
No tool is perfect.
DBAs should drive migration tool
choice.
Choose a tool that your developers like.
Or, don't hate.
Part 0: #dbaproblems
Part 1: Why we should work with developers
on migrations
Part 2: Picking the right migration tool
Part 3: Using Alembic
Part 4: Lessons Learned
Part 5: Things Alembic could learn
Part 0: #dbaproblems
Migrations are hard.
And messy.
And necessary.
Changing a CHECK constraint
on 1000+ partitions.
http://tinyurl.com/q5cjh45
What sucked about this:
● Wasn't the first time (see 2012 bugs)
● Change snuck into partitioning UDF
Jan-April 2013
● No useful audit trail
● Some partitions affected, not others
● Error dated back to 2010
● Wake up call to examine process!
Process before Alembic:
What was awesome:
● Used Alembic to manage the change
● Tested in stage
● Experimentation revealed which
partitions could be modified without
deadlocking
● Rolled out change with a regular release
during normal business hours
Process with Alembic:
1. Make changes to model.py or
raw_sql files
2. Run: alembic revision --autogenerate
3. Edit revision file
4. Commit changes
5. Run migration on stage after
auto-deploy of a release
Problems Alembic solved:
● Easy-to-deploy migrations including
UDFs for dev and stage
● Can embed raw SQL, issue multi-
commit changes
● Includes downgrades
Problems Alembic solved:
● Enables database change discipline
● Enables code review discipline
● Revisions are decoupled from release
versions and branch commit order
Problems Alembic solved (continued):
● 100k+ lines of code removed
● No more post-deploy schema
checkins
● Enabling a tested, automated stage
deployment
● Separated schema definition from
version-specific configuration
Photo courtesy of secure.flickr.com/photos/lambj
HAPPY
AS A CAT IN A BOX
Part 1: Why we should work
with developers on migrations
Credit: flickr.com/photos/chrisyarzab/
Schemas change.
Developers find this process really
frustrating.
Schemas, what are they good for?
Signal intent
Communicate ideal state of data
Highly customizable in Postgres
Schemas, what are they not so good for?
Rapid iteration
Documenting evolution
Major changes on big data
Data experimentation
Database systems resist change.
Database systems resist change because:
Exist at the center of multiple systems
Stability is a core competency
Schema often is the only API between
components
How do we make changes to schemas?
Because of resistance, we treat
schema change as a one-off.
Evolution of schema change process
We're in charge of picking up the pieces when
a poorly-executed schema change plan fails.
Trick question:
When is the right time to work with
developers on a schema change?
How do we safely make changes to schemas?
Process and tooling.
Preferably, that we choose and implement.
Migration tools are really
configuration management tools.
Migrations are for:
● Communicating change
● Communicating process
● Executing change in a controlled,
repeatable way with developers and
operations
Part 2: Picking the right migration tool
Questions to ask:
● How often does your schema change?
● Can the migrations be run without you?
● Can you test a migration before you run
it in production?
Questions to ask:
● Can developers create a new schema
without your help?
● How hard is it to get from an old
schema to a new one using the tool?
● Are change rollbacks a standard use of
the tool?
What does our system need to do?
● Communicate change
● Apply changes in the correct order
● Apply a change only once
● Use raw SQL where needed
● Provide a single interface for change
● Rollback gracefully
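Three of these requirements (correct order, apply only once, rollback) can be sketched with a toy runner. This is a minimal illustration using Python's built-in sqlite3, not how Alembic is implemented; the schema_version table, the MIGRATIONS list and the table names are all hypothetical.

```python
import sqlite3

# Hypothetical migrations: (version, upgrade SQL, downgrade SQL), oldest first.
MIGRATIONS = [
    (1, "CREATE TABLE reports (id INTEGER PRIMARY KEY)", "DROP TABLE reports"),
    (2, "CREATE TABLE signatures (id INTEGER PRIMARY KEY)", "DROP TABLE signatures"),
]


def current_version(conn):
    """Return the highest applied version, or 0 for an empty database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def upgrade(conn):
    """Apply every pending migration in order; return the versions applied."""
    applied = []
    for version, up_sql, _down_sql in MIGRATIONS:
        if version > current_version(conn):
            with conn:  # commit once the change and its version row are recorded
                conn.execute(up_sql)
                conn.execute("INSERT INTO schema_version VALUES (?)", (version,))
            applied.append(version)
    return applied


def downgrade(conn):
    """Roll back the most recent migration; return its version, or None."""
    version = current_version(conn)
    for v, _up_sql, down_sql in MIGRATIONS:
        if v == version:
            with conn:
                conn.execute(down_sql)
                conn.execute("DELETE FROM schema_version WHERE version = ?", (v,))
            return v
    return None
```

Running upgrade() a second time is a no-op because the version table already records what was applied. That bookkeeping is the part a migration tool takes off your hands.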
How you are going to feel
about the next slide:
Use an ORM with the migration tool.
Shameful admission:
We had three different ways of defining
schema in our code and tests.
A good ORM provides:
● One source of truth about the schema
● Reusable components
● Database version independence
● Ability to use raw SQL
And good ORM stewardship:
● Fits with existing tooling and
developer workflows
● Enables partnership with developers
● Integrates with a testing framework
And:
● Gives you a new way to think about
schemas
● Develops compassion for how
horrible ORMs can be
● Gives you developer-friendly
vocabulary for discussing why ORM-
generated code is often terrible
Part 3: Using Alembic
Practical Guide to using Alembic
http://tinyurl.com/po4mal6
https://alembic.readthedocs.org
revision: a single migration
down_revision: previous migration
upgrade: apply 'upgrade' change
downgrade: apply 'downgrade' change
offline mode: emit raw SQL for a change
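The revision / down_revision pairing forms a chain from the newest migration back to the first. A minimal sketch of walking that chain, assuming a strictly linear history; the revision ids and the upgrade_path helper are hypothetical and this is not Alembic's own code:

```python
# Hypothetical revision ids mapped to their down_revision.
REVISIONS = {
    "a1": None,  # first migration: nothing comes before it
    "b2": "a1",
    "c3": "b2",
}


def upgrade_path(revisions, current, target):
    """Return the revisions to apply, oldest first, to get from current to target."""
    path = []
    rev = target
    while rev != current:
        if rev is None:
            # Walked past the first migration without meeting `current`.
            raise ValueError("current revision is not an ancestor of target")
        path.append(rev)
        rev = revisions[rev]
    return list(reversed(path))
```

upgrade_path(REVISIONS, "a1", "c3") gives ["b2", "c3"]: the upgrades that would run, in order, to bring a database at "a1" up to "c3".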
Installing and using:
virtualenv venv-alembic
. venv-alembic/bin/activate
pip install alembic
alembic init alembic
vi alembic.ini
alembic revision -m "new"
alembic upgrade head
alembic downgrade -1
Defining a schema?
vi env.py
Add: import myproj.model
Helper functions?
Put your helper functions in a custom
library and add this to env.py:
import myproj.migrations
Ignore certain schemas or partitions?
In env.py:
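import re

def include_symbol(tablename, schema):
    # Keep the default and "bixie" schemas; skip partitions (names ending in _ plus 8 digits)
    return (schema in (None, "bixie")
            and re.search(r'_\d{8}$', tablename) is None)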
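A quick way to check a filter like this before wiring it into env.py is to run it over a few names. A minimal sketch; the table and schema names in candidates are made up for illustration, and the assumption is that partitions are named with an eight-digit suffix:

```python
import re


def include_symbol(tablename, schema):
    # Keep tables in the default schema or "bixie" whose names do not
    # end in an underscore followed by eight digits.
    return (schema in (None, "bixie")
            and re.search(r"_\d{8}$", tablename) is None)


# Hypothetical (table, schema) pairs.
candidates = [
    ("reports", None),            # plain table, default schema
    ("reports_20130805", None),   # partition-style name
    ("crashes", "bixie"),         # table in the "bixie" schema
    ("events", "other_schema"),   # schema outside the allowed set
]

kept = [name for name, schema in candidates if include_symbol(name, schema)]
```

Only "reports" and "crashes" pass the filter. Newer Alembic documentation describes include_object and include_name hooks for this kind of filtering, so check which hook your installed version expects.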
Manage User Defined Functions?
Chose to use raw SQL files
3 directories, 128 files:
procs/ types/ views/
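import os

codepath = '/socorro/external/pg/raw_sql/procs'

def load_stored_proc(op, filelist):
    # Locate the procs directory under the current working directory
    app_path = os.getcwd() + codepath
    for filename in filelist:
        sqlfile = os.path.join(app_path, filename)
        # Send each file's SQL to the database through Alembic's op.execute()
        with open(sqlfile, 'r') as stored_proc:
            op.execute(stored_proc.read())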
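A loader like this can be exercised without a database by passing in a stand-in for op. A minimal sketch: RecordingOp, load_sql_files and the add_one.sql file are hypothetical names used only for this demonstration, and the directory is passed as an argument rather than derived from the working directory.

```python
import os
import tempfile


def load_sql_files(op, directory, filelist):
    """Execute each SQL file in filelist, in order, through op.execute()."""
    for filename in filelist:
        sqlfile = os.path.join(directory, filename)
        with open(sqlfile, "r") as stored_proc:
            op.execute(stored_proc.read())


class RecordingOp:
    """Stand-in for Alembic's op: records SQL instead of running it."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)


with tempfile.TemporaryDirectory() as procs_dir:
    # Write one hypothetical stored-procedure file to load.
    with open(os.path.join(procs_dir, "add_one.sql"), "w") as f:
        f.write(
            "CREATE OR REPLACE FUNCTION add_one(integer) RETURNS integer "
            "AS $$ SELECT $1 + 1 $$ LANGUAGE sql;"
        )
    op = RecordingOp()
    load_sql_files(op, procs_dir, ["add_one.sql"])
```

Inside a real revision file the op argument would be the object imported with `from alembic import op`, and the recorded statements would be executed against the database instead.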
Stamping database revision?
from alembic.config import Config
from alembic import command
alembic_cfg = Config("/path/to/yourapp/alembic.ini")
command.stamp(alembic_cfg, "head")
Part 4: Lessons Learned
Always roll forward.
1. Put migrations in a separate commit
from schema changes.
2. Revert commits for schema change,
leave migration commit in-place for
downgrade support.
Store schema objects in the smallest,
reasonable, composable unit.
1. Use an ORM for core schema.
2. Put types, UDFs and views in separate
files.
3. Consider storing the schema in a
separate repo from the application.
Write tests. Run them every time.
1. Write a simple tool to create a new
schema from scratch.
2. Write a simple tool to generate fake
data.
3. Write tests for these tools.
4. When anything fails, add a test.
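The first three items can be sketched in a few lines. This is a toy version using Python's built-in sqlite3 rather than SQLAlchemy and Postgres; the tables, product names and function names are hypothetical.

```python
import random
import sqlite3

# Hypothetical schema, created from scratch in dependency order.
SCHEMA = [
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    "CREATE TABLE reports ("
    " id INTEGER PRIMARY KEY,"
    " product_id INTEGER NOT NULL REFERENCES products(id),"
    " signature TEXT NOT NULL)",
]


def create_schema(conn):
    """Tool 1: build the whole schema in an empty database."""
    for statement in SCHEMA:
        conn.execute(statement)
    conn.commit()


def generate_fake_data(conn, n_reports, seed=0):
    """Tool 2: insert repeatable fake rows (seeded random generator)."""
    rng = random.Random(seed)
    products = ["Firefox", "Thunderbird"]
    conn.executemany(
        "INSERT INTO products (name) VALUES (?)", [(p,) for p in products]
    )
    rows = [
        (rng.randint(1, len(products)), "sig_%04d" % rng.randint(0, 9999))
        for _ in range(n_reports)
    ]
    conn.executemany(
        "INSERT INTO reports (product_id, signature) VALUES (?, ?)", rows
    )
    conn.commit()


def test_fake_data_round_trip():
    """Test for both tools: row count matches and no report lacks a product."""
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    generate_fake_data(conn, n_reports=25)
    (count,) = conn.execute("SELECT COUNT(*) FROM reports").fetchone()
    (orphans,) = conn.execute(
        "SELECT COUNT(*) FROM reports r"
        " LEFT JOIN products p ON p.id = r.product_id"
        " WHERE p.id IS NULL"
    ).fetchone()
    assert count == 25
    assert orphans == 0
```

A test like this runs on every commit, so a schema change that breaks creation from scratch shows up before it reaches stage.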
Part 5: What Alembic could learn
1. Understand partitions
2. Never apply a DEFAULT to a new
column
3. Help us manage UDFs better
4. INDEX CONCURRENTLY
5. Prettier syntax for multi-commit
sequences
Epilogue
No tool is perfect.
DBAs should drive migration tool
choice.
Choose a tool that your developers like.
Or, don't hate.
Other tools:
Sqitch
http://sqitch.org/
Written by PostgreSQL contributor
Erwin
http://erwin.com/
Commercial, popular with Oracle
South
http://south.aeracode.org/
Django-specific, well-supported
Alembic resources:
bitbucket.org/zzzeek/alembic
alembic.readthedocs.org
groups.google.com/group/
sqlalchemy-alembic

More Related Content

What's hot

Apache JMeter Introduction
Apache JMeter IntroductionApache JMeter Introduction
Apache JMeter Introduction
Søren Lund
 
Testing Rapidly Changing Applications With Self-Testing Object-Oriented Selen...
Testing Rapidly Changing Applications With Self-Testing Object-Oriented Selen...Testing Rapidly Changing Applications With Self-Testing Object-Oriented Selen...
Testing Rapidly Changing Applications With Self-Testing Object-Oriented Selen...
seleniumconf
 
Ppt of soap ui
Ppt of soap uiPpt of soap ui
Ppt of soap ui
pkslide28
 

What's hot (19)

Selenium
SeleniumSelenium
Selenium
 
Oracle Unit Testing with utPLSQL
Oracle Unit Testing with utPLSQLOracle Unit Testing with utPLSQL
Oracle Unit Testing with utPLSQL
 
Selenium
SeleniumSelenium
Selenium
 
Introduction to SoapUI day 4-5
Introduction to SoapUI day 4-5Introduction to SoapUI day 4-5
Introduction to SoapUI day 4-5
 
All Aboard for Laravel 5.1
All Aboard for Laravel 5.1All Aboard for Laravel 5.1
All Aboard for Laravel 5.1
 
North east user group tour
North east user group tourNorth east user group tour
North east user group tour
 
Sencha Roadshow 2017: Best Practices for Implementing Continuous Web App Testing
Sencha Roadshow 2017: Best Practices for Implementing Continuous Web App TestingSencha Roadshow 2017: Best Practices for Implementing Continuous Web App Testing
Sencha Roadshow 2017: Best Practices for Implementing Continuous Web App Testing
 
Apache JMeter Introduction
Apache JMeter IntroductionApache JMeter Introduction
Apache JMeter Introduction
 
CollabSphere 2021 - DEV114 - The Nuts and Bolts of CI/CD With a Large XPages ...
CollabSphere 2021 - DEV114 - The Nuts and Bolts of CI/CD With a Large XPages ...CollabSphere 2021 - DEV114 - The Nuts and Bolts of CI/CD With a Large XPages ...
CollabSphere 2021 - DEV114 - The Nuts and Bolts of CI/CD With a Large XPages ...
 
Introduction to SoapUI day 1
Introduction to SoapUI day 1Introduction to SoapUI day 1
Introduction to SoapUI day 1
 
Eclipse workshop presentation (March 2016)
Eclipse workshop presentation (March 2016)Eclipse workshop presentation (March 2016)
Eclipse workshop presentation (March 2016)
 
Testing Rapidly Changing Applications With Self-Testing Object-Oriented Selen...
Testing Rapidly Changing Applications With Self-Testing Object-Oriented Selen...Testing Rapidly Changing Applications With Self-Testing Object-Oriented Selen...
Testing Rapidly Changing Applications With Self-Testing Object-Oriented Selen...
 
Story Testing Approach for Enterprise Applications using Selenium Framework
Story Testing Approach for Enterprise Applications using Selenium FrameworkStory Testing Approach for Enterprise Applications using Selenium Framework
Story Testing Approach for Enterprise Applications using Selenium Framework
 
FIT and JBehave - Good, Bad and Ugly
FIT and JBehave - Good, Bad and UglyFIT and JBehave - Good, Bad and Ugly
FIT and JBehave - Good, Bad and Ugly
 
Automated UI testing with Selenium
Automated UI testing with SeleniumAutomated UI testing with Selenium
Automated UI testing with Selenium
 
vodQA Pune (2019) - Browser automation using dev tools
vodQA Pune (2019) - Browser automation using dev toolsvodQA Pune (2019) - Browser automation using dev tools
vodQA Pune (2019) - Browser automation using dev tools
 
Unquoted service path exploitation
Unquoted service path exploitationUnquoted service path exploitation
Unquoted service path exploitation
 
Selenium Automation Framework
Selenium Automation  FrameworkSelenium Automation  Framework
Selenium Automation Framework
 
Ppt of soap ui
Ppt of soap uiPpt of soap ui
Ppt of soap ui
 

Viewers also liked

David Keeney - SQL Database Server Requests from the Browser @ Postgres Open
David Keeney - SQL Database Server Requests from the Browser @ Postgres OpenDavid Keeney - SQL Database Server Requests from the Browser @ Postgres Open
David Keeney - SQL Database Server Requests from the Browser @ Postgres Open
PostgresOpen
 
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres OpenBruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
PostgresOpen
 
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
PostgresOpen
 
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
PostgresOpen
 
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres OpenKevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
PostgresOpen
 
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres OpenKeith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
PostgresOpen
 
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
PostgresOpen
 
Keith Paskett - Postgres on ZFS @ Postgres Open
Keith Paskett - Postgres on ZFS @ Postgres OpenKeith Paskett - Postgres on ZFS @ Postgres Open
Keith Paskett - Postgres on ZFS @ Postgres Open
PostgresOpen
 
Robert Haas Query Planning Gone Wrong Presentation @ Postgres Open
Robert Haas Query Planning Gone Wrong Presentation @ Postgres OpenRobert Haas Query Planning Gone Wrong Presentation @ Postgres Open
Robert Haas Query Planning Gone Wrong Presentation @ Postgres Open
PostgresOpen
 
Steve Singer - Managing PostgreSQL with Puppet @ Postgres Open
Steve Singer - Managing PostgreSQL with Puppet @ Postgres OpenSteve Singer - Managing PostgreSQL with Puppet @ Postgres Open
Steve Singer - Managing PostgreSQL with Puppet @ Postgres Open
PostgresOpen
 
Michael Bayer Introduction to SQLAlchemy @ Postgres Open
Michael Bayer Introduction to SQLAlchemy @ Postgres OpenMichael Bayer Introduction to SQLAlchemy @ Postgres Open
Michael Bayer Introduction to SQLAlchemy @ Postgres Open
PostgresOpen
 
PoPostgreSQL Web Projects: From Start to FinishStart To Finish
PoPostgreSQL Web Projects: From Start to FinishStart To FinishPoPostgreSQL Web Projects: From Start to FinishStart To Finish
PoPostgreSQL Web Projects: From Start to FinishStart To Finish
elliando dias
 
Koichi Suzuki - Postgres-XC Dynamic Cluster Management @ Postgres Open
Koichi Suzuki - Postgres-XC Dynamic Cluster  Management @ Postgres OpenKoichi Suzuki - Postgres-XC Dynamic Cluster  Management @ Postgres Open
Koichi Suzuki - Postgres-XC Dynamic Cluster Management @ Postgres Open
PostgresOpen
 
Michael Paquier - Taking advantage of custom bgworkers @ Postgres Open
Michael Paquier - Taking advantage of custom bgworkers @ Postgres OpenMichael Paquier - Taking advantage of custom bgworkers @ Postgres Open
Michael Paquier - Taking advantage of custom bgworkers @ Postgres Open
PostgresOpen
 

Viewers also liked (20)

Introduction to SQLAlchemy and Alembic Migrations
Introduction to SQLAlchemy and Alembic MigrationsIntroduction to SQLAlchemy and Alembic Migrations
Introduction to SQLAlchemy and Alembic Migrations
 
David Keeney - SQL Database Server Requests from the Browser @ Postgres Open
David Keeney - SQL Database Server Requests from the Browser @ Postgres OpenDavid Keeney - SQL Database Server Requests from the Browser @ Postgres Open
David Keeney - SQL Database Server Requests from the Browser @ Postgres Open
 
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres OpenBruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
 
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
Gurjeet Singh - How Postgres is Different From (Better Tha) Your RDBMS @ Post...
 
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
Ryan Jarvinen Open Shift Talk @ Postgres Open 2013
 
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres OpenKevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
Kevin Kempter - PostgreSQL Backup and Recovery Methods @ Postgres Open
 
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres OpenKeith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
Keith Fiske - When PostgreSQL Can't, You Can @ Postgres Open
 
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
Henrietta Dombrovskaya - A New Approach to Resolve Object-Relational Impedanc...
 
Keith Paskett - Postgres on ZFS @ Postgres Open
Keith Paskett - Postgres on ZFS @ Postgres OpenKeith Paskett - Postgres on ZFS @ Postgres Open
Keith Paskett - Postgres on ZFS @ Postgres Open
 
Islamabad PUG - 7th Meetup - performance tuning
Islamabad PUG - 7th Meetup - performance tuningIslamabad PUG - 7th Meetup - performance tuning
Islamabad PUG - 7th Meetup - performance tuning
 
Islamabad PUG - 7th meetup - performance tuning
Islamabad PUG - 7th meetup - performance tuningIslamabad PUG - 7th meetup - performance tuning
Islamabad PUG - 7th meetup - performance tuning
 
Robert Haas Query Planning Gone Wrong Presentation @ Postgres Open
Robert Haas Query Planning Gone Wrong Presentation @ Postgres OpenRobert Haas Query Planning Gone Wrong Presentation @ Postgres Open
Robert Haas Query Planning Gone Wrong Presentation @ Postgres Open
 
Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)Out of the box replication in postgres 9.4(pg confus)
Out of the box replication in postgres 9.4(pg confus)
 
Steve Singer - Managing PostgreSQL with Puppet @ Postgres Open
Steve Singer - Managing PostgreSQL with Puppet @ Postgres OpenSteve Singer - Managing PostgreSQL with Puppet @ Postgres Open
Steve Singer - Managing PostgreSQL with Puppet @ Postgres Open
 
Michael Bayer Introduction to SQLAlchemy @ Postgres Open
Michael Bayer Introduction to SQLAlchemy @ Postgres OpenMichael Bayer Introduction to SQLAlchemy @ Postgres Open
Michael Bayer Introduction to SQLAlchemy @ Postgres Open
 
PoPostgreSQL Web Projects: From Start to FinishStart To Finish
PoPostgreSQL Web Projects: From Start to FinishStart To FinishPoPostgreSQL Web Projects: From Start to FinishStart To Finish
PoPostgreSQL Web Projects: From Start to FinishStart To Finish
 
Koichi Suzuki - Postgres-XC Dynamic Cluster Management @ Postgres Open
Koichi Suzuki - Postgres-XC Dynamic Cluster  Management @ Postgres OpenKoichi Suzuki - Postgres-XC Dynamic Cluster  Management @ Postgres Open
Koichi Suzuki - Postgres-XC Dynamic Cluster Management @ Postgres Open
 
Gbroccolo pgconfeu2016 pgnfs
Gbroccolo pgconfeu2016 pgnfsGbroccolo pgconfeu2016 pgnfs
Gbroccolo pgconfeu2016 pgnfs
 
PostgreSQL HA
PostgreSQL   HAPostgreSQL   HA
PostgreSQL HA
 
Michael Paquier - Taking advantage of custom bgworkers @ Postgres Open
Michael Paquier - Taking advantage of custom bgworkers @ Postgres OpenMichael Paquier - Taking advantage of custom bgworkers @ Postgres Open
Michael Paquier - Taking advantage of custom bgworkers @ Postgres Open
 

Similar to Selena Deckelmann - Sane Schema Management with Alembic and SQLAlchemy @ Postgres Open

Kelly potvin nosurprises_odtug_oow12
Kelly potvin nosurprises_odtug_oow12Kelly potvin nosurprises_odtug_oow12
Kelly potvin nosurprises_odtug_oow12
Enkitec
 
Anna Fedoruk.Theworkflow.DrupalCamp Kyiv 2011
Anna Fedoruk.Theworkflow.DrupalCamp Kyiv 2011Anna Fedoruk.Theworkflow.DrupalCamp Kyiv 2011
Anna Fedoruk.Theworkflow.DrupalCamp Kyiv 2011
camp_drupal_ua
 
PHP North-East - Automated Deployment
PHP North-East - Automated DeploymentPHP North-East - Automated Deployment
PHP North-East - Automated Deployment
Michael Peacock
 

Similar to Selena Deckelmann - Sane Schema Management with Alembic and SQLAlchemy @ Postgres Open (20)

SynapseIndia drupal presentation on drupal info
SynapseIndia drupal  presentation on drupal infoSynapseIndia drupal  presentation on drupal info
SynapseIndia drupal presentation on drupal info
 
KoprowskiT_Session2_SDNEvent_SourceControlForDBA
KoprowskiT_Session2_SDNEvent_SourceControlForDBAKoprowskiT_Session2_SDNEvent_SourceControlForDBA
KoprowskiT_Session2_SDNEvent_SourceControlForDBA
 
Session on evaluation of DevSecOps
Session on evaluation of DevSecOpsSession on evaluation of DevSecOps
Session on evaluation of DevSecOps
 
5 hs mpostcustomizationrenefonseca
5 hs mpostcustomizationrenefonseca5 hs mpostcustomizationrenefonseca
5 hs mpostcustomizationrenefonseca
 
Wellington MuleSoft Meetup 2021-02-18
Wellington MuleSoft Meetup 2021-02-18Wellington MuleSoft Meetup 2021-02-18
Wellington MuleSoft Meetup 2021-02-18
 
Strategies and Tips for Building Enterprise Drupal Applications - PNWDS 2013
Strategies and Tips for Building Enterprise Drupal Applications - PNWDS 2013Strategies and Tips for Building Enterprise Drupal Applications - PNWDS 2013
Strategies and Tips for Building Enterprise Drupal Applications - PNWDS 2013
 
DevOps Presentation.pptx
DevOps Presentation.pptxDevOps Presentation.pptx
DevOps Presentation.pptx
 
Handling Database Deployments
Handling Database DeploymentsHandling Database Deployments
Handling Database Deployments
 
JChem Microservices
JChem MicroservicesJChem Microservices
JChem Microservices
 
Kelly potvin nosurprises_odtug_oow12
Kelly potvin nosurprises_odtug_oow12Kelly potvin nosurprises_odtug_oow12
Kelly potvin nosurprises_odtug_oow12
 
Best Practices for Enterprise Continuous Delivery of Oracle Fusion Middlewa...
Best Practices for Enterprise Continuous Delivery of Oracle Fusion Middlewa...Best Practices for Enterprise Continuous Delivery of Oracle Fusion Middlewa...
Best Practices for Enterprise Continuous Delivery of Oracle Fusion Middlewa...
 
DevOps: Automate all the things
DevOps: Automate all the thingsDevOps: Automate all the things
DevOps: Automate all the things
 
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher ScientificEnabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
Enabling Scalable Data Science Pipeline with Mlflow at Thermo Fisher Scientific
 
Liquibase få kontroll på dina databasförändringar
Liquibase   få kontroll på dina databasförändringarLiquibase   få kontroll på dina databasförändringar
Liquibase få kontroll på dina databasförändringar
 
OroCRM Partner Technical Training: September 2015
OroCRM Partner Technical Training: September 2015OroCRM Partner Technical Training: September 2015
OroCRM Partner Technical Training: September 2015
 
SANTOSH KUMAR M -FD
SANTOSH KUMAR M -FDSANTOSH KUMAR M -FD
SANTOSH KUMAR M -FD
 
Anna Fedoruk.Theworkflow.DrupalCamp Kyiv 2011
Anna Fedoruk.Theworkflow.DrupalCamp Kyiv 2011Anna Fedoruk.Theworkflow.DrupalCamp Kyiv 2011
Anna Fedoruk.Theworkflow.DrupalCamp Kyiv 2011
 
PHP North-East - Automated Deployment
PHP North-East - Automated DeploymentPHP North-East - Automated Deployment
PHP North-East - Automated Deployment
 
Automated Deployment
Automated DeploymentAutomated Deployment
Automated Deployment
 
Obevo Javasig.pptx
Obevo Javasig.pptxObevo Javasig.pptx
Obevo Javasig.pptx
 

Recently uploaded

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Recently uploaded (20)

Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 

Selena Deckelmann - Sane Schema Management with Alembic and SQLAlchemy @ Postgres Open

  • 1. Sane Schema Management with Alembic and SQLAlchemy Selena Deckelmann Mozilla @selenamarie chesnok.com
  • 2. I work on Socorro. http://github.com/mozilla/socorro http://crash-stats.mozilla.com
  • 3. Thanks and apologies to Mike Bayer
  • 4. What's sane schema management? Executing schema change in a controled, repeatable way while working with developers and operations.
  • 5. What's alembic? Alembic is a schema migration tool that integrates with SQLAlchemy.
  • 6. My assumptions: ● Schema migrations are frequent. ● Automated schema migration is a goal. ● Stage environment is enough like production for testing. ● Writing a small amount of code is ok.
  • 7. No tool is perfect. DBAs should drive migration tool choice. Chose a tool that your developers like. Or, don't hate.
  • 8. Part 0: #dbaproblems Part 1: Why we should work with developers on migrations Part 2: Picking the right migration tool Part 3: Using Alembic Part 4: Lessons Learned Part 5: Things Alembic could learn
  • 10. Migrations are hard. And messy. And necessary.
  • 11.
  • 12. Changing a CHECK constraint on 1000+ partitions. http://tinyurl.com/q5cjh45
  • 13. What sucked about this: ● Wasn't the first time (see 2012 bugs) ● Change snuck into partitioning UDF Jan-April 2013 ● No useful audit trail ● Some partitions affected, not others ● Error dated back to 2010 ● Wake up call to examine process!
  • 14.
  • 16.
  • 17. What was awesome: ● Used Alembic to manage the change ● Tested in stage ● Experimentation revealed which partitions could be modified without deadlocking ● Rolled out change with a regular release during normal business hours
  • 18. Process with Alembic: 1. Make changes to model.py or raw_sql files 2. Run: alembic revision –-auto-generate 3. Edit revision file 4.Commit changes 5. Run migration on stage after auto-deploy of a release
  • 19. Process with Alembic: 1. Make changes to model.py or raw_sql files 2. Run: alembic revision -–auto-generate 3. Edit revision file 4.Commit changes 5. Run migration on stage after auto-deploy of a release
  • 20. Problems Alembic solved: ● Easy-to-deploy migrations including UDFs for dev and stage ● Can embed raw SQL, issue multi- commit changes ● Includes downgrades
  • 21. Problems Alembic solved: ● Enables database change discipline ● Enables code review discipline ● Revisions are decoupled from release versions and branch commit order
  • 22. Problems Alembic solved (continued): ● 100k+ lines of code removed ● No more post-deploy schema checkins ● Enabling a tested, automated stage deployment ● Separated schema definition from version-specific configuration
  • 23. Photo courtesy of secure.flickr.com/photos/lambj HAPPY AS A CAT IN A BOX
  • 24. Part 1: Why we should work with developers on migrations
  • 27. Developers find this process really frustrating.
  • 28. Schemas, what are they good for? ● Signal intent ● Communicate ideal state of data ● Highly customizable in Postgres
  • 29. Schemas, what are they not so good for? ● Rapid iteration ● Documenting evolution ● Major changes on big data ● Data experimentation
  • 31. Database systems resist change because: ● They exist at the center of multiple systems ● Stability is a core competency ● The schema is often the only API between components
  • 32. How do we make changes to schemas?
  • 33. Because of resistance, we treat schema change as a one-off.
  • 34. Evolution of schema change process
  • 35. We're in charge of picking up the pieces when a poorly-executed schema change plan fails.
  • 36. Trick question: When is the right time to work with developers on a schema change?
  • 37. How do we safely make changes to schemas?
  • 38. How do we safely make changes to schemas? Process and tooling. Preferably ones that we choose and implement.
  • 39. Migration tools are really configuration management tools.
  • 40. Migrations are for: ● Communicating change ● Communicating process ● Executing change in a controlled, repeatable way with developers and operations
  • 41. Part 2: Picking the right migration tool
  • 43. Questions to ask: ● How often does your schema change? ● Can the migrations be run without you? ● Can you test a migration before you run it in production?
  • 44. Questions to ask: ● Can developers create a new schema without your help? ● How hard is it to get from an old schema to a new one using the tool? ● Are change rollbacks a standard use of the tool?
  • 45. What does our system need to do? ● Communicate change ● Apply changes in the correct order ● Apply a change only once ● Use raw SQL where needed ● Provide a single interface for change ● Rollback gracefully
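The requirements on this slide (apply changes in order, apply each change only once, roll back gracefully) can be illustrated with a minimal sketch. This is not how Alembic is implemented: the `MIGRATIONS` list, the `schema_version` bookkeeping table, and the helper functions are hypothetical names, and SQLite from the Python standard library stands in for Postgres.

```python
import sqlite3

# Hypothetical migrations: (revision id, upgrade SQL, downgrade SQL), in order.
MIGRATIONS = [
    ("001", "CREATE TABLE reports (id INTEGER PRIMARY KEY)", "DROP TABLE reports"),
    ("002", "CREATE TABLE signatures (id INTEGER PRIMARY KEY)", "DROP TABLE signatures"),
]


def applied(conn):
    """Return the revision ids already recorded, oldest first."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (rev TEXT PRIMARY KEY)")
    return [row[0] for row in conn.execute("SELECT rev FROM schema_version ORDER BY rev")]


def upgrade(conn):
    """Apply every migration not yet recorded, in list order, each exactly once."""
    done = set(applied(conn))
    for rev, up_sql, _ in MIGRATIONS:
        if rev in done:
            continue  # already applied, so never run it a second time
        conn.execute(up_sql)
        conn.execute("INSERT INTO schema_version (rev) VALUES (?)", (rev,))
        conn.commit()


def downgrade(conn):
    """Undo the most recently applied migration, if there is one."""
    done = applied(conn)
    if not done:
        return
    last = done[-1]
    down_sql = next(down for rev, _, down in MIGRATIONS if rev == last)
    conn.execute(down_sql)
    conn.execute("DELETE FROM schema_version WHERE rev = ?", (last,))
    conn.commit()


def table_names(conn):
    """List user tables, leaving out the bookkeeping table."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name != 'schema_version' ORDER BY name"
    )
    return [row[0] for row in rows]
```

Calling `upgrade(conn)` twice in a row leaves the database unchanged the second time, which is the "apply a change only once" property; `downgrade(conn)` steps back one revision.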
  • 46. How you are going to feel about the next slide:
  • 47. Use an ORM with the migration tool.
  • 48. Shameful admission: We had three different ways of defining schema in our code and tests.
  • 49. A good ORM provides: ● One source of truth about the schema ● Reusable components ● Database version independence ● Ability to use raw SQL
  • 50. And good ORM stewardship: ● Fits with existing tooling and developer workflows ● Enables partnership with developers ● Integrates with a testing framework
  • 51. And: ● Gives you a new way to think about schemas ● Develops compassion for how horrible ORMs can be ● Gives you developer-friendly vocabulary for discussing why ORM-generated code is often terrible
  • 52. Part 3: Using Alembic
  • 53. Practical Guide to using Alembic http://tinyurl.com/po4mal6
  • 54. https://alembic.readthedocs.org ● revision: a single migration ● down_revision: previous migration ● upgrade: apply 'upgrade' change ● downgrade: apply 'downgrade' change ● offline mode: emit raw SQL for a change
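The vocabulary above implies a chain: each revision names its predecessor through `down_revision`, and the base revision has none. A minimal sketch of how that chain yields an upgrade order follows. The revision ids are hypothetical, and this only illustrates the idea for a single linear history; it is not Alembic's own code.

```python
# Hypothetical revision ids mapped to their down_revision (None marks the base).
DOWN_REVISION = {
    "a1f0": None,
    "b2e1": "a1f0",
    "c3d2": "b2e1",
}


def find_head(down_revision):
    """The head is the one revision that no other revision points back to."""
    parents = set(down_revision.values())
    heads = [rev for rev in down_revision if rev not in parents]
    if len(heads) != 1:
        raise ValueError("expected exactly one head, found %d" % len(heads))
    return heads[0]


def upgrade_order(down_revision):
    """Walk from head back to base, then reverse to get the upgrade order."""
    order = []
    rev = find_head(down_revision)
    while rev is not None:
        order.append(rev)
        rev = down_revision[rev]
    return list(reversed(order))
```

Because the order comes from these links rather than from file names or commit order, revisions stay decoupled from release versions.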
  • 55. Installing and using:
        virtualenv venv-alembic
        . venv-alembic/bin/activate
        pip install alembic
        alembic init <directory>
        vi alembic.ini
        alembic revision -m "new"
        alembic upgrade head
        alembic downgrade -1
  • 56. Defining a schema? vi env.py Add: import myproj.model
  • 57. Helper functions? Put your helper functions in a custom library and add this to env.py: import myproj.migrations
  • 58. Ignore certain schemas or partitions? In env.py:
        import re
        def include_symbol(tablename, schema):
            # Skip partition tables, whose names end in an underscore and eight digits
            return schema in (None, "bixie") and re.search(r'_\d{8}$', tablename) is None
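To see what the filter on this slide keeps and skips, here is a small standalone sketch. It assumes the intended pattern is an underscore followed by eight digits at the end of the table name; the table names in the usage note are hypothetical.

```python
import re

# Assumed: partition tables end in an underscore plus eight digits.
PARTITION_SUFFIX = re.compile(r"_\d{8}$")


def include_symbol(tablename, schema):
    """Keep tables in the default or "bixie" schema that are not partitions."""
    return schema in (None, "bixie") and PARTITION_SUFFIX.search(tablename) is None
```

With this definition, `include_symbol("reports", None)` is true, while `include_symbol("reports_20130617", None)` and `include_symbol("reports", "other")` are false. Newer Alembic releases offer `include_object` and `include_name` hooks for the same purpose, so check the documentation for the version you run.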
  • 59. Manage User Defined Functions? Chose to use raw SQL files. 3 directories, 128 files: procs/ types/ views/
        import os
        codepath = '/socorro/external/pg/raw_sql/procs'
        def load_stored_proc(op, filelist):
            # Resolve the procs directory relative to the current working directory
            app_path = os.getcwd() + codepath
            for filename in filelist:
                sqlfile = os.path.join(app_path, filename)
                with open(sqlfile, 'r') as stored_proc:
                    # Run the file's SQL through Alembic's op object
                    op.execute(stored_proc.read())
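The loader on this slide can be exercised without a database. The sketch below is an illustration under stated assumptions: `FakeOp` is a hypothetical stand-in for Alembic's `op` that only records what it is asked to execute, the directory is passed in as a `proc_dir` argument rather than built from the working directory, and `noop.sql` is a made-up file.

```python
import os
import tempfile


class FakeOp:
    """Hypothetical stand-in for Alembic's `op`; it only records SQL."""

    def __init__(self):
        self.executed = []

    def execute(self, sql):
        self.executed.append(sql)


def load_stored_proc(op, proc_dir, filelist):
    """Read each SQL file from proc_dir and run it through op.execute()."""
    for filename in filelist:
        sqlfile = os.path.join(proc_dir, filename)
        with open(sqlfile, "r") as stored_proc:
            op.execute(stored_proc.read())


def demo():
    """Write one throwaway SQL file and load it through the fake op."""
    op = FakeOp()
    with tempfile.TemporaryDirectory() as proc_dir:
        with open(os.path.join(proc_dir, "noop.sql"), "w") as handle:
            handle.write("SELECT 1;")
        load_stored_proc(op, proc_dir, ["noop.sql"])
    return op.executed
```

In a real revision file the `op` argument would be the object imported with `from alembic import op`, and the file list would name the UDFs that the migration changes.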
  • 60. Stamping database revision?
        from alembic.config import Config
        from alembic import command
        alembic_cfg = Config("/path/to/yourapp/alembic.ini")
        command.stamp(alembic_cfg, "head")
  • 61. Part 4: Lessons Learned
  • 62. Always roll forward. 1. Put migrations in a separate commit from schema changes. 2. Revert commits for schema change, leave migration commit in place for downgrade support.
  • 63. Store schema objects in the smallest, reasonable, composable unit. 1. Use an ORM for core schema. 2. Put types, UDFs and views in separate files. 3. Consider storing the schema in a separate repo from the application.
  • 64. Write tests. Run them every time. 1. Write a simple tool to create a new schema from scratch. 2. Write a simple tool to generate fake data. 3. Write tests for these tools. 4. When anything fails, add a test.
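The first three items on this slide might look like the following in miniature. This is a sketch, not Socorro's tooling: the one-table schema, the sample product names, and the function names are all hypothetical, and SQLite stands in for Postgres so the example runs with the standard library alone.

```python
import random
import sqlite3

# Hypothetical one-table schema; a real project would build this from its models.
SCHEMA = [
    "CREATE TABLE reports ("
    "id INTEGER PRIMARY KEY, product TEXT NOT NULL, crashed_on TEXT NOT NULL)",
]
# Hypothetical sample values for the fake data.
PRODUCTS = ["Firefox", "Thunderbird"]


def create_schema(conn):
    """Create the whole schema from scratch on an empty database."""
    for statement in SCHEMA:
        conn.execute(statement)
    conn.commit()


def fake_reports(count, seed=0):
    """Generate reproducible fake rows; the same seed gives the same data."""
    rng = random.Random(seed)
    return [
        (rng.choice(PRODUCTS), "2013-06-%02d" % rng.randint(1, 30))
        for _ in range(count)
    ]


def load_fake_data(conn, rows):
    """Insert the generated rows into the reports table."""
    conn.executemany("INSERT INTO reports (product, crashed_on) VALUES (?, ?)", rows)
    conn.commit()
```

A test for these tools can then create an in-memory database, load a handful of rows, and assert on the row count. Seeding the generator keeps failures reproducible.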
  • 65. Part 5: What Alembic could learn
  • 66. 1. Understand partitions 2. Never apply a DEFAULT to a new column 3. Help us manage UDFs better 4. INDEX CONCURRENTLY 5. Prettier syntax for multi-commit sequences
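One common reading of items 2 and 5: on PostgreSQL releases before 11, adding a column together with a DEFAULT rewrote the whole table under an exclusive lock, so the change is usually split into stages that are committed separately. The sketch below only builds the SQL strings for such a sequence. The function name and the example table and column are hypothetical, and the identifiers are assumed to be trusted, since nothing is quoted or escaped.

```python
def add_column_in_steps(table, column, sqltype, default_literal):
    """Return Postgres statements that add a column with a default in stages."""
    names = {"t": table, "c": column, "ty": sqltype, "d": default_literal}
    return [
        # 1. Add the column with no default, which avoids the table rewrite.
        "ALTER TABLE {t} ADD COLUMN {c} {ty}".format(**names),
        # 2. Set the default so that newly inserted rows pick it up.
        "ALTER TABLE {t} ALTER COLUMN {c} SET DEFAULT {d}".format(**names),
        # 3. Backfill existing rows (on a large table, in batches).
        "UPDATE {t} SET {c} = {d} WHERE {c} IS NULL".format(**names),
    ]
```

For example, `add_column_in_steps("reports", "flagged", "boolean", "false")` returns three statements, each of which would run in its own transaction. Item 4 has a related constraint: `CREATE INDEX CONCURRENTLY` cannot run inside a transaction block, so it also needs to sit outside the migration's usual transaction.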
  • 69. No tool is perfect. DBAs should drive migration tool choice. Choose a tool that your developers like. Or, don't hate.
  • 70. Other tools: ● Sqitch http://sqitch.org/ Written by PostgreSQL contributor ● Erwin http://erwin.com/ Commercial, popular with Oracle ● South http://south.aeracode.org/ Django-specific, well-supported
  • 72. Sane Schema Management with Alembic and SQLAlchemy Selena Deckelmann Mozilla @selenamarie chesnok.com