Advanced Deployment
Scotland on Rails 2009




Jonathan Weiss, 28 March 2009
Peritor GmbH
Who am I?

Jonathan Weiss

•  Consultant for Peritor GmbH in Berlin
•  Specialized in Rails, Scaling, Deployment, and Code Review
•  Webistrano - Rails deployment tool
•  FreeBSD RubyGems and Ruby on Rails maintainer




http://www.peritor.com
http://blog.innerewut.de



                                                                2
Deployment




                       Deployment


             Process            Architecture




                                               3
Deployment Process Requirements




     Automatic   Reproducible   Accountable   Notifications




                                                             4
Deployment Tools

Several tools available
 •  Capistrano
 •  Webistrano
 •  Vlad
 •  Puppet
 •  Chef


The deployment process is usually not that complicated




                                                         5
Architecture




               6
How deployment starts out …




                              7
… and how it ends




                    8
Agenda

Search
Background Processing
Scaling the database
Multiple Client Installations
Cloud Infrastructure




                                9
General Advice

Simple is better than complex




                                10
Search




         11
Search

Full text search




Can become very slow on big data sets




                                        12
Full Text Search Engine

Separate Service
 •  Creates full text index
 •  Application queries search daemon
 •  Index update through application or
    database


Possible Engines
 •  Ferret
 •  Sphinx
 •  Solr
 •  Lucene
 •  …
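
The core idea behind a separate search service can be sketched with a toy inverted index. This is only an illustration in Python, not how Ferret, Sphinx, Solr, or Lucene are implemented; the class name `TinySearchIndex` and its methods are hypothetical.

```python
import re
from collections import defaultdict


class TinySearchIndex:
    """Toy inverted index: maps each word to the ids of documents containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    @staticmethod
    def _tokenize(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, doc_id, text):
        # Index update: called by the application (or fed from the database).
        for word in self._tokenize(text):
            self._postings[word].add(doc_id)

    def search(self, query):
        # Return the ids of documents that contain every word of the query.
        words = self._tokenize(query)
        if not words:
            return set()
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return result


index = TinySearchIndex()
index.add(1, "Deploying Rails with Capistrano")
index.add(2, "Scaling Rails databases")
index.add(3, "Capistrano recipes")
print(index.search("rails capistrano"))
```

The lookup cost depends on the number of matching documents rather than on a scan of the whole table, which is why a dedicated index stays fast on big data sets.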

                                          13
Search Slave

Database replication slave
 •  Has complete dataset
 •  Moves slow search queries off the master
 •  Can use different database table engine




                                               14
Database Index

PostgreSQL Tsearch2
 •  Core since 8.3
 •  Allows creating a full-text index on multiple columns
    or arbitrary SQL expressions


MySQL MyISAM FULLTEXT index
 •  Only works with MyISAM tables (InnoDB gained FULLTEXT support in MySQL 5.6)
 •  Full text index on multiple columns




                                                           15
What to use?

Different characteristics
 •  Real-time updates and stale data
 •  Lost updates
 •  Performance
 •  Document content and format
 •  Complexity




                                       16
Background Processing




                        17
Problem

Long running tasks
 •  Resizing uploaded images
 •  Mailing
 •  Computing an expensive operation
 •  Accessing slow back-ends




When running inside request-response-cycle
 •  Blocks user
 •  Blocks Rails instance
 •  Hard to monitor and debug



                                             18
Solution

Asynchronous processing in the background




      Message/Queue                         Scheduler



                                                        19
Background Processing




                        20
Options



 Options for message bus:
 •  Database
 •  Amazon SQS
 •  DRb
 •  Memcache
 •  ActiveMQ
 •  …

 Options for background process:
 •  (Ruby) Daemon
 •  Cron job with script/runner
 •  Forked process
 •  Delayed Job / BJ / (Backgroundrb)
 •  run_later
 •  …




                                                                   21
Database/Ruby daemon example




                               22
Scaling the database




                       23
Scaling the database

One database for everything
 •  All domain data in one place
 •  The simplest solution




Problems at some point
 •  Number of read and write requests
 •  Data size




                                        24
Scaling the database

Read Slave
 •  Slave replays each SQL statement
    executed on the master
 •  Increases read performance by reading
    from the replicating slave
 •  Stale read problem
 •  Better used explicitly, but then you must decide
    which reads can tolerate stale data




         Better use
         memcached
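
The memcached recommendation usually means a cache-aside pattern: look in the cache first, fall back to the database on a miss, and delete the entry after a write. Below is a minimal sketch in which a plain dict stands in for a memcached client; `ReadThroughCache` and `load_user` are hypothetical names.

```python
import time


class ReadThroughCache:
    """Cache-aside helper; a dict stands in for a memcached client here."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}

    def fetch(self, key, load):
        entry = self._store.get(key)
        now = self._clock()
        if entry is not None and entry[1] > now:
            return entry[0]          # cache hit: the database is not touched
        value = load()               # cache miss: read from the database
        self._store[key] = (value, now + self._ttl)
        return value

    def delete(self, key):
        # Call after a write so the next read sees fresh data.
        self._store.pop(key, None)


db_reads = []


def load_user():
    db_reads.append("SELECT")        # stand-in for a real query
    return {"id": 1, "name": "Jonathan"}


cache = ReadThroughCache(ttl_seconds=60)
first = cache.fetch("user:1", load_user)
second = cache.fetch("user:1", load_user)   # served from the cache
cache.delete("user:1")
third = cache.fetch("user:1", load_user)    # reloaded after invalidation
print(len(db_reads))
```

Compared with a read slave, the staleness window is explicit (the TTL) and the application can end it early by deleting the key.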



                                           25
Scaling the database

Master-Master
 •  Increase write and read performance
 •  Each server is a slave of the other
 •  Synchronization can be tricky
 •  Limited by database size




        Better for HA than for
        write performance



                                          26
Data Partitioning

Partition on domain models
 •  Separate users and products
 •  Makes sense if JOINs are rare
 •  Scales reads/writes
 •  Reduces data size per database
 •  Depends on separate domains




        Simple and
        effective
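
Once users and products live in separate databases, a SQL JOIN across them is no longer possible and the application has to combine the results itself. Here is a small sketch of that trade-off, using two in-memory SQLite databases as stand-ins; the table and function names are hypothetical.

```python
import sqlite3

# Two independent databases, one per domain.
users_db = sqlite3.connect(":memory:")
users_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
users_db.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

products_db = sqlite3.connect(":memory:")
products_db.execute("CREATE TABLE purchases (user_id INTEGER, product TEXT)")
products_db.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [(1, "book"), (2, "book"), (2, "lamp")],
)


def buyers_of(product):
    # Step 1: ask the products database for the matching user ids.
    rows = products_db.execute(
        "SELECT DISTINCT user_id FROM purchases WHERE product = ? ORDER BY user_id",
        (product,),
    ).fetchall()
    user_ids = [row[0] for row in rows]
    if not user_ids:
        return []
    # Step 2: resolve the ids in the users database (this replaces the JOIN).
    placeholders = ",".join("?" for _ in user_ids)
    names = users_db.execute(
        f"SELECT name FROM users WHERE id IN ({placeholders}) ORDER BY id",
        user_ids,
    ).fetchall()
    return [name for (name,) in names]


print(buyers_of("book"))
```

Each cross-domain lookup costs two round trips, which is why the approach works best when such JOINs are rare.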



                                     27
Data Partitioning

Sharding
 •  Split data into shards
     •  All tables
     •  Only big ones like users
 •  Partition by id, hash function or lookup
 •  Complex and makes JOINs complicated
 •  Scales reads/writes
 •  Reduces data size per database




        Last resort
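
The three partitioning strategies named above (id, hash function, lookup) can be sketched as follows. The shard names and the `LookupDirectory` class are hypothetical; a real setup would map each name to a database connection.

```python
import zlib

SHARDS = ["users_0", "users_1", "users_2", "users_3"]  # hypothetical database names


def shard_by_id(user_id):
    # Modulo on a numeric id: trivial, but changing the shard count moves most rows.
    return SHARDS[user_id % len(SHARDS)]


def shard_by_hash(key):
    # Stable hash of an arbitrary key. crc32 is deterministic across processes,
    # unlike Python's built-in hash() for strings.
    return SHARDS[zlib.crc32(key.encode("utf-8")) % len(SHARDS)]


class LookupDirectory:
    """Explicit key -> shard table: flexible, but one more thing to store and query."""

    def __init__(self, default):
        self._default = default
        self._table = {}

    def assign(self, key, shard):
        self._table[key] = shard

    def shard_for(self, key):
        return self._table.get(key, self._default)


directory = LookupDirectory(default="users_0")
directory.assign("bigcorp", "users_3")
print(shard_by_id(5), shard_by_hash("alice"), directory.shard_for("bigcorp"))
```

Whichever function is used, every query must first pass through it, and anything that spans shards has to be merged in the application.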




                                               29
Alternatives

Data size is often the bigger problem




             Archiving                  Reduce data size



                                                           30
Archiving

Get rid of (historical) data
 •  Delete old data
 •  Aggregate old data
 •  Partition old data


Have an archiving policy from the start
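
An archiving job that aggregates old rows and then deletes them might look roughly like this. The sketch assumes a hypothetical `page_views` table with one row per view and a `daily_views` summary table, and uses SQLite so it runs standalone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (day TEXT, url TEXT)")
conn.execute("CREATE TABLE daily_views (day TEXT, url TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [
        ("2009-01-10", "/"),
        ("2009-01-10", "/"),
        ("2009-01-10", "/about"),
        ("2009-03-27", "/"),
    ],
)


def archive_before(conn, cutoff_day):
    # Aggregate and delete in one transaction so rows are neither lost
    # nor counted twice if the job fails halfway.
    with conn:
        conn.execute(
            "INSERT INTO daily_views (day, url, views) "
            "SELECT day, url, COUNT(*) FROM page_views "
            "WHERE day < ? GROUP BY day, url",
            (cutoff_day,),
        )
        deleted = conn.execute(
            "DELETE FROM page_views WHERE day < ?", (cutoff_day,)
        ).rowcount
    return deleted


removed = archive_before(conn, "2009-03-01")
print(removed)
```

Dates are stored as ISO strings here so that a plain string comparison orders them correctly.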




                                          31
Reduce data size

Avoid exponential data growth
 •  Do not store data in database, move to
    •  File system
    •  S3
    •  SimpleDB
 •  Do not normalize data
    •  Duplicate data in order to remove JOINs (and JOIN tables)
 •  Combine indices




                                                                   32
Multiple clients




                   33
Multiple Clients

NOT the same as multiple users


A client is more like a separate domain, e.g. an expansion to another country
 •  Different settings
 •  Different themes
 •  Different features enabled
 •  Different language
 •  Different audience



How to combine in one app?


                                                                            34
Multiple Clients

Questions to ask
 •  How many different clients?
 •  Is there shared state (users, settings, posts, …)?
 •  What is the expected data size and growth of each client?




                                                                35
Multiple Clients

The easy way to maintenance hell
 •  Fork the code
 •  One branch per client
 •  One install per client




                                   36
Multiple Clients

Same code – same database
 •  Move different behavior into configuration
 •  Move configuration into database
 •  Scope data by DB-column
 •  Scope all data requests in the code
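
Scoping by a database column works best when application code cannot forget the filter. One way to get there, sketched under the assumption of a `client_id` column on every shared table, is to route all access through a small wrapper; `ClientScope` is a hypothetical name.

```python
import sqlite3


class ClientScope:
    """Wraps a connection so every query is tied to one client_id."""

    def __init__(self, conn, client_id):
        self._conn = conn
        self._client_id = client_id

    def create_post(self, title):
        self._conn.execute(
            "INSERT INTO posts (client_id, title) VALUES (?, ?)",
            (self._client_id, title),
        )

    def post_titles(self):
        rows = self._conn.execute(
            "SELECT title FROM posts WHERE client_id = ? ORDER BY id",
            (self._client_id,),
        )
        return [title for (title,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, client_id TEXT NOT NULL, title TEXT)"
)
uk = ClientScope(conn, "uk")
de = ClientScope(conn, "de")
uk.create_post("Hello Edinburgh")
de.create_post("Hallo Berlin")
uk.create_post("Second post")
print(uk.post_titles())
```

All clients share one table, so a single unscoped query can leak data across clients. That risk is the main cost of this option.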




                                                37
Multiple Clients

Same code – partition the data
 •  Move different behavior into configuration
 •  Partition data by database




Hardcode database while booting

                                                38
Multiple Clients

Same code – partition the data
 •  Move different behavior into configuration
 •  Partition data by database




  Choose database dynamically
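
Choosing the database dynamically typically means deriving the client from the request, for example from its host name. A minimal sketch follows; the host names, database names, and the `database_for` function are all made up for illustration.

```python
# Hypothetical mapping from request host to client database settings.
CLIENT_DATABASES = {
    "shop.example.co.uk": {"database": "shop_uk", "host": "db-uk.internal"},
    "shop.example.de": {"database": "shop_de", "host": "db-de.internal"},
}


class UnknownClient(LookupError):
    pass


def database_for(host_header):
    # Normalize "Shop.Example.DE:443" -> "shop.example.de" before the lookup.
    hostname = host_header.split(":", 1)[0].strip().lower()
    try:
        return CLIENT_DATABASES[hostname]
    except KeyError:
        # Fail loudly: silently falling back to a default database
        # could show one client's data to another.
        raise UnknownClient(hostname) from None


print(database_for("Shop.Example.DE:443")["database"])
```

Hardcoding the database at boot needs no such lookup, but it requires one set of application processes per client.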

                                                39
Multiple Clients

Generate local databases
 •  Import global content into master DB
 •  Push shared content in the correct
    format to app DBs
 •  Build reverse channel if needed




                                           40
Cloud Infrastructure




                       41
Cloud Infrastructure

Servers come and go
 •  You do not know your servers before deploying
 •  Restarting is the same as introducing a new machine




You can't hardcode IPs in database.yml




                                                          42
Solution #1

Query and manually adjust
 •  Servers do not change that often
 •  New nodes probably need manual intervention
 •  Use AWS Elastic IPs to ease the pain




Set servers dynamically                      AWS Elastic IP




                                                              43
Solution #2

Use a central directory service
 •  A central place to manage your running instances
 •  Instances query the directory and react




                                                       44
Central Directory

Different Implementations
 •  File on S3
 •  SimpleDB
 •  A complete service,
    capable of monitoring and controlling your instances
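
In its simplest form the directory is a small document that each instance fetches and turns into local configuration. The sketch below uses an inline JSON string as a stand-in for a file on S3; the document layout, the instance ids, and the function names are assumptions for illustration only.

```python
import json

# Stand-in for a document fetched from the directory (for example a file on S3).
DIRECTORY_JSON = """
{
  "instances": [
    {"id": "i-001", "role": "db_master", "address": "10.0.0.5", "state": "running"},
    {"id": "i-002", "role": "app", "address": "10.0.0.6", "state": "running"},
    {"id": "i-003", "role": "app", "address": "10.0.0.7", "state": "stopped"}
  ]
}
"""


def running_addresses(directory, role):
    # Only instances that are up and have the requested role.
    return [
        inst["address"]
        for inst in directory["instances"]
        if inst["role"] == role and inst["state"] == "running"
    ]


def database_settings(directory, database_name):
    # Build the connection settings an instance would otherwise hardcode.
    masters = running_addresses(directory, "db_master")
    if len(masters) != 1:
        raise RuntimeError(f"expected exactly one db_master, found {len(masters)}")
    return {"database": database_name, "host": masters[0]}


directory = json.loads(DIRECTORY_JSON)
settings = database_settings(directory, "app_production")
print(settings)
```

An instance would run this at boot and again whenever the directory changes, then rewrite its configuration and reconnect.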




                                                           46
Summary

Simple is better than complex


Carefully evaluate the different solutions


Only introduce a new component if you really need to


Everything comes with strings attached


Solving the data size problem often solves others too


                                                        47
Questions?




             48
Peritor GmbH

Teutonenstraße 16
14129 Berlin
Phone: +49 (0)30 69 20 09 84 0
Fax: +49 (0)30 69 20 09 84 9

Internet: www.peritor.com
E-Mail: kontakt@peritor.com




                                          49
Peritor GmbH - All rights reserved
