Dave Page
This talk will give an insight into the infrastructure behind the PostgreSQL project - what servers do we have and what do they all do? How do we monitor and manage them? How does the website cope with the Slashdot effect on release days? We will also look at how things are likely to change as the Sysadmin and Web teams work to modernize and improve our infrastructure for the future.
1. Inside the PostgreSQL Project Infrastructure
Dave Page
PostgreSQL Core Team
Senior Software Architect, EnterpriseDB
Dave Page, 25th March 2010 Inside the PostgreSQL Project Infrastructure Slide: 1
2. In the beginning...
Date: Tue, 23 Apr 1996 16:06:10 -0400 (EDT)
From: "Marc G. Fournier" <scrappy@ki.net>
Subject: Re: [PG95]: postgres95 TODO list posted on the web
To: Chad Robinson <chadr@brttech.com>
cc: Jolly Chen <jolly@postgres.berkeley.edu>, postgres95@shiloh.vnet.net
…
...
If it helps, I’d be willing to setup a cvs database, including appropriate
accounts for a core few developers that patches can go through.
From there, it wouldn’t be too hard to do a weekly "distribution" that is
ftpable.
I don’t know enough about the server backend to offer much more then
that :(
3. The first server
Hosted in Toronto, Canada
4. Early services
• Mailing lists
• CVS repository
• FTP site
5. 14 Years later...
• 20+ Physical servers
• 35+ Virtual Machines
• Hosted in:
– France
– Panama City
– Austria
– Canada
– USA (4 independent locations)
6. Current services
• FTP site
• Website
• Source control – CVS and Git
• Mailing lists
• Wiki
• Mailing list archives
• Website/archives search
• pgFoundry
• Commitfest management server
• Buildfarm and Hudson servers
• Development servers… and more!
7. OS “zoo”
• Primarily using FreeBSD jails:
– Easy to back up
– Easy to relocate
– Per-function jails
• Also running:
– Ubuntu
– Slackware
– CentOS
– Windows
8. Problems
• Little FreeBSD experience in the community.
• FreeBSD Ports are hard to upgrade, especially with lots of jails.
• Hosting companies don't tend to like FreeBSD – and we can't be too picky!
• No centralised management or deployment.
9. New infrastructure
• Runs Debian virtual machines under KVM on Debian hosts.
• Pre-built packages set up base hosts and VMs.
• Management system automates:
– VM creation and configuration.
– Addition and removal of user accounts and SSH keys.
– Package installation and upgrades.
– Detection of unexpected user accounts or unauthorised services.
– Setup and configuration of Nagios and Munin monitoring.
– Auto-backup configuration
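
As an illustration of the "unexpected user accounts" check, here is a minimal sketch. It assumes the management system holds an expected account list per host and compares it with passwd-format data; the function names, the sample data, and the account "mallory" are hypothetical and are not taken from the project's tooling.

```python
def parse_passwd(text):
    """Return the set of account names found in passwd-format text."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The account name is the first colon-separated field.
        names.add(line.split(":", 1)[0])
    return names


def unexpected_accounts(passwd_text, expected):
    """Accounts present on the host but absent from the expected list."""
    return sorted(parse_passwd(passwd_text) - set(expected))


# Hypothetical sample data standing in for a host's /etc/passwd.
SAMPLE = """root:x:0:0:root:/root:/bin/bash
postgres:x:106:113::/var/lib/postgresql:/bin/bash
mallory:x:1001:1001::/home/mallory:/bin/sh
"""

print(unexpected_accounts(SAMPLE, {"root", "postgres"}))
```

A check of this shape could run on a schedule and raise an alert whenever the returned list is non-empty.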
10. Monitoring
• Nagios
• Munin
• Smokeping
• Auto-backup
• Google Analytics
11. Nagios
• 64 Hosts
• 514 Services
• Service checks include:
– Service availability – NTP, SSH, HTTP, FTP, RSYNC etc.
– Utilisation – disk usage, logged in users, processes, mail queue
– Management – software update availability
– “Our stuff” – buildfarm status, search indexer, database backups
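
A utilisation check such as disk usage can be sketched as a Nagios-style plugin. Nagios plugins report status through their exit code (0 = OK, 1 = WARNING, 2 = CRITICAL) plus a single line of output. The thresholds and function names below are assumptions chosen for illustration, not the project's actual checks.

```python
import shutil

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def classify(percent_used, warn=80.0, crit=90.0):
    """Map a disk usage percentage to a (status code, message) pair."""
    if percent_used >= crit:
        code = CRITICAL
    elif percent_used >= warn:
        code = WARNING
    else:
        code = OK
    return code, f"DISK {LABELS[code]} - {percent_used:.1f}% used"


def check_disk(path="/"):
    """Measure usage of the filesystem holding `path` and classify it."""
    usage = shutil.disk_usage(path)
    return classify(100.0 * usage.used / usage.total)


# A real plugin would print the message and pass the code to sys.exit().
code, message = check_disk(".")
print(message)
```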
12. Nagios
13. Munin
• Monitors resource trends:
– Disk usage
– Network utilisation
– Processes
– Sendmail/Postfix stats
– CPU/Memory utilisation
– Apache stats
14. Munin
• CPU usage – postgresql01.managed.contegix.com
15. Smokeping
• Monitors network latency to various hosts from the Conova Communications data centre in Salzburg, Austria
16. Auto-backup
• Automatically backs up changes to key configuration
files to Subversion.
– Gives us a simple backup of config files
– Allows us to trace the history of changes to a file
• Alerts the sysadmins to changes to monitored files
– Helps us see what the other team members are doing
– Acts as a simple Intrusion Detection System
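
The change-detection half of this idea can be sketched as follows. This is an assumption-laden illustration: it compares SHA-256 digests of monitored files against the digests recorded on the previous run, and it leaves out the Subversion commit and the alert email. The function names and sample file contents are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def changed_files(previous, current):
    """List files that are new or whose contents differ from the last run.

    previous: mapping of file name -> digest recorded on the last run
    current:  mapping of file name -> current file contents (bytes)
    """
    return sorted(
        name for name, data in current.items()
        if previous.get(name) != digest(data)
    )


# Hypothetical contents standing in for monitored configuration files.
last_run = {"/etc/ssh/sshd_config": digest(b"PermitRootLogin no\n")}
now = {
    "/etc/ssh/sshd_config": b"PermitRootLogin yes\n",
    "/etc/ntp.conf": b"server pool.ntp.org\n",
}
print(changed_files(last_run, now))
```

Each name returned would be committed to the repository and reported to the sysadmins, which is what lets the same mechanism double as a simple intrusion detector.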
17. Google Analytics
• Monitors website utilisation.
• Helps us understand how the website is used.
• Can be hampered by disabled scripting support in browsers, common with computer geeks!
18. Google Analytics
19. FTP Site
• Primary site: ftp.postgresql.org
• 62 regional mirrors in 39 countries
• Mirrors may also serve content via:
– HTTP (supported)
– RSYNC (unsupported)
• Content includes main FTP site, and pgFoundry downloads
20. FTP Mirror monitoring
• All mirrors are checked daily by the 'mirrorbot'
• The mirrorbot checks that content is up to date:
– Fresh mirrors have a DNS hostname, e.g. ftp.uk.postgresql.org
– Fresh mirrors are listed on the website for users to choose
• Out of date or broken mirrors:
– Are automatically removed from the website and DNS.
– Are reported to their maintainers via email.
– Are automatically purged from the system if un-fixed.
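
The freshness decision might look something like the sketch below. The slides do not say how the mirrorbot measures freshness, so this assumes each mirror exposes a last-sync timestamp that is compared with the primary site's; the 48-hour threshold, the function name, and the ftp.example.* hostnames are hypothetical.

```python
from datetime import datetime, timedelta


def classify_mirrors(master_sync, mirror_syncs, max_lag=timedelta(hours=48)):
    """Split mirrors into (fresh, stale) lists of hostnames.

    master_sync:  timestamp of the latest content on the primary site
    mirror_syncs: mapping of hostname -> last sync timestamp,
                  or None if the mirror could not be reached
    """
    fresh, stale = [], []
    for host, last_sync in sorted(mirror_syncs.items()):
        if last_sync is not None and master_sync - last_sync <= max_lag:
            fresh.append(host)
        else:
            stale.append(host)
    return fresh, stale


master = datetime(2010, 3, 25, 12, 0)
mirrors = {
    "ftp.uk.postgresql.org": datetime(2010, 3, 25, 6, 0),  # 6 hours behind
    "ftp.example.org": datetime(2010, 3, 20, 6, 0),        # days behind
    "ftp.example.net": None,                               # unreachable
}
fresh, stale = classify_mirrors(master, mirrors)
print("keep in DNS and on the website:", fresh)
print("remove and notify maintainers:", stale)
```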
21. Website infrastructure
• Developed following the great 8.0 Slashdotting incident.
• Capable of handling high-load scenarios on release days.
• Minimised points of failure.
22. wwwmaster.postgresql.org
• Dynamic, master server.
• Runs custom-built PHP framework for:
– Static page rendering (general content)
– Dynamic page rendering (docs, news, events etc)
– Form processing
• Dynamic content stored in PostgreSQL.
• Static version of content generated hourly by a spider,
and pushed via RSYNC to the static servers.
• Users redirected back to static servers where possible.
23. www.postgresql.org
• Static, slave servers
• Serve HTML, CSS, images and files such as PDFs.
• Currently 2 servers.
• Geographically diverse.
• Round-robin load balanced via DNS.
• Monitoring system dynamically removes servers from DNS within minutes of a failure.
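
The DNS failover step can be modelled roughly as below. This is a sketch under assumptions: the health check is passed in as a function, the addresses are documentation-range placeholders, and the fallback of publishing every server when all checks fail is an assumed safety net that the slides do not describe.

```python
def records_to_publish(servers, is_healthy):
    """Return the addresses that should stay in the round-robin record set."""
    healthy = [address for address in servers if is_healthy(address)]
    # Assumed safety net: if every check fails, the monitor itself may be
    # at fault, so keep the full set rather than publish an empty one.
    return healthy if healthy else list(servers)


# Placeholder addresses from the 192.0.2.0/24 documentation range.
servers = ["192.0.2.10", "192.0.2.20"]
down = {"192.0.2.20"}
print(records_to_publish(servers, lambda address: address not in down))
```

Failover of this kind is only as fast as the record's TTL allows, so removal "within minutes" implies a short TTL on the www records.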
24. Website problems
• PHP framework is complex, and understood by few.
• Adding new dynamic content can require significant
effort to build administration pages.
• The framework includes lots of features and code we
thought we needed, but then never used.
• Spider can take hours to process the entire site.
• Spidering the site is very inefficient.
25. New website infrastructure
• Slimmed down and vastly simplified framework, built using Django and Python.
• Django's administration module makes it easy to add and manage content.
• Spider and static slaves will be replaced with Varnish cache servers:
– Pages dynamically cached from wwwmaster on first request.
– Last available content served if wwwmaster goes down.
– Cache invalidation of individual or groups of pages as changed on wwwmaster.
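
The three cache behaviours listed above can be modelled in a few lines of Python. This is a toy model of the behaviour only, not Varnish configuration (Varnish is configured in VCL); the class name, the prefix-based invalidation, and the fake fetch function are assumptions made for illustration.

```python
class PageCache:
    """Toy model: cache on first request, serve stale copies on failure."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable: path -> page body, may raise OSError
        self._pages = {}      # path -> last body fetched from the master
        self._stale = set()   # paths invalidated since their last fetch

    def get(self, path):
        if path in self._pages and path not in self._stale:
            return self._pages[path]          # cache hit
        try:
            body = self._fetch(path)          # first request or refresh
        except OSError:
            if path in self._pages:
                return self._pages[path]      # master down: last known copy
            raise
        self._pages[path] = body
        self._stale.discard(path)
        return body

    def invalidate(self, prefix):
        """Mark one page, or a group sharing a path prefix, for refresh."""
        self._stale.update(p for p in self._pages if p.startswith(prefix))


# Hypothetical stand-in for wwwmaster.
master = {"up": True, "body": "version 1"}

def fetch_from_master(path):
    if not master["up"]:
        raise ConnectionError("wwwmaster unreachable")
    return master["body"]

cache = PageCache(fetch_from_master)
print(cache.get("/about/"))
```

Invalidated pages are kept rather than deleted, which is what allows the last available content to be served while the master is unreachable.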
26. Questions?
Thank you.