My talk about the current state of the rubygems infrastructure: problems and possible solutions.
The intention behind this talk is to make people care about the problems and join forces to fix them. It's not about blaming anyone who spends her/his time doing open source work!
2. http://moriz.de/ Rubygems behind the gems.
Hello blaaa bla Moriz GmbH bla bla Software Development
Services bla bla bla bla Consulting bla blaaaa bla blaaaaaa
bla Infrastructure Services bla bla Roland bla bla bla bla
professional software development since 1999 bla bla
Amazon Marketplace Deutschland bla bla bla Tiscali
Games bla bla FIFA WM 2006 bla Yahoo.de bla bla bla bla
two billion pageviews bla bla blala Allianz24.de/
Allsecur.de bla bla bla Ruby User Group München bla bla
blabla http://moriz.de/ bla blaaaba http://rails.io bla
http://boot.io blablabla recently hetzner-api gem bla bla
bla and the slides will be available @ http://moriz.de/talks/
rubygems.
;-)
4. RUBYGEMS MOVING PARTS
rubygems / cli:
• $ gem
• require "rubygems"
• creation, download, setup, usage (index building, server)
gemcutter:
• http://rubygems.org/ (and extensions to the rubygems client)
• distribution
5. RUBYGEMS FACTS
• used by nearly every ruby project
• the core of the ruby ecosystem
• part of the standard library (since MRI 1.9.x)
• 17,000+ gem projects
• 81,000+ gem files
• 23 GB+
6. RUBYGEMS FACTS
started at RubyConf 2003 by:
• Rich Kilmer
• Chad Fowler
• David Black
• Paul Brannan
• Jim Weirich
> http://rubyforge.org/projects/rubygems/
7. GEM FACTS
• described by a .gemspec
• built with: gem build my.gemspec
easier ways:
• bundler, jeweler, newgem(?), ...
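A minimal sketch of what such a .gemspec can look like. The gem name, version, author and file layout are placeholders made up for illustration; only the Gem::Specification API itself comes from rubygems.

```ruby
# my.gemspec: hypothetical example gem, all values are placeholders
require "rubygems"

spec = Gem::Specification.new do |s|
  s.name    = "my"
  s.version = "0.1.0"
  s.summary = "Example gem used to illustrate a gemspec"
  s.authors = ["Jane Doe"]
  s.files   = Dir["lib/**/*.rb"]     # files to package (empty if lib/ does not exist)
  s.add_dependency "rake", ">= 0.8"  # runtime dependency with a version requirement
end

spec # a gemspec file has to evaluate to the specification object
```

Running `gem build my.gemspec` against a file like this should produce my-0.1.0.gem.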
10. GEMCUTTER FACTS
• started in April 2009
• is now rubygems.org (rubygems 1.3.6+)
• replaced rubyforge
• manages uploads & downloads
• rails app using PostgreSQL + rack middleware with sinatra
• by Nick Quaranto (@qrush) of Thoughtbot
> http://github.com/rubygems/gemcutter
11. BIG PICTURE: UPLOAD RELEASE
$ gem release hetzner-api.gemspec
Successfully built RubyGem
Name: hetzner-api
Version: 1.0.0
File: hetzner-api-1.0.0.gem
Pushing gem to RubyGems.org...
gem release
17. SPECS AKA „THE INDEX“
$ sudo gem install rails -V
GET http://gems.rubyforge.org/latest_specs.4.8.gz
302 Found
GET http://production.s3.rubygems.org/latest_specs.4.8.gz
200 OK
GET http://gems.rubyforge.org/quick/Marshal.4.8/rails-3.0.1.gemspec.rz
302 Found
GET http://production.s3.rubygems.org/quick/Marshal.4.8/rails-3.0.1.gemspec.rz
200 OK
GET http://gems.rubyforge.org/specs.4.8.gz
302 Found
GET http://production.s3.rubygems.org/specs.4.8.gz
200 OK
GET http://gems.rubyforge.org/quick/Marshal.4.8/activesupport-3.0.1.gemspec.rz
302 Found
GET http://production.s3.rubygems.org/quick/Marshal.4.8/activesupport-3.0.1.gemspec.rz
...
18. SPECS AKA „THE INDEX“
no specific version requested => the client starts with the latest specs:
GET http://gems.rubyforge.org/latest_specs.4.8.gz
19. SPECS AKA „THE INDEX“
The „4.8“ in the file names (latest_specs.4.8.gz, quick/Marshal.4.8/) is the Marshal format version:
Gem.marshal_version
=> "#{Marshal::MAJOR_VERSION}.#{Marshal::MINOR_VERSION}"
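A small sketch of how the index file names seen in the trace can be derived from Gem.marshal_version. The helper quick_spec_path is a hypothetical name introduced here for illustration; it is not part of rubygems.

```ruby
require "rubygems"

version = Gem.marshal_version # "4.8" on current Ruby versions

# Names of the gzipped index files served by the gem server
index_files = %w[specs latest_specs prerelease_specs].map do |name|
  "#{name}.#{version}.gz"
end

# Hypothetical helper: path of the compressed, marshalled spec of one gem release
def quick_spec_path(name, gem_version, marshal_version = Gem.marshal_version)
  "quick/Marshal.#{marshal_version}/#{name}-#{gem_version}.gemspec.rz"
end

puts index_files
puts quick_spec_path("rails", "3.0.1")
```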
20. SPECS AKA „THE INDEX“
irb(main):001:0> x = {}
=> {}
irb(main):002:0> x['farbe'] = 'ananasblau'
=> "ananasblau"
irb(main):003:0> Marshal.dump x
=> "\004\b{\006\"\nfarbe\"\017ananasblau"
etc.
> http://ruby-doc.org/core/classes/Marshal.html
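The irb session can be turned into a short round-trip sketch. It shows that the first two bytes of a dump are the Marshal major and minor version, which is where the „4.8“ comes from. On Ruby 1.9 and later the dump also carries string encoding information, so the raw bytes differ from the 1.8 output on the slide.

```ruby
data = { "farbe" => "ananasblau" }

dumped = Marshal.dump(data)

# The first two bytes of every dump are the format's major and minor version
major, minor = dumped.bytes.first(2)
puts "format version: #{major}.#{minor}"

# Loading the dump gives back an equal object
restored = Marshal.load(dumped)
puts restored == data
```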
22. SPECS AKA „THE INDEX“
latest_specs:
lists the latest release number of all gems (~150 KB gzipped / 570 KB unpacked)
specs:
lists all gem releases (380 KB gzipped / 2.2 MB unpacked)
latest_specs = Marshal.load(File.binread('latest_specs.4.8'))
latest_specs.size
=> 17501
specs = Marshal.load(File.binread('specs.4.8')); specs.size
=> 83490
(there‘s also a pre-release spec list (remember „gem install rails --pre“) and others: see rubygems source lib/rubygems/commands/generate_index_command.rb)
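To try this without downloading the real files, here is a sketch that builds a tiny index in memory, gzips it and reads it back. It assumes the index entries are [name, version, platform] triples, as they appear to be in the rubygems source; the entries themselves are sample data, and Zlib.gzip / Zlib.gunzip need Ruby 2.4 or later.

```ruby
require "rubygems"
require "zlib"

# Sample entries in the assumed index layout: [name, version, platform]
entries = [
  ["rails",       Gem::Version.new("3.0.0"), "ruby"],
  ["rails",       Gem::Version.new("3.0.1"), "ruby"],
  ["hetzner-api", Gem::Version.new("1.0.0"), "ruby"],
]

# What the server would publish as specs.4.8.gz
packed = Zlib.gzip(Marshal.dump(entries))

# What the client does after the download
specs = Marshal.load(Zlib.gunzip(packed))

# latest_specs keeps only the highest version per gem and platform
latest_specs = specs
  .group_by { |name, _version, platform| [name, platform] }
  .map { |_key, releases| releases.max_by { |_name, version, _platform| version } }

puts specs.size
puts latest_specs.size
```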
23. SPECS AKA „THE INDEX“
Next request in the trace: quick/Marshal.4.8/rails-3.0.1.gemspec.rz
=> load + parse the spec to get its dependencies
24. SPECS AKA „THE INDEX“
Gem::Specification.new do |s|
s.authors = ["David Heinemeier Hansson"]
s.date = Time.utc(2010, 10, 14)
s.dependencies = [Gem::Dependency.new("activesupport",
Gem::Requirement.new(["= 3.0.1"]),
:runtime),
Gem::Dependency.new("actionpack",
Gem::Requirement.new(["= 3.0.1"]),
:runtime),
Gem::Dependency.new("activerecord",
Gem::Requirement.new(["= 3.0.1"]),
:runtime),
Gem::Dependency.new("activeresource",
Gem::Requirement.new(["= 3.0.1"]),
:runtime),
Gem::Dependency.new("actionmailer",
Gem::Requirement.new(["= 3.0.1"]),
:runtime),
Gem::Dependency.new("railties",
Gem::Requirement.new(["= 3.0.1"]),
:runtime),
Gem::Dependency.new("bundler",
Gem::Requirement.new(["~> 1.0.0"]),
:runtime)]
s.description = "Ruby on Rails is a full-stack web framework optimized
# the spec above is the result of:
Marshal.load(Gem.inflate(File.read('rails-3.0.1.gemspec.rz')))
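A sketch of the same round trip without a downloaded file: a made-up specification is marshalled and deflated the way a .gemspec.rz appears to be produced, then inflated and loaded again. The gem "example" is hypothetical, and the sketch calls Zlib directly instead of Gem.inflate, on the assumption that both use the same zlib deflate format.

```ruby
require "rubygems"
require "zlib"

# Hypothetical gem, only used to have something to serialize
spec = Gem::Specification.new do |s|
  s.name    = "example"
  s.version = "1.0.0"
  s.summary = "Hypothetical gem for the round-trip sketch"
  s.authors = ["Jane Doe"]
  s.add_dependency "activesupport", "= 3.0.1"
end

# Server side: marshal + deflate, the content of example-1.0.0.gemspec.rz
rz = Zlib::Deflate.deflate(Marshal.dump(spec))

# Client side: inflate + load, then read the dependencies
restored = Marshal.load(Zlib::Inflate.inflate(rz))
restored.dependencies.each do |dep|
  puts "#{dep.name} #{dep.requirement}"
end
```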
25. SPECS AKA „THE INDEX“
The dependencies carry explicit version requirements (e.g. activesupport = 3.0.1)
=> the full spec list is required: GET http://gems.rubyforge.org/specs.4.8.gz
26. SPECS AKA „THE INDEX“
The spec download is repeated for each dependency (activesupport-3.0.1.gemspec.rz, ...),
then the .gem files are downloaded and installed.
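The walk over the dependencies can be sketched as a small recursive function. This is a deliberately simplified illustration, not the resolver rubygems actually uses: INDEX is a hypothetical in-memory stand-in for the spec files, the bundler versions are sample data, and conflicting requirements are ignored (the first requirement seen for a gem wins).

```ruby
require "rubygems"

# Hypothetical index: gem name => { version => [[dependency, requirement], ...] }
INDEX = {
  "rails" => {
    "3.0.0" => [["activesupport", "= 3.0.0"]],
    "3.0.1" => [["activesupport", "= 3.0.1"], ["bundler", "~> 1.0.0"]],
  },
  "activesupport" => { "3.0.0" => [], "3.0.1" => [] },
  "bundler"       => { "1.0.0" => [], "1.0.3" => [], "1.1.0" => [] },
}

# Picks the highest version matching the requirement, then recurses into
# that version's dependencies. Returns a hash of gem name => chosen version.
def resolve(name, requirement = ">= 0", resolved = {})
  return resolved if resolved.key?(name)

  req      = Gem::Requirement.new(requirement)
  versions = INDEX.fetch(name).keys.map { |v| Gem::Version.new(v) }
  best     = versions.select { |v| req.satisfied_by?(v) }.max
  raise ArgumentError, "no version of #{name} satisfies #{requirement}" unless best

  resolved[name] = best
  INDEX[name][best.to_s].each do |dep_name, dep_req|
    resolve(dep_name, dep_req, resolved)
  end
  resolved
end

plan = resolve("rails")
plan.each { |gem_name, version| puts "#{gem_name}-#{version}.gem" }
```

With no version given, the newest rails is chosen first; its pinned requirements then force a lookup across all releases of each dependency, which is why the full spec list is needed.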
28. PROBLEMS: WHAT IF?
cli
rubygems.org
AWS S3
Temporary Outage:
no new gem releases
no gem downloads (index missing)
!
new app deployments?
new server deployments?
29. PROBLEMS: WHAT IF?
cli
rubygems.org
AWS S3
Fatal Outage, reasons:
• Hardware
• Software (attack, fs corruption)
• Amazon
• account „deactivation“
• account deletion
• S3 data loss
• S3 bucket account theft/crack
• Sunny day kills all the clouds.
• Jeff Bezos‘ new bicy^Segw^Rocket.
32. PROBLEMS: MIRRORING
Infrastructure independence to save your
business from a rubygems disaster:
> Start your own mirror
Fallback for a rubygems.org disaster?
> Use a public mirror
> Start your own mirror
33. PROBLEMS: PUBLIC MIRRORS
Comprehensive Perl Archive Network (as of 2010-10-25):
• online since 1995-10-26
• 7770 MB
• 228 mirrors
• 8463 authors
• 18582 modules
228 independent public and free mirrors!
35. PROBLEMS: PUBLIC MIRRORS
„The Python Package Index is a
repository of software for the Python
programming language.
There are currently 11801 packages here“
37. PROBLEMS: PUBLIC MIRRORS
Rubygems: 0 active, public, free mirrors.
They got lost in the migration (rubyforge > gemcutter).
38. PROBLEMS: MIRRORING
Mirroring in rubygems is currently broken:
• „gem mirror“ misses some gems and downloads slowly: one gem at a time.
• index building is broken (see #362)
• reliability (#362, too)
http://help.rubygems.org/discussions/problems/362-cant-mirror-rubygems-repo-incorrect-header-check
Gemcutter already lost gems:
http://help.rubygems.org/discussions/problems/212-some-gems-and-specs-missing-that-are-in-the-index
39. PROBLEMS: MIRRORING
There is also no easy way to mirror an S3 bucket:
• no ftp
• no rsync
• no file-list to use with e.g. wget
= you cannot even run a reliable private mirror :-(
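For illustration, a sketch of how a candidate file list could be derived from the spec index instead. It assumes index entries are [name, version, platform] triples and that gem files are served under /gems/ on rubygems.org; the second entry is made up. Since the index and the bucket can disagree (see #212), such a list is a guess at the bucket content, not a reliable listing.

```ruby
require "rubygems"

BASE_URL = "http://rubygems.org/gems/"

# File name of a gem release; the platform is omitted for pure-ruby gems
def gem_file_name(name, version, platform)
  base = "#{name}-#{version}"
  platform.to_s == "ruby" ? "#{base}.gem" : "#{base}-#{platform}.gem"
end

# Sample index entries; "hypothetical-ext" does not exist
specs = [
  ["rails",            Gem::Version.new("3.0.1"), "ruby"],
  ["hypothetical-ext", Gem::Version.new("1.0.0"), "java"],
]

urls = specs.map do |name, version, platform|
  BASE_URL + gem_file_name(name, version, platform)
end

puts urls # one URL per line, usable as input for e.g. wget -i
```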
40. SOLUTION
Provide rsync on master for sync-ability.
On EC2, Rackspace, does not matter if it‘s fast...
> NO custom mirroring software!
> most FOSS mirror sites use rsync
> use rsync, ask mirrors, problem solved.
> AWS CloudFront is NOT a solution:
> not mirrorable, same vendor SPOFs.
41. SOLUTION
Provide rsync on master for sync-ability.
On EC2, Rackspace, does not matter if it‘s fast...
Provide a DNS based distribution (GeoDNS)
> a reliable base for (private) mirroring
> speed & latency improvements
> NO custom mirroring software needed!
> saves money (AWS and Rackspace fees)
> make use of the new mirrors!
43. SOLUTION
Why not?
The Rubygems CLI could fall back to the rubygems.org
master if a gem version is not on the mirror in use.
It already does if you configure it.
(current downside: it downloads the spec lists from the master every time, looks fixable to me)
> no „instant deploy“ (real-time mirroring)
> no download stats
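The fallback idea can be sketched as trying a list of sources in order. This is not how the rubygems client is implemented: gems.example.org is a hypothetical mirror, fetch_with_fallback is a made-up helper, and the fetcher is a stub so the sketch runs without network access.

```ruby
require "uri"

# Mirror first, master last; the mirror host is a placeholder
SOURCES = ["http://gems.example.org/", "http://rubygems.org/"].freeze

# Returns [source, body] for the first source that has the file, or nil.
# fetcher is anything that responds to call(url) and returns the body or nil.
def fetch_with_fallback(gem_file, sources, fetcher)
  sources.each do |source|
    url  = URI.join(source, "gems/#{gem_file}").to_s
    body = fetcher.call(url)
    return [source, body] if body
  end
  nil
end

# Stub fetcher: only the master has the requested gem
available = { "http://rubygems.org/gems/rails-3.0.1.gem" => "gem bytes" }
fetcher   = ->(url) { available[url] }

source, _body = fetch_with_fallback("rails-3.0.1.gem", SOURCES, fetcher)
puts source
```

Because the mirror is asked first, the master only sees requests for releases the mirror has not synced yet.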
44. THINGS WILL FAIL...
just make sure you have a working plan B
AND:
KISS & YAGNI. Keep it simple.
fewer moving parts > fewer things that will break.
Don‘t over-engineer.
45. HELP
Open source projects need your support.
Gemcutter/Rubygems, too.
Go contribute if you care about your ruby business.
The Gemcutter source is really awesome,
a good read for every developer.