9. Unoptimized Program
require "csv"
data = CSV.open("data.csv")
output = data.readlines.map do |line|
  line.map do |col|
    # capitalize the first letter of every word
    col.downcase.gsub(/\b('?[a-z])/) { $1.capitalize }
  end
end
File.open("output.csv", "w+") do |f|
  # join columns with commas and rows with newlines
  f.write output.map { |row| row.join(",") }.join("\n")
end
11. Memory Optimized Program
require "csv"
output = File.open("output.csv", "w+")
CSV.open("examples/data.csv", "r").each do |line|
  row = line.map do |col|
    col.downcase!
    col.gsub!(/\b('?[a-z])/) { $1.capitalize }
    col # bang methods return nil when nothing changes, so return col itself
  end
  output.puts row.join(",")
end
output.close
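A rough way to see the difference between the two programs is to count allocated objects. The sketch below is not from the talk: the sample rows and the `allocations` helper are hypothetical, and it assumes Ruby 2.2 or later, where the GC.stat key is named :total_allocated_objects.

```ruby
# Hypothetical helper: number of objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Hypothetical sample data standing in for the rows of data.csv.
rows = Array.new(1_000) { ["john o'neil".dup, "new york".dup] }

# Copying style: downcase and gsub each build a new String, map builds new Arrays.
copying = allocations do
  rows.map do |line|
    line.map { |col| col.downcase.gsub(/\b('?[a-z])/) { $1.capitalize } }
  end
end

# In-place style: the existing Strings are modified, no new row Arrays are built.
in_place = allocations do
  rows.each do |line|
    line.each do |col|
      col.downcase!
      col.gsub!(/\b('?[a-z])/) { $1.capitalize }
    end
  end
end

puts "copying:  #{copying} objects allocated"
puts "in place: #{in_place} objects allocated"
```

The exact numbers depend on the Ruby version; the point is only that the in-place variant gives the garbage collector less to clean up.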
12. Ruby 2.1 Is NOT Faster
...once your program is memory optimized
[Bar chart comparing Ruby 1.9 & 2.0 with Ruby 2.1; axis from 0 to 14, units not preserved]
13. Takeaways
1. Ruby 2.1 is not a silver performance bullet
2. A memory-optimized Ruby app performs the same in 1.9, 2.0, and 2.1
3. Ruby 2.1 merely makes performance adequate by default
4. Optimize memory to make a difference
15. 5 Memory Optimization Strategies
1. Tune garbage collector
2. Do not allow Ruby instance to grow
3. Control GC manually
4. Write less Ruby
5. Avoid memory-intensive Ruby and Rails features
17. Ruby GC Tuning Goal
Goal: balance the number of GC runs and peak memory usage
How to check:
> GC.stat[:minor_gc_count]
> GC.stat[:major_gc_count]
> `ps -o rss= -p #{Process.pid}`.chomp.to_i / 1024 #MB
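The three checks above can be wrapped around any piece of code. This is a minimal sketch, assuming Ruby 2.1 or later and a system with a `ps` command; the `rss_mb` and `gc_report` helpers and the sample workload are hypothetical names made up for illustration.

```ruby
# Resident set size of this process in MB, or nil when ps is unavailable.
def rss_mb
  `ps -o rss= -p #{Process.pid}`.chomp.to_i / 1024
rescue Errno::ENOENT
  nil
end

# Runs the block and reports how many minor and major GCs it triggered.
def gc_report
  before = GC.stat
  yield
  after = GC.stat
  {
    minor_gc_runs: after[:minor_gc_count] - before[:minor_gc_count],
    major_gc_runs: after[:major_gc_count] - before[:major_gc_count],
    rss_mb: rss_mb
  }
end

# Hypothetical workload: allocate a batch of short strings.
report = gc_report { 200_000.times.map { |i| "row #{i}" } }
p report
```

Run it before and after changing a GC setting to see which way the balance between GC runs and memory moved.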
18. When Is Ruby GC Triggered?
Minor GC (faster, only new objects collected):
- not enough space on the Ruby heap to allocate new objects
- every 16MB-32MB of memory allocated in new objects
Major GC (slower, all objects collected):
- number of old or shady objects increases more than 2x
- every 16MB-128MB of memory allocated in old objects
19. Environment Variables
Setting                                 Variable                                 Default
Initial number of slots on the heap     RUBY_GC_HEAP_INIT_SLOTS                  1000
Min number of slots that GC must free   RUBY_GC_HEAP_FREE_SLOTS                  4096
Heap growth factor                      RUBY_GC_HEAP_GROWTH_FACTOR               1.8
Maximum heap slots to add               RUBY_GC_HEAP_GROWTH_MAX_SLOTS            -
New generation malloc limit             RUBY_GC_MALLOC_LIMIT                     16M
Maximum new generation malloc limit     RUBY_GC_MALLOC_LIMIT_MAX                 32M
New generation malloc growth factor     RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR       1.4
Old generation malloc limit             RUBY_GC_OLDMALLOC_LIMIT                  16M
Maximum old generation malloc limit     RUBY_GC_OLDMALLOC_LIMIT_MAX              128M
Old generation malloc growth factor     RUBY_GC_OLDMALLOC_LIMIT_GROWTH_FACTOR    1.2
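These variables are read when the interpreter starts, so one way to try a setting is to launch a child Ruby with it and compare GC counts. The sketch below is an illustration, not a recommendation: the workload script, the `gc_runs_with` helper, and the chosen limits (64 MB and 128 MB, given in bytes) are all assumptions.

```ruby
require "open3"
require "rbconfig"

# Hypothetical workload run in each child: allocate about 50 MB of
# short-lived strings, then print the total number of GC runs.
SCRIPT = '50.times { "x" * 1_000_000 }; print GC.count'

# Starts a fresh Ruby with the given environment and returns its GC count.
def gc_runs_with(env)
  out, status = Open3.capture2(env, RbConfig.ruby, "-e", SCRIPT)
  raise "child Ruby failed" unless status.success?
  out.to_i
end

default_runs = gc_runs_with({})
tuned_runs = gc_runs_with(
  "RUBY_GC_MALLOC_LIMIT" => "64000000",
  "RUBY_GC_MALLOC_LIMIT_MAX" => "128000000"
)

puts "GC runs with defaults:             #{default_runs}"
puts "GC runs with raised malloc limits: #{tuned_runs}"
```

Fewer GC runs usually come at the price of a higher memory peak, which is the balance the tuning goal describes.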
20. When Is Ruby GC Triggered?
ruby-performance-book.com
http://samsaffron.com/archive/2013/11/22/demystifying-the-ruby-gc
http://thorstenball.com/blog/2014/03/12/watching-understanding-ruby-2.1-garbage-collector/
28. GC Between Requests in Unicorn
OobGC for Ruby < 2.1
require 'unicorn/oob_gc'
use(Unicorn::OobGC, 1)
gctools for Ruby >= 2.1 https://github.com/tmm1/gctools
require 'gctools/oobgc'
use(GC::OOB::UnicornMiddleware)
29. GC Between Requests in Unicorn
Things to keep in mind:
- make sure you have enough workers
- make sure CPU utilization < 50%
- this improves only “perceived” performance
- overall performance might be worse
- only effective for memory-intensive applications
35. Operations That Copy Data
● String#gsub! instead of String#gsub, and similar bang methods
● String#<< instead of += on strings
● File#readline or File#each instead of File#readlines or File.read
● CSV.parse_line instead of CSV.parse
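Whether an operation copies can be checked by comparing object ids before and after it. A small sketch with made-up sample strings, using StringIO as a stand-in for a file:

```ruby
require "stringio"

title = "ruby performance".dup
original_id = title.object_id

title << " optimization"                       # appends to the existing String
appended_in_place = title.object_id == original_id

title.gsub!("ruby", "Ruby")                    # rewrites the existing String
replaced_in_place = title.object_id == original_id

title += "!"                                   # builds a new String and rebinds the variable
copied = title.object_id != original_id

# Reading line by line keeps one line in memory at a time,
# where readlines would build an Array of every line first.
io = StringIO.new("a,b\nc,d\n")
line_count = 0
io.each_line { |line| line_count += 1 }

puts "<< in place: #{appended_in_place}, gsub! in place: #{replaced_in_place}, += copied: #{copied}"
```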
36. ActiveRecord Also Copies Data
● ActiveRecord::Base::update_all
Book.where('title LIKE ?', '%Rails%').
order(:created_at).limit(5).
update_all(author: 'David')
● Direct manipulation over query result
result = ActiveRecord::Base.connection.execute 'select * from books'
result.each do |row|
# do something with row.values_at('col1', 'col2')
end
37. Rails Serializers Copy Too Much
# Before: the built-in serializer
class Smth < ActiveRecord::Base
  serialize :data, JSON
end
# After: parse and dump only on demand
class Smth < ActiveRecord::Base
  def data
    JSON.parse(read_attribute(:data))
  end
  def data=(value)
    write_attribute(:data, value.to_json)
  end
end
42. RubyProf Memory Profiling
require 'ruby-prof'
RubyProf.measure_mode = RubyProf::MEMORY
RubyProf.start
str = 'x'*1024*1024*10
result = RubyProf.stop
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT)
This requires a patched Ruby and works only with 1.8 and 1.9
https://github.com/ruby-prof/ruby-prof/issues/86
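Where a patched Ruby is not an option, the standard objspace library gives a rough size for a single object. This is a different and much cruder technique than ruby-prof's memory mode, shown only as a sketch; it reuses the same 10 MB string as above.

```ruby
require "objspace"

str = "x" * 1024 * 1024 * 10

# memsize_of reports the bytes held by this one object, not by the whole program.
bytes = ObjectSpace.memsize_of(str)
puts "str holds roughly #{bytes / 1024 / 1024} MB"
```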
OK, let's talk about performance.
Can I have a show of hands? Who here thinks Ruby is fast?
C'mon, only a few people? I disagree: Ruby is fast, especially the latest version, except for one thing: memory consumption and garbage collection make it slow.
Oh, most people here think it's fast. I do agree: Ruby is fast until your program takes so much memory that it becomes slow.
Why am I talking so much about memory? Here's why.
Why? Two reasons:
A large memory overhead, where every object takes at least 40 bytes in memory
Plus
A slow GC algorithm that was improved in 2.1, though not that much, as we will see later
All of that adds up not to universal love and peace,
but to high memory consumption and, because of it, the enormous time the app spends doing GC.
That is why memory optimization is so important. It saves you that GC time.
That's also why Ruby 2.1 is so important. It makes GC so much faster.
Some examples from my own experience.
Here's another example. No memory optimization done, but Ruby upgraded from 1.9 to 2.1.
But here's another thing. If you can upgrade, fine. If not, you can still get the same or better performance by optimizing memory.
How does tuning help?
You can balance...
By default this balance is to do more GC and reduce memory peaks. You can shift this balance.
Change GC settings and see how often GC is called and what your memory usage is
Let's step back for a minute and look at when GC is triggered.
There has been a sentiment inside the Rails community that SQL is somehow bad, that you should avoid it at all costs. People invent more and more things to stay out of SQL. AREL, to name just one.
Guys, I wholeheartedly disagree with this. Web frameworks come and go. SQL stays. We have had SQL for 40 years. It's not going away.
So, our time is up. If you'd like to learn more about Ruby performance optimization, please sign up for my book mailing list updates. If you need help, just email me or airpair with me. And thank you for listening.