4. debugging ruby?
• i use ruby
• my ruby processes use a lot of ram
• i want to fix this
6. let's build a debugger
• step 1: collect data
  • list of all ruby objects in memory
• step 2: analyze data
  • group by type
  • group by file/line
7. version 1: collect data
• simple patch to ruby VM (300 lines of C)
  • http://gist.github.com/73674
• simple text-based output format
0x154750 @ -e:1 is OBJECT of type: T
0x15476c @ -e:1 is HASH which has data
0x154788 @ -e:1 is ARRAY of len: 0
0x1547c0 @ -e:1 is STRING (SHARED) len: 2 and val: hi
0x1547dc @ -e:1 is STRING len: 1 and val: T
0x154814 @ -e:1 is CLASS named: T inherits from Object
0x154a98 @ -e:1 is STRING len: 2 and val: hi
0x154b40 @ -e:1 is OBJECT of type: Range
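Step 2 (analyze data) on this text format could be sketched in Ruby roughly as follows. The regular expression is inferred from the sample lines above, and the `parse_v1_line` and `count_by` helpers are illustrative names, not part of the original patch.

```ruby
# Sketch: parse the version 1 text output and group it.
# The line layout is an assumption inferred from the sample output.
V1_LINE = /\A(0x\h+) @ (.+?):(\d+) is ([A-Z]+)/

# Returns a hash for one line of output, or nil if the line does not match.
def parse_v1_line(line)
  m = V1_LINE.match(line)
  return nil unless m
  { address: m[1], file: m[2], line: m[3].to_i, type: m[4] }
end

# Counts objects per key; the block picks the key for each object.
def count_by(objects)
  objects.each_with_object(Hash.new(0)) { |obj, counts| counts[yield(obj)] += 1 }
end

SAMPLE_V1 = <<~OUT
  0x154750 @ -e:1 is OBJECT of type: T
  0x15476c @ -e:1 is HASH which has data
  0x1547c0 @ -e:1 is STRING (SHARED) len: 2 and val: hi
  0x1547dc @ -e:1 is STRING len: 1 and val: T
OUT

objects = SAMPLE_V1.lines.map { |l| parse_v1_line(l) }.compact
p count_by(objects) { |o| o[:type] }              # group by type
p count_by(objects) { |o| [o[:file], o[:line]] }  # group by file/line
```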
16. version 1
• it works!
• but...
  • must patch and rebuild ruby binary
  • no information about references between objects
  • limited analysis via shell scripting
21. version 2 goals
• better data format
  • simple: one line of text per object
  • expressive: include all details about object contents and references
  • easy to use: easy to generate from C code & easy to consume from various scripting languages
25. version 2 is memprof
• no patches to ruby necessary
  • gem install memprof
  • require "memprof"
  • Memprof.dump_all("/tmp/app.json")
• C extension for MRI ruby VM
  • http://github.com/ice799/memprof
  • uses libyajl to dump out all ruby objects as json
30. strings
Memprof.dump{ "hello" + "world" }
{
  "_id": "0x19c610",        memory address of object
  "file": "-e",             file and line where string
  "line": 1,                was created
  "type": "string",
  "class": "0x1ba7f0",      address of the class object
  "class_name": "String",   "String"
  "length": 10,             length and contents
  "data": "helloworld"      of this string instance
}
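Because each object is plain JSON, a dump is easy to consume from a script. A minimal Ruby sketch, assuming the dump holds one JSON document per line; the `load_objects` helper is hypothetical and uses only Ruby's standard `json` library, not memprof itself.

```ruby
require "json"
require "stringio"

# Returns an array of hashes, one per non-empty line of a dump.
# Assumes one JSON document per line; `io` is any IO-like object.
def load_objects(io)
  io.each_line.map(&:strip).reject(&:empty?).map { |line| JSON.parse(line) }
end

# The record from the slide, as it might appear in a dump file.
SAMPLE_DUMP = <<~JSONL
  {"_id":"0x19c610","file":"-e","line":1,"type":"string","class":"0x1ba7f0","class_name":"String","length":10,"data":"helloworld"}
JSONL

load_objects(StringIO.new(SAMPLE_DUMP)).each do |obj|
  puts "#{obj["_id"]} #{obj["class_name"]} (length #{obj["length"]}) created at #{obj["file"]}:#{obj["line"]}"
end
```

With a real dump, the same helper could be fed a file handle, for example `File.open("/tmp/app.json") { |f| load_objects(f) }`.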
48. built on...
$ mongoimport -d memprof -c rails --file /tmp/app.json
$ mongo memprof
let's run some queries.
49. how many objects?
> db.rails.count()
809816
• ruby scripts create a lot of objects
• usually not a problem, but...
  • MRI has a naïve stop-the-world mark/sweep GC
  • fewer objects = faster GC = better performance
50. what types of objects?
> db.rails.distinct("type")
["array",
 "bignum",
 "class",
 "float",
 "hash",
 "module",
 "node",
 "object",
 "regexp",
 "string",
 ...]
55. mongodb: distinct
• distinct("type")
  list of types of objects
• distinct("file")
  list of source files
• distinct("class_name")
  list of instance class names
• optionally filter first
  • distinct("name", {type:"class"})
    names of all defined classes
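For readers without a mongo shell at hand, the behaviour of distinct with an optional filter can be approximated in plain Ruby. This is only a sketch over an in-memory array of hashes; the `distinct` helper and the sample records are made up for illustration.

```ruby
# Sketch of distinct(field, filter) over an in-memory array of hashes.
# `filter` is a hash of field => required value, like a simple equality query.
def distinct(objects, field, filter = {})
  objects
    .select { |obj| filter.all? { |key, value| obj[key] == value } }
    .map { |obj| obj[field] }
    .compact
    .uniq
end

# Hypothetical records shaped like memprof output.
SAMPLE_OBJECTS = [
  { "type" => "string", "file" => "-e",     "class_name" => "String" },
  { "type" => "array",  "file" => "-e",     "class_name" => "Array" },
  { "type" => "class",  "file" => "app.rb", "name" => "T" },
  { "type" => "string", "file" => "app.rb", "class_name" => "String" }
].freeze

p distinct(SAMPLE_OBJECTS, "type")                         # types of objects
p distinct(SAMPLE_OBJECTS, "file")                         # source files
p distinct(SAMPLE_OBJECTS, "name", { "type" => "class" })  # names of defined classes
```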
59. mongodb: ensureIndex
• add an index on a field (if it doesn't exist yet)
• improve performance of queries against common fields: type, class_name, super, file
• can index embedded field names
  • ensureIndex("methods.add")
  • find({"methods.add":{$exists:true}})
    find classes that define the method add
62. how many objs per type?
> db.rails.group({
    initial: {count:0},
    key: {type:true},            // group on type
    cond: {},
    reduce: function(obj, out) {
      out.count++                // increment count for each obj
    }
  }).sort(function(a,b){
    return a.count - b.count     // sort results
  })
65. how many objs per type?
[
  ...,
  {type: "array", count: 7621},
  {type: "string", count: 69139},
  {type: "node", count: 365285}
]
lots of nodes
• nodes represent ruby code
• stored like any other ruby object
• makes ruby completely dynamic
70. mongodb: group
• cond: query to filter objects before grouping
• key: field(s) to group on
• initial: initial values for each group's results
• reduce: aggregation function
75. mongodb: group
• by type or class
  • key: {type:1}
  • key: {class_name:1}
• by file & line
  • key: {file:1, line:1}
• by type in a specific file
  • cond: {file: "app.rb"},
    key: {file:1, line:1}
• by length of strings in a specific file
  • cond: {file:"app.rb",type:"string"},
    key: {length:1}
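How cond, key, initial and reduce fit together may be easier to see outside the database. The following Ruby sketch mimics the idea on an in-memory array of hashes; the `group` helper, its equality-only `cond`, and the sample records are assumptions for illustration, not MongoDB's implementation.

```ruby
# Sketch of group({cond, key, initial, reduce}) over an array of hashes.
#   cond:    hash of field => value an object must match to be included
#   key:     list of field names to group on
#   initial: starting result for each group (merged into a fresh hash per group)
#   block:   plays the role of reduce(obj, out) for every object in a group
def group(objects, key:, cond: {}, initial: {})
  groups = {}
  objects.each do |obj|
    next unless cond.all? { |field, value| obj[field] == value }
    group_key = key.to_h { |field| [field, obj[field]] }
    out = (groups[group_key] ||= group_key.merge(initial))
    yield obj, out
  end
  groups.values
end

# Hypothetical records shaped like memprof output.
GROUP_SAMPLE = [
  { "type" => "string", "file" => "app.rb", "line" => 3, "length" => 2 },
  { "type" => "string", "file" => "app.rb", "line" => 3, "length" => 5 },
  { "type" => "array",  "file" => "app.rb", "line" => 7 },
  { "type" => "string", "file" => "lib.rb", "line" => 1, "length" => 2 }
].freeze

# by file & line, only for objects created in app.rb
per_line = group(GROUP_SAMPLE, key: %w[file line], cond: { "file" => "app.rb" },
                               initial: { "count" => 0 }) { |_obj, out| out["count"] += 1 }
p per_line.sort_by { |row| row["count"] }
```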
90. when were objs created?
• useful to look at objects over time
• each obj has a timestamp of when it was created
• find minimum time, call it start_time
• create buckets for every minute of execution since start
• place objects into buckets
91. when were objs created?
> db.rails.mapReduce(function(){
    var secs = this.time - start_time;
    var mins_since_start = Math.floor(secs / 60);
    emit(mins_since_start, 1);
  }, function(key, vals){
    for(var i=0,sum=0; i<vals.length; sum += vals[i++]);
    return sum;
  }, {
    // start_time = min(time)
    scope: { start_time: db.rails.find().sort({time:1}).limit(1)[0].time }
  })
{result:"tmp.mr_1272615772_3"}
93. mongodb: mapReduce
• arguments
  • map: function that emits one or more key/value pairs for each object (bound to this)
  • reduce: function to return aggregate result, given key and list of values
  • scope: global variables to set for funcs
• results
  • stored in a temporary collection (tmp.mr_1272615772_3)
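The per-minute bucketing can also be sketched without a database. This Ruby version follows the same map, reduce and scope roles on an in-memory array; it assumes each object has a numeric `time` field measured in seconds, and the `objects_per_minute` helper and sample records are hypothetical.

```ruby
# Sketch of the per-minute bucketing over an in-memory array of hashes.
# Assumes a non-empty list and a "time" field in seconds.
def objects_per_minute(objects)
  # scope: start_time = min(time)
  start_time = objects.map { |obj| obj["time"] }.min

  # map: emit(whole minutes since start, 1) for every object
  emitted = objects.map { |obj| [((obj["time"] - start_time) / 60).floor, 1] }

  # reduce: sum the emitted values for each key
  emitted
    .group_by { |minute, _one| minute }
    .transform_values { |pairs| pairs.sum { |_minute, one| one } }
end

# Hypothetical records; only "time" matters here.
TIMED_SAMPLE = [
  { "_id" => "0x1", "time" => 1000 },
  { "_id" => "0x2", "time" => 1030 },
  { "_id" => "0x3", "time" => 1065 },
  { "_id" => "0x4", "time" => 1490 }
].freeze

buckets = objects_per_minute(TIMED_SAMPLE)
p buckets                                    # minute => number of objects created
p buckets.max_by { |_minute, count| count }  # busiest minute and its count
```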
95. when were objs created?
> db.tmp.mr_1272615772_3.count()
12
script was running for 12 minutes
> db.tmp.mr_1272615772_3.find().sort({value:-1}).limit(1)
{_id: 8, value: 41231}
41k objects created 8 minutes after start
96. references to this object?
ary = ["a","b","c"]
ary references "a"
"b" referenced by ary
• ruby makes it easy to "leak" references
• an object will stay around until all references to it are gone
• more objects = longer GC = bad performance
• must find references to fix leaks
99. references to this object?
• db.rails_refs.insert({
    _id:"0xary", refs:["0xa","0xb","0xc"]
  })
  create references lookup table
• db.rails_refs.ensureIndex({refs:1})
  add "multikey" index to refs array
• db.rails_refs.find({refs:"0xa"})
  efficiently look up all objs holding a ref to 0xa
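The same "who holds a reference to this object?" lookup can be sketched in Ruby with an inverted hash, which plays a role loosely similar to the index on refs. The `build_referrers` helper and the `0xhash` row are hypothetical additions for illustration.

```ruby
# Sketch: build a reverse lookup from an object's address to the objects
# that hold a reference to it.
# Rows mirror the rails_refs documents: { "_id" => address, "refs" => [addresses] }.
def build_referrers(refs_table)
  referrers = Hash.new { |hash, address| hash[address] = [] }
  refs_table.each do |row|
    row["refs"].each { |target| referrers[target] << row["_id"] }
  end
  referrers
end

REFS_SAMPLE = [
  { "_id" => "0xary",  "refs" => %w[0xa 0xb 0xc] },
  { "_id" => "0xhash", "refs" => %w[0xa] }   # hypothetical second holder
].freeze

referrers = build_referrers(REFS_SAMPLE)
p referrers["0xa"]   # every object holding a reference to 0xa
```

Building the inverted hash once makes each later lookup a single hash access instead of a scan over every object's refs list.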
100. mongodb: multikeys
• indexes on array values create a "multikey" index
• classic example: nested array of tags
  • find({tags: "ruby"})
    find objs where obj.tags includes "ruby"
112. plugging a leak in rails3
• in dev mode, rails3 is leaking 10mb per request
let's use memprof to find it!
# in environment.rb
require `gem which memprof/signal`.strip
115. plugging a leak in rails3
send the app some requests so it leaks
$ ab -c 1 -n 30 http://localhost:3000/
tell memprof to dump out the entire heap to json
$ memprof --pid <pid> --name <dump name> --key <api key>
126. find references to object
"leak" is on line 178, holding references to all controllers
130.
• In development mode, Rails reloads all your application code on every request
• ActionView::Partials::PartialRenderer is caching partials used by each controller as an optimization
• But... it ends up holding a reference to every single reloaded version of those controllers
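The shape of this kind of leak can be reproduced in a few lines of plain Ruby. This is an illustration only, not the Rails code: `PartialCache`, `reload_controller` and the cached file name are hypothetical stand-ins for a long-lived cache keyed by controller class and for dev-mode reloading.

```ruby
# Illustration only: a long-lived cache keyed by class keeps every
# "reloaded" version of that class reachable. All names are hypothetical.
class PartialCache
  def initialize
    @partials = {}   # controller class => cached partials
  end

  def fetch(controller_class)
    @partials[controller_class] ||= ["_form.html.erb"]
  end

  def size
    @partials.size
  end
end

# Stand-in for dev-mode reloading: each call builds a brand new class
# object, even though the source code is identical.
def reload_controller
  Class.new { def index; end }
end

CACHE = PartialCache.new
3.times { CACHE.fetch(reload_controller) }
puts CACHE.size   # grows by one per reload; the old classes stay referenced
```

Because the cache itself is never discarded, every class it uses as a key stays reachable, so the garbage collector cannot free earlier versions.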