Garbage Collection and The Ruby Heap
Joe Damato and Aman Gupta
@joedamato @tmm1
About Joe Damato
CMU/VMware alum
memprof, ltrace-libdl, performance
improvements to REE
http://timetobleed.com
@joedamato
About Aman Gupta
San Francisco, CA
Ruby Hero 2009
EventMachine, amqp, REE, sinbook,
perftools.rb, gdb.rb
github.com/tmm1
@tmm1
Why Garbage Collection?
We use Ruby because it’s simple and elegant
the GC is designed to make your life easier
how is it easier? no more:
manual memory management
memory leaks
Stack and heap during a manual allocation in C (diagram, shown frame by frame):
func1()                              stack frame: 4 bytes
    void *data;
    func2();
func2()                              stack frame: 4 bytes
    char *string = func3();
    free(string);
char *func3()                        stack frame: 12 bytes
    char buffer[8];
    char *string = malloc(10);       heap: 10 bytes
    return string;
when func3() returns, its 12 byte frame is popped but the 10 bytes stay on the heap
func2() must call free(string) to release them
then func2() returns and only func1()'s frame is left
rb_newobj() (excerpt)
    if (during_gc)
        rb_bug("allocation during GC");
    if (!freelist)
        garbage_collect();
    obj = (VALUE)freelist;
    freelist = freelist->as.free.next;
    MEMZERO((void*)obj, RVALUE, 1);
    return obj;
}
add_freelist frees an existing object
add object to top of freelist
static inline void
add_freelist(p)
    RVALUE *p;
{
    p->as.free.flags = 0;
    p->as.free.next = freelist;
    freelist = p;
}
The Ruby heap
The Ruby heap sort of resembles a slab allocator
Ruby allocates a slab by calling malloc
This space is carved up into fixed size slots for holding
Ruby objects
You can get an unused object from the Ruby heap by
calling rb_newobj
If there are no objects available, GC is run
If there are still no objects available, another slab is
created
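The allocation path in the bullets above can be modeled in a few lines of Ruby. This is only a toy sketch, not MRI's implementation (which lives in C): the names ToyHeap, Slot, SLOTS_PER_SLAB and new_object are hypothetical, the slab size is deliberately tiny, and an explicit roots list stands in for real reachability.

```ruby
# Toy model of: take a slot from the freelist, run GC if it is empty,
# add another slab if GC found nothing. All names are hypothetical.
class ToyHeap
  SLOTS_PER_SLAB = 4 # assumption: tiny slabs so growth is easy to observe

  Slot = Struct.new(:in_use, :marked, :next_free)

  attr_reader :slabs, :gc_runs

  def initialize
    @slabs = []
    @freelist = nil
    @gc_runs = 0
    @roots = [] # stands in for everything the program can still reach
    add_slab
  end

  def new_object(rooted: true)
    collect if @freelist.nil?  # no free slot: run GC first
    add_slab if @freelist.nil? # still nothing: create another slab
    slot = @freelist
    @freelist = slot.next_free
    slot.in_use = true
    slot.next_free = nil
    @roots << slot if rooted
    slot
  end

  # Walk the linked list and count the free slots.
  def free_slots
    count = 0
    slot = @freelist
    while slot
      count += 1
      slot = slot.next_free
    end
    count
  end

  private

  def add_slab
    slab = Array.new(SLOTS_PER_SLAB) { Slot.new(false, false, nil) }
    slab.each { |slot| push_free(slot) }
    @slabs << slab
  end

  # Same idea as add_freelist: the slot becomes the new top of the list.
  def push_free(slot)
    slot.in_use = false
    slot.next_free = @freelist
    @freelist = slot
  end

  def collect
    @gc_runs += 1
    @roots.each { |slot| slot.marked = true } # mark
    @slabs.each do |slab|                     # sweep
      slab.each do |slot|
        push_free(slot) if slot.in_use && !slot.marked
        slot.marked = false
      end
    end
  end
end

heap = ToyHeap.new
4.times { heap.new_object(rooted: false) } # fill the first slab with garbage
4.times { heap.new_object }                # first call runs GC and reuses the garbage slots
heap.new_object                            # GC frees nothing now, so a second slab is added
puts "gc runs: #{heap.gc_runs}, slabs: #{heap.slabs.size}, free slots: #{heap.free_slots}"
```

In this sketch the heap only grows when a collection fails to free anything, which is the ordering the slide describes.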
Heaps on top of heaps
The Freelist
rb_newobj() tries to pull a free
slot off the freelist
GC finds non-reachable
objects and adds them to the
freelist
the freelist is a linked list across slots on the ruby heap
all the slots on the new heap are added to the freelist
but what’s inside these slots...?
the slots are RVALUEs
can be one of many different types of ruby objects (uses a C union)
union is called as, so you can do obj->as.string
union contains free section for unused slots
obj->as.free.next points to the next free slot for the freelist
typedef struct RVALUE {
    union {
        struct {
            unsigned long flags;
            struct RVALUE *next;
        } free;
        struct RBasic basic;
        struct RObject object;
        struct RClass klass;
        struct RFloat flonum;
        struct RString string;
        struct RArray array;
        struct RRegexp regexp;
        struct RHash hash;
        struct RData data;
        struct RStruct rstruct;
        struct RBignum bignum;
        struct RFile file;
        struct RNode node;
        struct RMatch match;
        struct RVarmap varmap;
        struct SCOPE scope;
    } as;
} RVALUE;
RBasic is a basic ruby object
struct RBasic {
    unsigned long flags;
    VALUE klass;
};
all objects have flags
flags == 0 means unused
#define T_NONE   0x00
#define T_NIL    0x01
#define T_OBJECT 0x02
#define T_CLASS  0x03
#define T_ICLASS 0x04
#define T_MODULE 0x05
#define T_FLOAT  0x06
#define T_STRING 0x07
#define T_REGEXP 0x08
#define T_ARRAY  0x09
#define T_FIXNUM 0x0a
#define T_HASH   0x0b
#define T_STRUCT 0x0c
#define T_BIGNUM 0x0d
#define T_FILE   0x0e
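The type tags are internal to the interpreter, but the per-type slot counts can be inspected from Ruby. A minimal sketch, assuming MRI 1.9 or later, where ObjectSpace.count_objects exists (it is not available on the 1.8/REE interpreters these slides describe). The :FREE entry counts unused slots and the :T_* entries count slots by type.

```ruby
# ObjectSpace.count_objects is MRI-specific and only exists on 1.9+.
counts = ObjectSpace.count_objects

puts "total slots: #{counts[:TOTAL]}"
puts "free slots:  #{counts[:FREE]}"
puts "strings:     #{counts[:T_STRING]}"
puts "arrays:      #{counts[:T_ARRAY]}"
```

The exact numbers depend on the interpreter version and on what has been loaded, so treat them as a snapshot rather than a stable measurement.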
Finding Garbage
MRI uses a conservative, stop the world, mark and sweep garbage collector.
conservative
MRI has a conservative GC
Raw pointers are handed to C
extensions
When scanning the Ruby
process stack it must assume
that anything that looks like a
pointer to a Ruby object is a
pointer to a Ruby object
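The "looks like a pointer" rule can be illustrated with a toy model. This is a sketch under stated assumptions, not MRI's code: the heap bounds, the slot size and the helper names (looks_like_object_pointer?, conservative_roots) are all hypothetical, and the stack is just an array of integers.

```ruby
# Toy model of conservative stack scanning. All names and numbers are hypothetical.
HEAP_START = 0x1000 # assumed address of the first slot
SLOT_SIZE  = 40     # assumed slot size in bytes
SLOT_COUNT = 8
HEAP_END   = HEAP_START + SLOT_SIZE * SLOT_COUNT

# A word "looks like" an object pointer if it falls inside the heap
# and lands exactly on a slot boundary.
def looks_like_object_pointer?(word)
  word >= HEAP_START && word < HEAP_END && (word - HEAP_START) % SLOT_SIZE == 0
end

# Every stack word that passes the test is treated as a root.
def conservative_roots(stack_words)
  stack_words.select { |word| looks_like_object_pointer?(word) }.uniq
end

stack = [
  0x1000, # a real pointer to slot 0
  42,     # a small integer
  0x1050, # a real pointer to slot 2
  0x1028, # an integer that happens to equal the address of slot 1
  0x1001, # inside the heap, but not on a slot boundary
  0x9000  # outside the heap
]

roots = conservative_roots(stack)
puts roots.map { |r| format("0x%x", r) }.join(", ")
```

The collector in this model cannot tell the integer 0x1028 from a real reference, so slot 1 is kept alive. That is the cost of being conservative: some garbage may survive, but no live object is ever freed by mistake.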
stop the world
We use:
RUBY_GC_MALLOC_LIMIT=60000000
RUBY_HEAP_MIN_SLOTS=500000
RUBY_HEAP_SLOTS_GROWTH_FACTOR=1
RUBY_HEAP_SLOTS_INCREMENT=1
malloc_limit = 60MB
force garbage collection after every N bytes worth of calls to malloc or realloc
defaults to 8MB
high traffic ruby servers can easily allocate and free more than 8MB in a single request
void *
ruby_xmalloc(size)
    long size;
{
    void *mem;
    if (malloced > malloc_limit)
        garbage_collect();
    mem = malloc(size);
    malloced += size;
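To see why raising malloc_limit matters, the byte counter can be modeled in Ruby. This is a hedged sketch: ToyAllocator and xmalloc are hypothetical names, the per-request allocation pattern is made up, and the model assumes a collection resets the counter to zero.

```ruby
# Toy model of the malloc_limit trigger. Class and method names are hypothetical;
# only the "check the counter, then add the size" logic follows the C excerpt.
class ToyAllocator
  attr_reader :gc_runs

  def initialize(malloc_limit)
    @malloc_limit = malloc_limit
    @malloced = 0
    @gc_runs = 0
  end

  def xmalloc(size)
    garbage_collect if @malloced > @malloc_limit
    @malloced += size
  end

  private

  # Assumption: a collection resets the byte counter.
  def garbage_collect
    @gc_runs += 1
    @malloced = 0
  end
end

# Made-up workload: each request performs 100 allocations of 200,000 bytes (20MB).
request = ->(allocator) { 100.times { allocator.xmalloc(200_000) } }

default_limit = ToyAllocator.new(8_000_000)  # the 8MB default
tuned_limit   = ToyAllocator.new(60_000_000) # RUBY_GC_MALLOC_LIMIT=60000000

3.times do
  request.call(default_limit)
  request.call(tuned_limit)
end

puts "8MB limit:  #{default_limit.gc_runs} collections"
puts "60MB limit: #{tuned_limit.gc_runs} collections"
```

With this workload the 8MB limit forces several collections across the three requests, while the 60MB limit forces none. Real applications allocate unevenly, so the numbers are only illustrative.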
useful, but you can’t tell where the objects came from...
bleak_house
191691 total objects
Final heap size 191691 filled, 220961 free
Displaying top 20 most common line/class pairs
89513 __null__:__null__:__node__
41438 __null__:__null__:String
2348 site_ruby/1.8/rubygems/specification.rb:557:Array
1508 gems/1.8/specifications/gettext-1.9.gemspec:14:String
http://github.com/fauna/bleak_house
installs a custom patched ruby
enables GC_DEBUG to track file/line in rb_newobj
increases size of RVALUE slots by 16 bytes
better than gdb.rb: you can see where the leaking
object was allocated
but you can’t run it in production without overhead
memprof
git://github.com/ice799/memprof.git
http://timetobleed.com/string-together-global-offset-tables-to-build-a-ruby-memory-profiler/
http://timetobleed.com/memprof-a-ruby-level-memory-profiler/
http://timetobleed.com/what-is-a-ruby-object-introducing-memprof-dump/
http://timetobleed.com/hot-patching-inlined-functions-with-x86_64-asm-metaprogramming/
http://timetobleed.com/rewrite-your-ruby-vm-at-runtime-to-hot-patch-useful-features/
plugging a leak in rails3
in dev mode, rails3 is leaking 10mb per request
connect to mongo
$ mongo localhost/memprof
nope!
plugging a leak in rails3
find one of the leaked controllers
> db.rails.findOne({type:"class",name:"AccountsController"})._id
0x3b56780
{"_id":"0x4a8e6d0","file":"actionpack-3.0.0.beta/lib/abstract_controller/localized_cache.rb","line":3,"type":"hash","length":21}
{"_id":"0x4c78540","file":"actionpack-3.0.0.beta/lib/action_controller/metal.rb","line":74,"type":"hash","length":21}
{"_id":"0x29be3b0","type":"hash","length":21}
plugging a leak in rails3
first two are leaks!
module ActionController
  class Metal < AbstractController::Base
    class ActionEndpoint
      @@endpoints = Hash.new {|h,k| h[k] = Hash.new {|sh,sk| sh[sk] = {} } }
module AbstractController
  class HashKey
    @hash_keys = Hash.new {|h,k| h[k] = Hash.new {|sh,sk| sh[sk] = {} } }
module ActionView
  module Partials
    class PartialRenderer
      PARTIAL_NAMES = Hash.new {|h,k| h[k] = {} }
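A long-lived hash with a default block, like the ones above, can pin objects in memory. The sketch below is an illustration under an assumption, not a reproduction of the Rails code: it assumes the keys are controller classes and that development mode creates a fresh class object on every request. CACHE and handle_request are hypothetical names.

```ruby
# Sketch of a class-keyed cache that grows on every "reload". All names hypothetical.
CACHE = Hash.new { |h, k| h[k] = {} }

# Assumption: each development-mode request reloads the controller,
# producing a brand-new Class object.
def handle_request
  controller = Class.new                 # stands in for a freshly reloaded controller class
  CACHE[controller][:index] = "rendered" # the default block inserts a new entry for this key
  controller
end

5.times { handle_request }
puts CACHE.size
```

Each new class is a new key, so the hash gains one entry per request, and every old class stays reachable through the hash along with everything it references. The GC is doing its job here: the objects are not garbage because the cache still points at them.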
memprof
still a long and manual process, but memprof provides
all the data to make debugging memory issues
possible
coming soon: memprof.com
a web-based heap visualizer and leak analyzer
as a user, you simply:
gem install memprof
memprof MY_RAILS_APP_PID
visit http://memprof.com/c4e4d3eb0e18
see line numbers where your app is leaking
Questions?