Classes and Objects
Classes and objects are obviously central to Ruby, but at first sight
they can seem a little confusing. There seem to be a lot of concepts:
classes, objects, class objects, instance methods, class methods, and
singleton classes. In reality, however, Ruby has just a single
underlying class and object structure, which we'll discuss in this
chapter. In fact, the basic model is so simple, we can describe it in
a single paragraph.
A Ruby object has three components: a set of flags, some instance
variables, and an associated class. A Ruby class is an object of class
Class, which contains all the object things plus a list of methods and
a reference to a superclass (which is itself another class). All
method calls in Ruby nominate a receiver (which is by default self,
the current object). Ruby finds the method to invoke by looking at the
list of methods in the receiver's class. If it doesn't find the method
there, it looks in the superclass, and then in the superclass's
superclass, and so on. If the method cannot be found in the receiver's
class or any of its ancestors, Ruby invokes the method method_missing
on the original receiver.
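
The lookup rule above can be watched in action. The following is a minimal sketch, not part of the original text: the classes Animal and Dog and their methods are hypothetical names chosen only for illustration.

```ruby
# Hypothetical classes, used only to illustrate method lookup.
class Animal
  def speak
    "some noise"
  end

  # Invoked on the original receiver when lookup fails in the
  # receiver's class and in every ancestor.
  def method_missing(name, *args)
    "no method called #{name}"
  end
end

class Dog < Animal
end

d = Dog.new
Dog.ancestors.first(2)   # => [Dog, Animal]
d.speak                  # not in Dog, found in Animal => "some noise"
d.fly                    # found nowhere => "no method called fly"
```

A catch-all method_missing like this one hides genuine typos, so real code would normally handle only the names it expects and call super for the rest.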
And that's it: the entire explanation. On to the next chapter.
"But wait," you cry, "I spent good money on this chapter. What
about all this other stuff: singleton classes, class methods, and so
on? How do they work?"
Good question.
All class/object interactions are explained using the simple model
given above: objects reference classes, and
classes reference zero or more superclasses. However, the
implementation details can get a tad tricky.
We've found that the simplest way of visualizing all this is to draw
the actual objects that Ruby implements. So, in the following pages
we'll look at all the possible combinations of classes and
objects. Note that these are not class diagrams in the UML sense;
we're showing structures in memory and pointers between them.
Let's start by looking at an object created from a simple
class. Figure 19.1 on page 239 shows an object referenced by a
variable, lucille, the object's class, Guitar, and that class's
superclass, Object. Notice how the object's class reference (called
klass for historical reasons that really bug Andy) points to the
class object, and how the super pointer from that class references
the parent class.
When Ruby executes Guitar.strings(), it follows the same process
as before: it goes to the receiver, class Guitar, follows the klass
reference to class Guitar', and finds the method.
Finally, note that an "S" has crept into the flags in class
Guitar'. The classes that Ruby creates automatically are marked
internally as singleton classes. Singleton classes are treated
slightly differently within Ruby. The most obvious difference from
the outside is that they are effectively invisible: they will never
appear in a list of objects returned from methods such as
Module#ancestors or ObjectSpace::each_object.
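
The hidden class that holds a class method can be inspected from Ruby code. This sketch rests on two assumptions that go beyond the text: that Guitar.strings is defined as below (the body, returning 6, is illustrative), and that the interpreter is recent enough to provide Object#singleton_class and Module#singleton_class?.

```ruby
class Guitar
  # A class method; the return value is illustrative only.
  def Guitar.strings
    6
  end
end

meta = Guitar.singleton_class    # the class written as Guitar' above

Guitar.strings                   # => 6
Guitar.singleton_methods(false)  # => [:strings]
meta.method_defined?(:strings)   # => true: the method lives in Guitar'
meta.singleton_class?            # => true
Guitar.ancestors.include?(meta)  # => false: Guitar' stays out of the list
```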
Ruby allows you to create a class tied to a particular object. In the
following example, we create two String objects. We then associate
an anonymous class with one of them, overriding one of the methods in
the object's base class and adding a new method.
a = "hello"
b = a.dup

class <<a
  def to_s
    "The value is '#{self}'"
  end
  def twoTimes
    self + self
  end
end

a.to_s       »  "The value is 'hello'"
a.twoTimes   »  "hellohello"
b.to_s       »  "hello"
This example uses the "class << obj" notation, which basically says
"build me a new class just for object obj." We could also have
written it as:
a = "hello"
b = a.dup
def a.to_s
  "The value is '#{self}'"
end
def a.twoTimes
  self + self
end

a.to_s       »  "The value is 'hello'"
a.twoTimes   »  "hellohello"
b.to_s       »  "hello"
The effect is the same in both cases: a class is added to the object
"a". This gives us a strong hint about the Ruby implementation: a
singleton class is created and inserted as a's direct class. a's
original class, String, is made this singleton's superclass. The
before and after pictures are shown in Figure 19.3 on page 242.
Ruby performs a slight optimization with these singleton classes. If
an object's klass reference already points to a singleton class, a new
one will not be created. This means that the first of the two method
definitions in the previous example will create a singleton class, but
the second will simply add a method to it.
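
The resulting structure can be checked from Ruby. This is a sketch that assumes a Ruby version providing Object#singleton_class. Note that calling singleton_class creates the class if it does not exist yet, so the check only confirms that both methods end up in one singleton class whose superclass is String; it cannot show the moment of creation.

```ruby
a = String.new("hello")   # an unfrozen string, so singleton methods are allowed
b = a.dup

def a.to_s
  "The value is '#{self}'"
end
first = a.singleton_class

def a.twoTimes
  self + self
end

first.equal?(a.singleton_class)  # => true: the same class is reused
a.singleton_class.superclass     # => String
a.singleton_methods.sort         # => [:to_s, :twoTimes]
b.singleton_methods              # => []
```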
When a class includes a module, that module's instance methods become
available as instance methods of the class. It's almost as if the
module becomes a superclass of the class that uses it. Not
surprisingly, that's about how it works. When you include a module,
Ruby creates an anonymous proxy class that references that module, and
inserts that proxy as the direct superclass of the class that did the
including. The proxy class contains references to the instance
variables and methods of the module. This is important: the same
module may be included in many different classes, and will appear in
many different inheritance chains. However, thanks to the proxy
class,
there is still only one underlying module: change a method definition
in that module, and it will change in all classes that include that
module, both past and future.
module SillyModule
  def hello
    "Hello."
  end
end
class SillyClass
  include SillyModule
end
s = SillyClass.new
s.hello   »  "Hello."
module SillyModule
  def hello
    "Hi, there!"
  end
end
s.hello   »  "Hi, there!"
The relationship between classes and the modules they include is shown
in Figure 19.4 on page 243. If multiple modules are included, they
are added to the chain in order.
If a module itself includes other modules, a chain of proxy classes
will be added to any class that includes that module, one proxy for
each module that is directly or indirectly included.
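
Module#ancestors makes the resulting chain visible. In this sketch the module and class names (Greeting, Farewell, Polite, Chatty, Butler) are hypothetical; it shows that the most recently included module is searched first and that a module's own includes follow it into the chain.

```ruby
module Greeting
  def hello
    "Hello."
  end
end

module Farewell
  def bye
    "Bye."
  end
end

class Chatty
  include Greeting
  include Farewell
end

# Each include places its proxy directly above the class, so the
# later include ends up nearer to Chatty.
Chatty.ancestors.first(3)   # => [Chatty, Farewell, Greeting]
Chatty.superclass           # => Object (proxies are skipped here)

module Polite
  include Greeting
end

class Butler
  include Polite
end

Butler.ancestors.first(3)   # => [Butler, Polite, Greeting]
```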
Just as you can define an anonymous class for an object using
"class <<obj", you can mix a module into an object using
Object#extend. For example:
module Humor
  def tickle
    "hee, hee!"
  end
end

a = "Grouchy"
a.extend Humor
a.tickle   »  "hee, hee!"
There is an interesting trick with extend. If you use it within a
class definition, the module's methods become class methods.
module Humor
  def tickle
    "hee, hee!"
  end
end

class Grouchy
  include Humor
  extend  Humor
end

Grouchy.tickle   »  "hee, hee!"
a = Grouchy.new
a.tickle         »  "hee, hee!"
This is because calling extend is equivalent to self.extend, so the
methods are added to self, which in a class definition is the class
itself.
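
That extend affects only the one object can be confirmed by asking where the module ended up. A small sketch, assuming a Ruby version that provides Object#singleton_class:

```ruby
module Humor
  def tickle
    "hee, hee!"
  end
end

a = String.new("Grouchy")
a.extend Humor

a.tickle                                  # => "hee, hee!"
a.singleton_class.include?(Humor)         # => true: mixed into a's own class
String.include?(Humor)                    # => false: String itself is untouched
String.new("Other").respond_to?(:tickle)  # => false
```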
Having exhausted the combinations of classes and objects, we can
(thankfully) get back to programming by looking at the nuts and bolts
of class and module definitions.
In languages such as C++ and Java, class definitions are processed at
compile time: the compiler loads up symbol tables, works out how much
storage to allocate, constructs dispatch tables, and does all those other
obscure things we'd rather not think too hard about.
Ruby is different. In Ruby, class and module definitions are executable
code. Although parsed at compile time, the classes and modules are
created at runtime, when the definition is encountered. (The same is
also true of method definitions.) This allows you to structure your
programs far more dynamically than in most conventional languages.
You can make decisions once, when the class is being
defined, rather than each time that objects of the class are
used. The class in the following example decides as it is being
defined what version of a decryption routine to create.
class MediaPlayer
  include Tracing if $DEBUGGING
  if ::EXPORT_VERSION
    def decrypt(stream)
      raise "Decryption not available"
    end
  else
    def decrypt(stream)
      # ...
    end
  end
end
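
The MediaPlayer listing depends on names defined elsewhere (Tracing, $DEBUGGING, EXPORT_VERSION). A self-contained variation on the same idea might look like the sketch below; FAST_MODE and Report are hypothetical names invented for this illustration.

```ruby
# FAST_MODE is a hypothetical switch, standing in for EXPORT_VERSION.
FAST_MODE = true

class Report
  # This "if" runs once, while the class body is being executed.
  if FAST_MODE
    def render
      "summary"
    end
  else
    def render
      "full report"
    end
  end
end

Report.new.render               # => "summary"
Report.instance_methods(false)  # => [:render]
```

Only one of the two render definitions ever exists, so no per-call test of FAST_MODE is needed.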
If class definitions are executable code, this implies that they
execute in the context of some object: self must reference
something. Let's find out what it is.
class Test
  puts "Type of self = #{self.class}"
  puts "Name of self = #{self.name}"
end
produces:
Type of self = Class
Name of self = Test
This means that a class definition is executed with that class as the
current object. Referring back to the section about metaclasses
on page 238, we can see that this means that methods in
the metaclass and its superclasses will be available during the
execution of the class definition. We can check this out.
class Test
  def Test.sayHello
    puts "Hello from #{name}"
  end
  sayHello
end
produces:
Hello from Test
In this example we define a class method, Test.sayHello, and then
call it in the body of the class definition. Within sayHello, we call
name, an instance method of class Module. Because Module is an
ancestor of Class, its instance methods can be called without an
explicit receiver within a class definition.
In fact, many of the directives that you use when defining a class
or module, things such as alias_method, attr, and public, are simply
methods in class Module. This opens up some interesting
possibilities: you can extend the functionality of class and module
definitions by writing Ruby code. Let's look at a couple of examples.
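
Before writing new directives, it is easy to confirm that the built-in ones really are methods of Module. The sketch below is illustrative; the class Sample is a hypothetical name.

```ruby
# Ask Ruby which class owns each directive.
%i[alias_method attr public].each do |directive|
  owner = Module.instance_method(directive).owner
  puts "#{directive} is defined in #{owner}"
end

class Sample
  # Because a directive is an ordinary method, it can also be invoked
  # dynamically; send works even where the method is private.
  send(:attr_accessor, :title)
end

s = Sample.new
s.title = "Classes and Objects"
s.title   # => "Classes and Objects"
```

Each line printed by the loop names Module as the owner.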
As a first example, let's look at adding a basic documentation
facility to modules and classes.
This would allow us to associate a
string with modules and classes that we write, a string that is
accessible as the program is running. We'll choose a simple syntax.
class Example
  doc "This is a sample documentation string"
  # .. rest of class
end
We need to make doc available to any module or class, so we need to
make it an instance method of class Module.
class Module
  @@docs = Hash.new(nil)
  # Instance method: records a documentation string for this module.
  def doc(str)
    @@docs[self.name] = str
  end
  # Class method: looks up the documentation for a module or a name.
  def Module::doc(aClass)
    # If we're passed a class or module, convert to string
    # ('<=' for classes checks for same class or subtype)
    aClass = aClass.name if aClass.class <= Module
    @@docs[aClass] || "No documentation for #{aClass}"
  end
end
class Example
  doc "This is a sample documentation string"
  # .. rest of class
end
module Another
  doc <<-edoc
And this is a documentation string
in a module
  edoc
  # rest of module
end
puts Module::doc(Example)
puts Module::doc("Another")
produces:
This is a sample documentation string
And this is a documentation string
in a module
The second example is a performance enhancement based on Tadayoshi
Funaba's date module (described beginning on page 439). Say we
have a class that represents some underlying quantity (in this case, a
date). The class may have many attributes that present the same
underlying date in different ways: as a Julian day number, as a
string, as a [year, month, day] triple, and so on. Each value
represents the same date and may involve a fairly complex calculation
to derive. We therefore would like to calculate each attribute only
once, when it is first accessed.
The manual way would be to add a test to each accessor:
class ExampleDate
  def initialize(dayNumber)
    @dayNumber = dayNumber
  end
  def asDayNumber
    @dayNumber
  end
  def asString
    unless @string
      # complex calculation
      @string = result
    end
    @string
  end
  def asYMD
    unless @ymd
      # another calculation
      @ymd = [ y, m, d ]
    end
    @ymd
  end
  # ...
end
This is a clunky technique---let's see if we can come up with
something sexier.
What we're aiming for is a directive that indicates that the body of a
particular method should be invoked only once. The value returned by
that first call should be cached. Thereafter, calling that same method
should return the cached value without reevaluating the method
body. This is similar to Eiffel's once modifier for routines.
We'd like to be able to write something like:
class ExampleDate
  def asDayNumber
    @dayNumber
  end
  def asString
    # complex calculation
  end
  def asYMD
    # another calculation
    [ y, m, d ]
  end
  once :asString, :asYMD
end
We can use once as a directive by writing it as a class method of
ExampleDate, but what should it look like internally? The trick is to
have it rewrite the methods whose names it is passed. For each method,
it creates an alias for the original code, then creates a new method
with the same name. This new method does two things. First, it invokes
the original method (using the alias) and stores the resulting value
in an instance variable. Second, it redefines itself, so that on
subsequent calls it simply returns the value of the instance variable
directly. Here's Tadayoshi Funaba's code, slightly reformatted.
def ExampleDate.once(*ids)
  for id in ids
    module_eval <<-"end_eval"
      alias_method :__#{id.to_i}__, #{id.inspect}
      def #{id.id2name}(*args, &block)
        def self.#{id.id2name}(*args, &block)
          @__#{id.to_i}__
        end
        @__#{id.to_i}__ = __#{id.to_i}__(*args, &block)
      end
    end_eval
  end
end
This code uses module_eval to execute a block of code in the context
of the calling module (or, in this case, the calling class). The
original method is renamed __nnn__, where the nnn part is the integer
representation of the method name's symbol id. The code uses the same
name for the caching instance variable. The bulk of the code is a
method that dynamically redefines itself. Note that this redefinition
uses the fact that methods may contain nested singleton method
definitions, a clever trick.
Understand this code, and you'll be well on the way to true Ruby mastery.
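
The listing above relies on Symbol#to_i, which was removed in Ruby 1.9, so it will not run unchanged on current interpreters. The following sketch gets the same caching effect with define_method. It is an alternative technique, not the date library's implementation, and the names Once, CachedDate, as_string, and calls are hypothetical.

```ruby
# A sketch of a "once" directive built on define_method.
module Once
  def once(*names)
    names.each do |name|
      original = instance_method(name)   # keep the uncached version
      ivar = "@__once_#{name}"           # per-object cache slot

      define_method(name) do |*args, &block|
        if instance_variable_defined?(ivar)
          instance_variable_get(ivar)
        else
          value = original.bind(self).call(*args, &block)
          instance_variable_set(ivar, value)
        end
      end
    end
  end
end

class CachedDate
  extend Once          # makes once available as a class-level directive
  attr_reader :calls   # counts how often the real calculation runs

  def initialize(day_number)
    @day_number = day_number
    @calls = 0
  end

  def as_string
    @calls += 1        # stands in for a complex calculation
    "day #{@day_number}"
  end

  once :as_string
end

d = CachedDate.new(5)
d.as_string   # => "day 5", computed
d.as_string   # => "day 5", served from the cache
d.calls       # => 1
```

Like the original, this caches a single value per object regardless of the arguments passed. It would also need a different naming scheme for methods whose names end in ? or !, since those characters are not valid in instance variable names.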
However, we can take it further. Look in the date module, and you'll
see method once written slightly differently.
class Date
  class << self
    def once(*ids)
      # ...
    end
  end
  # ...
end
The interesting thing here is the inner class definition,
"class << self". This defines a class based on the object self, and
self happens to be the class object for Date. The result? Every
method within the inner class definition is automatically a class
method of Date.
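
The same notation works in any class. In this short sketch the class Temperature and its methods are hypothetical names used only to show the effect.

```ruby
class Temperature
  class << self
    # Defined inside "class << self", so this is a class method.
    def from_fahrenheit(degrees)
      new((degrees - 32) * 5.0 / 9.0)
    end
  end

  attr_reader :celsius

  def initialize(celsius)
    @celsius = celsius
  end
end

Temperature.from_fahrenheit(212).celsius  # => 100.0
Temperature.singleton_methods(false)      # => [:from_fahrenheit]
```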
The once feature is generally applicable: it should work for any
class. If you took once and made it a private instance method of class
Module, it would be available for use in any Ruby class.
We've said that when you invoke a class method, all you're doing is
sending a message to the Class object itself. When you say something
such as String.new("gumby"), you're sending the message new to the
object that is class String. But how does Ruby know to do this? After
all, the receiver of a message should be an object reference, which
implies that there must be a constant called "String" somewhere
containing a reference to the String object. [It will be a constant,
not a variable, because "String" starts with an uppercase letter.]
And in fact, that's exactly what happens. All the built-in classes,
along with the classes you define, have a corresponding global
constant with the same name as the class.
This is both straightforward and subtle. The subtlety comes from the
fact that there are actually two things named (for example) String in
the system. There's a constant that references the class object
String, and there's the class object itself.
The fact that class names are just constants means that you can treat
classes just like any other Ruby object: you can copy them, pass them
to methods, and use them in expressions.
def factory(klass, *args)
  klass.new(*args)
end

factory(String, "Hello")   »  "Hello"
factory(Dir,    ".")       »  #<Dir:0x401b51bc>

flag = true
(flag ? Array : Hash)[1, 2, 3, 4]   »  [1, 2, 3, 4]
flag = false
(flag ? Array : Hash)[1, 2, 3, 4]   »  {1=>2, 3=>4}
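
The split between a constant and the class object it refers to can be made visible by giving one class object a second name. In this sketch the constants Text and Nameless are hypothetical, introduced only for the demonstration.

```ruby
Text = String                             # a second constant, same class object
Text.equal?(String)                       # => true
Text.new("gumby")                         # => "gumby"
Text.name                                 # => "String"
Object.const_get(:String).equal?(String)  # => true: an ordinary constant lookup

anonymous = Class.new   # a class object that no constant refers to yet
anonymous.name          # => nil
Nameless = anonymous    # assigning it to a constant gives it a name
anonymous.name          # => "Nameless"
```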
Many times in this book we've claimed that everything in Ruby is an
object. However, there's one thing that we've used time and time again
that appears to contradict this: the top-level Ruby execution
environment. Take a one-line program such as puts "Hello, World".
Not an object in sight. We may as well be writing some variant of
Fortran or QW-Basic. But dig deeper, and you'll come across objects
and classes lurking in even the simplest code.
We know that the literal "Hello, World" generates a Ruby String, so
there's one object. We also know that the bare method call to puts is
effectively the same as self.puts. But what is "self"?
At the top level, we're executing code in the context of some
predefined object. When we define methods, we're actually creating
private instance methods of class Object, which is why they can be
called from anywhere in function form. Instance variables belong to
the top-level object. And because this object is an instance of
Object, we can use all of Object's methods (including those mixed in
from Kernel) in function form. This explains why we can call Kernel
methods such as puts at the top level (and indeed throughout Ruby):
these methods are part of every object.
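
The predefined object can be examined like any other. This sketch assumes a Ruby version that provides Binding#receiver (2.2 or later) and uses TOPLEVEL_BINDING so that it reaches the top-level object no matter where the code is run from.

```ruby
main = TOPLEVEL_BINDING.receiver       # the predefined top-level object

main.inspect                           # => "main"
main.class                             # => Object
Object.include?(Kernel)                # => true: Kernel is mixed into Object
Kernel.private_method_defined?(:puts)  # => true: hence the function form

# A bare puts is a private call on self; send makes the receiver explicit.
main.send(:puts, "Hello, World")
```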
There's one last wrinkle to class inheritance, and it's fairly
obscure.
Within a class definition, you can change the visibility of a method
in an ancestor class. For example, you can do something like:
class Base
  def aMethod
    puts "Got here"
  end
  private :aMethod
end
class Derived1 < Base
  public :aMethod
end
class Derived2 < Base
end
In this example, you would be able to invoke aMethod in instances of
class Derived1, but not via instances of Base or Derived2.
So how does Ruby pull off this feat of having one method
with two different visibilities? Simply put, it cheats.
If a subclass changes the visibility of a method in a parent, Ruby
effectively inserts a hidden proxy method in the subclass that invokes
the original method using super. It then sets the visibility of that
proxy to whatever you requested. This means that the code:
class Derived1 < Base
  public :aMethod
end
is effectively the same as:
class Derived1 < Base
  def aMethod(*args)
    super
  end
  public :aMethod
end
The call to super can access the parent's method regardless of its
visibility, so the rewrite allows the subclass to override its
parent's visibility rules. Pretty scary, eh?
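
The three visibilities can be checked directly. In this sketch aMethod returns its string instead of printing it, a small change made so the result is easy to compare.

```ruby
class Base
  def aMethod
    "Got here"
  end
  private :aMethod
end

class Derived1 < Base
  public :aMethod    # same method, new visibility in this subclass only
end

class Derived2 < Base
end

Derived1.new.aMethod                       # => "Got here"
Derived1.public_method_defined?(:aMethod)  # => true
Base.private_method_defined?(:aMethod)     # => true

outcome = begin
  Derived2.new.aMethod                     # still private here
rescue NoMethodError => e
  e.class
end
outcome                                    # => NoMethodError
```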
There are times when you've worked hard to make your object exactly
right, and you'll be damned if you'll let anyone just change
it. Perhaps you need to pass some kind of opaque object between two of
your classes via some third-party object, and you want to make sure it
arrives unmodified. Perhaps you want to use an object as a hash key, and
need to make sure that no one modifies it while it's being used.
Perhaps something is corrupting one of your objects, and you'd like
Ruby to raise an exception as soon as the change occurs.
Ruby provides a very simple mechanism to help with this. Any object
can be frozen by invoking Object#freeze. A frozen object may not be
modified: you can't change its instance variables (directly or
indirectly), you can't associate singleton methods with it, and, if it
is a class or module, you can't add, delete, or modify its
methods. Once frozen, an object stays frozen: there is no
Object#thaw. You can test to see if an object is frozen using
Object#frozen?.
What happens when you copy a frozen object? That depends on the method
you use. If you call an object's clone method, the entire object
state (including whether it is frozen) is copied to the new object.
On the other hand, dup typically copies only the object's contents:
the new copy will not inherit the frozen status.
str1 = "hello"
str1.freeze    »  "hello"
str1.frozen?   »  true
str2 = str1.clone
str2.frozen?   »  true
str3 = str1.dup
str3.frozen?   »  false
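
What the listing does not show is the failure itself. The sketch below attempts an in-place change on a frozen string and captures the exception; the exact exception class depends on the Ruby version, so the code rescues StandardError rather than naming one.

```ruby
str = String.new("hello").freeze

error = begin
  str << " world"        # any in-place change is rejected
  nil
rescue StandardError => e
  e
end

error.class              # FrozenError on Ruby 2.5+, a subclass of RuntimeError
str                      # => "hello", unchanged
str.clone.frozen?        # => true
str.dup.frozen?          # => false

copy = str.dup           # an unfrozen copy can be modified freely
copy << " world"         # => "hello world"
```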
Although freezing objects may initially seem like a good idea, you
might want to hold off doing it until you come across a real
need. Freezing is one of those ideas that looks essential on paper but
isn't used much in practice.
Extracted from the book "Programming Ruby - The Pragmatic Programmer's
Guide"
Copyright © 2001 by Addison Wesley Longman, Inc. This material may
be distributed only subject to the terms and conditions set forth in
the Open Publication License, v1.0 or later (the latest version is
presently available at http://www.opencontent.org/openpub/).
Distribution of substantively modified versions of this document is
prohibited without the explicit permission of the copyright holder.
Distribution of the work or derivative of the work in any standard
(paper) book form is prohibited unless prior permission is obtained
from the copyright holder.