Classes and objects are obviously central to Ruby, but at first sight they can seem a little confusing. There seem to be a lot of concepts: classes, objects, class objects, instance methods, class methods, and singleton classes. In reality, however, Ruby has just a single underlying class and object structure, which we'll discuss in this chapter. In fact, the basic model is so simple, we can describe it in a single paragraph.
A Ruby object has three components: a set of flags, some instance variables, and an associated class. A Ruby class is an object of class Class, which contains all the object things plus a list of methods and a reference to a superclass (which is itself another class). All method calls in Ruby nominate a receiver (which is by default self, the current object). Ruby finds the method to invoke by looking at the list of methods in the receiver's class. If it doesn't find the method there, it looks in the superclass, and then in the superclass's superclass, and so on. If the method cannot be found in the receiver's class or any of its ancestors, Ruby invokes the method method_missing on the original receiver.
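The lookup rule in that paragraph can be sketched in a few lines. This is only an illustration of the behavior, not a picture of the interpreter's internals; Instrument and Banjo are hypothetical classes made up for the example.

```ruby
# Hypothetical classes, used only to illustrate the lookup order.
class Instrument
  def tune
    "tuning"
  end

  # Ruby calls this on the original receiver when the method is found
  # neither in the receiver's class nor in any of its ancestors.
  def method_missing(name, *args)
    "no method named #{name}"
  end
end

class Banjo < Instrument
  def play
    "twang"
  end
end

banjo = Banjo.new
banjo.play    # found in Banjo, the receiver's class
banjo.tune    # not in Banjo, so found in the superclass Instrument
banjo.smash   # found nowhere, so method_missing runs with name == :smash
```

Overriding method_missing this broadly is done here purely to make the fallback visible; real code would normally handle only the names it expects and call super for the rest.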
And that's it—the entire explanation. On to the next chapter.
“But wait,” you cry, “I spent good money on this chapter. What about all this other stuff—singleton classes, class methods, and so on. How do they work?”
Good question.
All class/object interactions are explained using the simple model given above: objects reference classes, and classes reference zero or more superclasses. However, the implementation details can get a tad tricky.
We've found that the simplest way of visualizing all this is to draw the actual objects that Ruby implements. So, in the following pages we'll look at all the possible combinations of classes and objects. Note that these are not class diagrams in the UML sense; we're showing structures in memory and pointers between them.
Let's start by looking at an object created from a simple class. Figure 19.1 shows an object referenced by a variable, lucille, the object's class, Guitar, and that class's superclass, Object. Notice how the object's class reference (called klass for historical reasons that really bug Andy) points to the class object, and how the super pointer from that class references the parent class.
If we invoke the method lucille.play(), Ruby goes to the receiver, lucille, and follows the klass reference to the class object for Guitar. It searches the method table, finds play, and invokes it.
If instead we call lucille.display(), Ruby starts off the same way, but cannot find display in the method table in the class Guitar. It then follows the super reference to Guitar's superclass, Object, where it finds and executes the method.
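The two references in the figure can be inspected from Ruby itself. The following is a small sketch, assuming a Guitar class with a single play method (the body of play is made up for the example).

```ruby
class Guitar
  def play
    "strum"   # placeholder body, assumed for illustration
  end
end

lucille = Guitar.new

lucille.class                    # Guitar: where the klass reference leads
Guitar.superclass                # Object: where the super reference leads
Guitar.instance_methods(false)   # [:play]: the methods stored in Guitar itself
lucille.respond_to?(:display)    # true: display is found further up the chain
```

Passing false to instance_methods restricts the list to methods defined directly in Guitar, which is why display does not appear there even though lucille responds to it.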
Astute readers (yup, that's all of you) will have noticed that the klass members of Class objects point to nothing meaningful in Figure 19.1. We now have all the information we need to work out what they should reference.
When you say lucille.play(), Ruby follows lucille's klass pointer to find a class object in which to search for methods. So what happens when you invoke a class method, such as Guitar.strings(...)? Here the receiver is the class object itself, Guitar. So, to be consistent, we need to stick the methods in some other class, referenced from Guitar's klass pointer. This new class will contain all of Guitar's class methods. It's called a metaclass. We'll denote the metaclass of Guitar as Guitar′. But that's not the whole story. Because Guitar is a subclass of Object, its metaclass Guitar′ will be a subclass of Object's metaclass, Object′. In Figure 19.2, we show these additional metaclasses.
When Ruby executes Guitar.strings(), it follows the same process as before: it goes to the receiver, class Guitar, follows the klass reference to class Guitar′, and finds the method.
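Current Ruby versions expose the metaclass through Object#singleton_class, so the arrangement in Figure 19.2 can be checked directly. A minimal sketch, assuming a strings class method that returns 6 and a hypothetical subclass Bass:

```ruby
class Guitar
  def self.strings
    6   # assumed return value, for illustration only
  end
end

class Bass < Guitar   # hypothetical subclass
end

meta = Guitar.singleton_class               # the class the text writes as Guitar′
meta.instance_methods(false)                # includes :strings
meta.superclass == Object.singleton_class   # true: Guitar′ is a subclass of Object′
Bass.strings                                # 6, found by following Bass′ up to Guitar′
```

The last line is the practical consequence of the parallel metaclass chain: class methods are inherited by subclasses using exactly the same lookup as instance methods.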
Finally, note that an “S” has crept into the flags in class Guitar′. The classes that Ruby creates automatically are marked internally as singleton classes. Singleton classes are treated slightly differently within Ruby. The most obvious difference from the outside is that they are effectively invisible: they will never appear in a list of objects returned from methods such as Module#ancestors or ObjectSpace::each_object.
Ruby allows you to create a class tied to a particular object. In the following example, we create two String objects. We then associate an anonymous class with one of them, overriding one of the methods in the object's base class and adding a new method.
a = "hello"
b = a.dup
class <<a
  def to_s
    "The value is '#{self}'"
  end
  def twoTimes
    self + self
  end
end
a.to_s → "The value is 'hello'"
a.twoTimes → "hellohello"
b.to_s → "hello"
This example uses the “class <<obj” notation, which basically says “build me a new class just for object obj.” We could also have written it as:
a = "hello"
b = a.dup
def a.to_s
  "The value is '#{self}'"
end
def a.twoTimes
  self + self
end
a.to_s → "The value is 'hello'"
a.twoTimes → "hellohello"
b.to_s → "hello"
The effect is the same in both cases: a class is added to the object “a”. This gives us a strong hint about the Ruby implementation: a singleton class is created and inserted as a's direct class. a's original class, String, is made this singleton's superclass. The before and after pictures are shown in Figure 19.3.
Ruby performs a slight optimization with these singleton classes. If an object's klass reference already points to a singleton class, a new one will not be created. This means that the first of the two method definitions in the previous example will create a singleton class, but the second will simply add a method to it.
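The reuse of a single singleton class can be observed with Object#singleton_class. This is a sketch only: the method names shout and twice are invented for the example, and calling singleton_class will itself create the class if it does not exist yet, so the check shows that the same class is reused rather than when it was first built.

```ruby
a = String.new("hello")
b = a.dup

def a.shout
  upcase + "!"
end
first = a.singleton_class     # the singleton class holding shout

def a.twice
  self + self
end
second = a.singleton_class    # asked for again after the second definition

first.equal?(second)          # true: one singleton class, now with two methods
a.singleton_methods.sort      # [:shout, :twice]
a.singleton_class.superclass  # String: the original class became the superclass
b.singleton_methods           # []: the copy made earlier is unaffected
```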
When a class includes a module, that module's instance methods become available as instance methods of the class. It's almost as if the module becomes a superclass of the class that uses it. Not surprisingly, that's about how it works. When you include a module, Ruby creates an anonymous proxy class that references that module, and inserts that proxy as the direct superclass of the class that did the including. The proxy class contains references to the instance variables and methods of the module. This is important: the same module may be included in many different classes, and will appear in many different inheritance chains. However, thanks to the proxy class, there is still only one underlying module: change a method definition in that module, and it will change in all classes that include that module, both past and future.
module SillyModule
  def hello
    "Hello."
  end
end
class SillyClass
  include SillyModule
end
s = SillyClass.new
s.hello → "Hello."
module SillyModule
  def hello
    "Hi, there!"
  end
end
s.hello → "Hi, there!"
The relationship between classes and the modules they include is shown in Figure 19.4. If multiple modules are included, they are added to the chain in order.
If a module itself includes other modules, a chain of proxy classes will be added to any class that includes that module, one proxy for each module that is directly or indirectly included.
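Module#ancestors lists the chain that method lookup follows, with each proxy shown under the name of the module it stands for. The sketch below uses hypothetical modules (Greeting, Farewell, Logging) to show both cases just described: several modules included in one class, and a module that itself includes another.

```ruby
module Greeting
  def hello
    "Hello."
  end
end

module Farewell
  include Greeting   # a module that itself includes another module
  def bye
    "Bye."
  end
end

module Logging
  def log
    "logging"
  end
end

class Chatty
  include Logging
  include Farewell
end

Chatty.ancestors.take(4)   # [Chatty, Farewell, Greeting, Logging]
Chatty.superclass          # Object: the proxies are skipped by superclass
Chatty.new.hello           # "Hello.", reached through Farewell's own include
```

Note the order: each include inserts its proxies directly above the class, so the module included last is the one searched first.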
Just as you can define an anonymous class for an object using “class <<obj”, you can mix a module into an object using Object#extend. For example:
module Humor
  def tickle
    "hee, hee!"
  end
end
a = "Grouchy"
a.extend Humor
a.tickle → "hee, hee!"
There is an interesting trick with extend. If you use it within a class definition, the module's methods become class methods.
module Humor
  def tickle
    "hee, hee!"
  end
end
class Grouchy
  include Humor
  extend Humor
end
Grouchy.tickle → "hee, hee!"
a = Grouchy.new
a.tickle → "hee, hee!"
This is because calling extend is equivalent to self.extend, so the methods are added to self, which in a class definition is the class itself.
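One way to see where extend puts the module is to ask the singleton class whether it includes it. A short sketch, using a hypothetical class Clown that only extends Humor (it does not include it), so the difference between the two is visible:

```ruby
module Humor
  def tickle
    "hee, hee!"
  end
end

class Clown        # hypothetical class that extends but does not include
  extend Humor
end

Clown.tickle                          # "hee, hee!": a class method
Clown.singleton_class.include?(Humor) # true: the module sits in Clown's metaclass
Clown.include?(Humor)                 # false: instances did not gain anything
Clown.new.respond_to?(:tickle)        # false

grouch = String.new("Grouchy")
grouch.extend(Humor)
grouch.singleton_class.include?(Humor) # true: same mechanism for a plain object
String.include?(Humor)                 # false: other strings are untouched
```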
Having exhausted the combinations of classes and objects, we can (thankfully) get back to programming by looking at the nuts and bolts of class and module definitions.
In languages such as C++ and Java, class definitions are processed at compile time: the compiler loads up symbol tables, works out how much storage to allocate, constructs dispatch tables, and does all those other obscure things we'd rather not think too hard about.
Ruby is different. In Ruby, class and module definitions are executable code. Although parsed at compile time, the classes and modules are created at runtime, when the definition is encountered. (The same is also true of method definitions.) This allows you to structure your programs far more dynamically than in most conventional languages. You can make decisions once, when the class is being defined, rather than each time that objects of the class are used. The class in the following example decides as it is being defined what version of a decryption routine to create.
class MediaPlayer
  include Tracing if $DEBUGGING
  if ::EXPORT_VERSION
    def decrypt(stream)
      raise "Decryption not available"
    end
  else
    def decrypt(stream)
      # ...
    end
  end
end
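The MediaPlayer example depends on names defined elsewhere (Tracing, $DEBUGGING, EXPORT_VERSION). For a version that can be run on its own, here is a minimal sketch of the same idea; STRICT_MODE and Parser are hypothetical names chosen for the illustration.

```ruby
STRICT_MODE = true   # hypothetical switch, playing the role of EXPORT_VERSION

class Parser
  # This if statement runs once, while the class body is being executed.
  # Only one of the two definitions of parse ever comes into existence.
  if STRICT_MODE
    def parse(text)
      raise ArgumentError, "empty input" if text.empty?
      text.split
    end
  else
    def parse(text)
      text.split
    end
  end
end

Parser.new.parse("a b c")   # ["a", "b", "c"]
```

The decision is made at definition time, so calls to parse pay no cost for checking the switch.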
If class definitions are executable code, this implies that they execute in the context of some object: self must reference something. Let's find out what it is.
class Test
  puts "Type of self = #{self.class}"
  puts "Name of self = #{self.name}"
end
produces:
Type of self = Class
Name of self = Test
This means that a class definition is executed with that class as the current object. Referring back to the section about metaclasses, we can see that this means that methods in the metaclass and its superclasses will be available during the execution of the class definition. We can check this out.
class Test
  def Test.sayHello
    puts "Hello from #{name}"
  end
  sayHello
end
produces:
Hello from Test
In this example we define a class method, Test.sayHello, and then call it in the body of the class definition. Within sayHello, we call name, an instance method of class Module. Because Module is an ancestor of Class, its instance methods can be called without an explicit receiver within a class definition.
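The same facts can be checked without printing anything. In this sketch, Probe is a hypothetical class that records what self and name evaluate to while its body is running.

```ruby
class Probe
  SELF_AT_DEFINITION = self   # self while the class body executes
  NAME_AT_DEFINITION = name   # implicit receiver: the class object itself
end

Probe::SELF_AT_DEFINITION.equal?(Probe)   # true
Probe::NAME_AT_DEFINITION                 # "Probe"
Class.superclass                          # Module
Class.ancestors.include?(Module)          # true
```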
In fact, many of the directives that you use when defining a class or module, things such as alias_method, attr, and public, are simply methods in class Module. This opens up some interesting possibilities: you can extend the functionality of class and module definitions by writing Ruby code. Let's look at a couple of examples.
As a first example, let's look at adding a basic documentation facility to modules and classes. This would allow us to associate a string with modules and classes that we write, a string that is accessible as the program is running. We'll choose a simple syntax.
class Example
  doc "This is a sample documentation string"
  # .. rest of class
end
We need to make doc available to any module or class, so we need to make it an instance method of class Module.
class Module
  @@docs = Hash.new(nil)
  def doc(str)
    @@docs[self.name] = str
  end
  def Module::doc(aClass)
    # If we're passed a class or module, convert to string
    # ('<=' for classes checks for same class or subtype)
    aClass = aClass.name if aClass.class <= Module
    @@docs[aClass] || "No documentation for #{aClass}"
  end
end
class Example
  doc "This is a sample documentation string"
  # .. rest of class
end
module Another
doc <<-edoc
And this is a documentation string
in a module
edoc
# rest of module
end
puts Module::doc(Example)
puts Module::doc("Another")
produces:
This is a sample documentation string
And this is a documentation string
in a module
The second example is a performance enhancement based on Tadayoshi Funaba's Date module. Say we have a class that represents some underlying quantity (in this case, a date). The class may have many attributes that present the same underlying date in different ways: as a Julian day number, as a string, as a [year, month, day] triple, and so on. Each value represents the same date and may involve a fairly complex calculation to derive. We therefore would like to calculate each attribute only once, when it is first accessed.
The manual way would be to add a test to each accessor:
class ExampleDate
  def initialize(dayNumber)
    @dayNumber = dayNumber
  end
  def asDayNumber
    @dayNumber
  end
  def asString
    unless @string
      # complex calculation
      @string = result
    end
    @string
  end
  def asYMD
    unless @ymd
      # another calculation
      @ymd = [ y, m, d ]
    end
    @ymd
  end
  # ...
end
This is a clunky technique—let's see if we can come up with something sexier.
What we're aiming for is a directive that indicates that the body of a particular method should be invoked only once. The value returned by that first call should be cached. Thereafter, calling that same method should return the cached value without reevaluating the method body. This is similar to Eiffel's once modifier for routines. We'd like to be able to write something like:
class ExampleDate
  def asDayNumber
    @dayNumber
  end
  def asString
    # complex calculation
  end
  def asYMD
    # another calculation
    [ y, m, d ]
  end
  once :asString, :asYMD
end
We can use once as a directive by writing it as a class method of ExampleDate, but what should it look like internally? The trick is to have it rewrite the methods whose names it is passed. For each method, it creates an alias for the original code, then creates a new method with the same name. This new method does two things. First, it invokes the original method (using the alias) and stores the resulting value in an instance variable. Second, it redefines itself, so that on subsequent calls it simply returns the value of the instance variable directly. Here's Tadayoshi Funaba's code, slightly reformatted.
def ExampleDate.once(*ids)
  for id in ids
    module_eval <<-"end_eval"
      alias_method :__#{id.to_i}__, #{id.inspect}
      def #{id.id2name}(*args, &block)
        def self.#{id.id2name}(*args, &block)
          @__#{id.to_i}__
        end
        @__#{id.to_i}__ = __#{id.to_i}__(*args, &block)
      end
    end_eval
  end
end
This code uses module_eval to execute a block of code in the context of the calling module (or, in this case, the calling class). The original method is renamed __nnn__, where the nnn part is the integer representation of the method name's symbol id. The code uses the same name for the caching instance variable. The bulk of the code is a method that dynamically redefines itself. Note that this redefinition uses the fact that methods may contain nested singleton method definitions, a clever trick.
Understand this code, and you'll be well on the way to true Ruby mastery.
However, we can take it further. Look in the date module, and you'll see method once written slightly differently.
class Date
  class << self
    def once(*ids)
      # ...
    end
  end
  # ...
end
The interesting thing here is the inner class definition, “class << self”. This defines a class based on the object self, and self happens to be the class object for Date. The result? Every method within the inner class definition is automatically a class method of Date.
The once feature is generally applicable: it should work for any class. If you took once and made it a private instance method of class Module, it would be available for use in any Ruby class.
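For readers on current Ruby versions, here is a sketch of a reusable once. It is not Funaba's code and swaps in a different technique: instead of evaluating a string and having the method redefine itself, it captures the original method with instance_method and wraps it with define_method, checking a per-object instance variable on every call. The module name Once, the instance variable prefix, and the snake_case method names are all assumptions made for the example. It is packaged as a module to extend rather than as a change to class Module, so only classes that ask for it are affected.

```ruby
# Hypothetical helper: a class that extends Once gains the once directive.
module Once
  def once(*names)
    names.each do |name|
      original = instance_method(name)   # the unbound original implementation
      ivar = "@__once_#{name}"           # cache slot; assumes name has no ? or !
      define_method(name) do |*args, &block|
        if instance_variable_defined?(ivar)
          instance_variable_get(ivar)    # later calls: return the cached value
        else
          # first call: run the original body and remember its result
          instance_variable_set(ivar, original.bind(self).call(*args, &block))
        end
      end
    end
  end
end

class ExampleDate
  extend Once

  attr_reader :calculations

  def initialize(day_number)
    @day_number = day_number
    @calculations = 0
  end

  def as_string
    @calculations += 1        # stands in for the complex calculation
    "day #{@day_number}"
  end

  once :as_string
end

date = ExampleDate.new(42)
date.as_string      # "day 42", computed
date.as_string      # "day 42", served from the cache
date.calculations   # 1
```

The trade-off against the self-redefining version is one extra check per call in exchange for code that involves no string evaluation.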
We've said that when you invoke a class method, all you're doing is sending a message to the Class object itself. When you say something such as String.new("gumby"), you're sending the message new to the object that is class String. But how does Ruby know to do this? After all, the receiver of a message should be an object reference, which implies that there must be a constant called “String” somewhere containing a reference to the String object. (It will be a constant, not a variable, because “String” starts with an uppercase letter.) And in fact, that's exactly what happens. All the built-in classes, along with the classes you define, have a corresponding global constant with the same name as the class. This is both straightforward and subtle. The subtlety comes from the fact that there are actually two things named (for example) String in the system. There's a constant that references the class object String (which is itself an object of class Class), and there's the object itself.
The fact that class names are just constants means that you can treat classes just like any other Ruby object: you can copy them, pass them to methods, and use them in expressions.
def factory(klass, *args)
  klass.new(*args)
end
factory(String, "Hello") → "Hello"
factory(Dir, ".") → #<Dir:0x401b51bc>
flag = true
(flag ? Array : Hash)[1, 2, 3, 4] → [1, 2, 3, 4]
flag = false
(flag ? Array : Hash)[1, 2, 3, 4] → {1=>2, 3=>4}
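The distinction between the constant and the class object it references can be made concrete. In the sketch below, Text is a hypothetical second constant pointing at the same class object, and Class.new builds a class that no constant references at all.

```ruby
Text = String        # hypothetical alias: a second constant, same class object

Text.equal?(String)                        # true: one object, two names
Text.new("gumby")                          # "gumby"
Text.name                                  # "String": the class keeps its original name
Object.const_get(:String).equal?(String)   # true: the constant lives in Object

anonymous = Class.new do   # a class object with no constant referring to it
  def hello
    "hi"
  end
end
anonymous.name        # nil
anonymous.new.hello   # "hi"
```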
Many times in this book we've claimed that everything in Ruby is an object. However, there's one thing that we've used time and time again that appears to contradict this—the top-level Ruby execution environment.
puts "Hello, World"
Not an object in sight. We may as well be writing some variant of Fortran or QW-Basic. But dig deeper, and you'll come across objects and classes lurking in even the simplest code.
We know that the literal "Hello, World" generates a Ruby String, so there's one object. We also know that the bare method call to puts is effectively the same as self.puts. But what is “self”?
self.class → Object
At the top level, we're executing code in the context of some predefined object. When we define methods, we're actually creating private methods of class Object, so they can be called in function form from anywhere. Instance variables belong to the predefined object itself. And because we're in the context of Object, we can use all of Object's methods (including those mixed-in from Kernel) in function form. This explains why we can call Kernel methods such as puts at the top level (and indeed throughout Ruby): these methods are part of every object.
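A script run on a current Ruby version can inspect its own top-level object. The sketch below assumes it is executed as a plain script (not inside irb, which treats top-level definitions a little differently); greet_from_top_level is a made-up method name.

```ruby
def greet_from_top_level
  "hello"
end

puts self          # prints "main", the name the top-level object gives itself
puts self.class    # prints "Object"

# In a script, the method defined above is recorded as a private method
# of Object, which is why it can be called in function form from anywhere.
puts Object.private_method_defined?(:greet_from_top_level)   # prints "true"
puts greet_from_top_level                                    # prints "hello"
```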
There's one last wrinkle to class inheritance, and it's fairly obscure.
Within a class definition, you can change the visibility of a method in an ancestor class. For example, you can do something like:
class Base
  def aMethod
    puts "Got here"
  end
  private :aMethod
end
class Derived1 < Base
  public :aMethod
end
class Derived2 < Base
end
In this example, you would be able to invoke aMethod in instances of class Derived1, but not via instances of Base or Derived2.
So how does Ruby pull off this feat of having one method with two different visibilities? Simply put, it cheats.
If a subclass changes the visibility of a method in a parent, Ruby effectively inserts a hidden proxy method in the subclass that invokes the original method using super. It then sets the visibility of that proxy to whatever you requested. This means that the code:
class Derived1 < Base
  public :aMethod
end
is effectively the same as:
class Derived1 < Base
  def aMethod(*args)
    super
  end
  public :aMethod
end
The call to super can access the parent's method regardless of its visibility, so the rewrite allows the subclass to override its parent's visibility rules. Pretty scary, eh?
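The visible effect of this arrangement is easy to confirm. The sketch below mirrors the Base and Derived example but uses a hypothetical method named secret that returns a string, so the result can be inspected rather than printed.

```ruby
class Base
  def secret
    "Got here"
  end
  private :secret
end

class Derived1 < Base
  public :secret      # changes visibility for Derived1 only
end

class Derived2 < Base
end

Derived1.new.secret                       # "Got here"
Base.private_method_defined?(:secret)     # true: Base itself is unchanged
Derived2.private_method_defined?(:secret) # true: siblings are unaffected

begin
  Base.new.secret     # still private when called through Base
rescue NoMethodError => e
  e.message           # mentions that a private method was called
end
```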
There are times when you've worked hard to make your object exactly right, and you'll be damned if you'll let anyone just change it. Perhaps you need to pass some kind of opaque object between two of your classes via some third-party object, and you want to make sure it arrives unmodified. Perhaps you want to use an object as a hash key, and need to make sure that no one modifies it while it's being used. Perhaps something is corrupting one of your objects, and you'd like Ruby to raise an exception as soon as the change occurs.
Ruby provides a very simple mechanism to help with this. Any object can be frozen by invoking Object#freeze. A frozen object may not be modified: you can't change its instance variables (directly or indirectly), you can't associate singleton methods with it, and, if it is a class or module, you can't add, delete, or modify its methods. Once frozen, an object stays frozen: there is no Object#thaw. You can test to see if an object is frozen using Object#frozen?.
What happens when you copy a frozen object? That depends on the method you use. If you call an object's clone method, the entire object state (including whether it is frozen) is copied to the new object. On the other hand, dup typically copies only the object's contents, so the new copy will not inherit the frozen status.
str1 = "hello"
str1.freeze → "hello"
str1.frozen? → true
str2 = str1.clone
str2.frozen? → true
str3 = str1.dup
str3.frozen? → false
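What an attempted modification looks like can be sketched as follows. GREETING and try_to_modify are hypothetical names; the rescue clause catches RuntimeError because the FrozenError raised by recent Ruby versions is a subclass of it (much older versions raised a different exception class, so treat this as an assumption about the Ruby you are running).

```ruby
GREETING = String.new("hello").freeze   # hypothetical frozen constant

# Attempts an in-place change and reports what happened.
def try_to_modify(string)
  string << " world"
  :modified
rescue RuntimeError   # FrozenError in recent Rubies is a RuntimeError
  :rejected
end

try_to_modify(GREETING)         # :rejected
try_to_modify(GREETING.clone)   # :rejected, because clone keeps the frozen flag
try_to_modify(GREETING.dup)     # :modified, because dup does not
GREETING                        # still "hello"
```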
Although freezing objects may initially seem like a good idea, you might want to hold off doing it until you come across a real need. Freezing is one of those ideas that looks essential on paper but isn't used much in practice.
Extracted from the book "Programming Ruby - The Pragmatic Programmer's Guide"
Copyright © 2001 by Addison Wesley Longman, Inc. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later (the latest version is presently available at http://www.opencontent.org/openpub/).
Distribution of substantively modified versions of this document is prohibited without the explicit permission of the copyright holder.
Distribution of the work or derivative of the work in any standard (paper) book form is prohibited unless prior permission is obtained from the copyright holder.