Classes and Objects

Classes and objects are obviously central to Ruby, but at first sight they can seem a little confusing. There seem to be a lot of concepts: classes, objects, class objects, instance methods, class methods, and singleton classes. In reality, however, Ruby has just a single underlying class and object structure, which we'll discuss in this chapter. In fact, the basic model is so simple, we can describe it in a single paragraph.

A Ruby object has three components: a set of flags, some instance variables, and an associated class. A Ruby class is an object of class Class, which contains all the object things plus a list of methods and a reference to a superclass (which is itself another class). All method calls in Ruby nominate a receiver (which is by default self, the current object). Ruby finds the method to invoke by looking at the list of methods in the receiver's class. If it doesn't find the method there, it looks in the superclass, and then in the superclass's superclass, and so on. If the method cannot be found in the receiver's class or any of its ancestors, Ruby invokes the method method_missing on the original receiver.

“But wait,” you cry, “I spent good money on this chapter. What about all this other stuff—singleton classes, class methods, and so on. How do they work?”

How Classes and Objects Interact

All class/object interactions are explained using the simple model given above: objects reference classes, and classes reference zero or more superclasses. However, the implementation details can get a tad tricky.

We've found that the simplest way of visualizing all this is to draw the actual objects that Ruby implements. So, in the following pages we'll look at all the possible combinations of classes and objects. Note that these are not class diagrams in the UML sense; we're showing structures in memory and pointers between them.

Your Basic, Everyday Object

Let's start by looking at an object created from a simple class. Figure 19.1 shows an object referenced by a variable, lucille, the object's class, Guitar, and that class's superclass, Object. Notice how the object's class reference (called klass for historical reasons that really bug Andy) points to the class object, and how the super pointer from that class references the parent class.

If we invoke the method lucille.play(), Ruby goes to the receiver, lucille, and follows the klass reference to the class object for Guitar. It searches the method table, finds play, and invokes it.

If instead we call lucille.display(), Ruby starts off the same way, but cannot find display in the method table in the class Guitar. It then follows the super reference to Guitar's superclass, Object, where it finds and executes the method.
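The lookup just described can be observed from Ruby itself. The following is a minimal sketch, assuming a reasonably recent Ruby; the body of play and the method name smash are made up for illustration.

```ruby
class Guitar
  def play
    "strum"
  end
end

lucille = Guitar.new

lucille.class       # => Guitar  (the klass reference)
Guitar.superclass   # => Object  (the super reference)

# play lives in Guitar's own method table...
Guitar.instance_methods(false)                      # => [:play]
# ...whereas display is only found further up the chain.
Guitar.method_defined?(:display)                    # => true
Guitar.instance_methods(false).include?(:display)   # => false

# When the search fails everywhere, method_missing runs; the default
# implementation raises NoMethodError.
begin
  lucille.smash
rescue NoMethodError => e
  missing = e.name   # => :smash
end
```

Passing false to instance_methods restricts the answer to the class's own method table, which is what makes the difference between play and display visible.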

What's the Meta?

Astute readers (yup, that's all of you) will have noticed that the klass members of Class objects point to nothing meaningful in Figure 19.1. We now have all the information we need to work out what they should reference.

When you say lucille.play(), Ruby follows lucille's klass pointer to find a class object in which to search for methods. So what happens when you invoke a class method, such as Guitar.strings(...)? Here the receiver is the class object itself, Guitar. So, to be consistent, we need to stick the methods in some other class, referenced from Guitar's klass pointer. This new class will contain all of Guitar's class methods. It's called a metaclass. We'll denote the metaclass of Guitar as Guitar′. But that's not the whole story. Because Guitar is a subclass of Object, its metaclass Guitar′ will be a subclass of Object's metaclass, Object′. In Figure 19.2, we show these additional metaclasses.

When Ruby executes Guitar.strings(), it follows the same process as before: it goes to the receiver, class Guitar, follows the klass reference to class Guitar′, and finds the method.

Finally, note that an “S” has crept into the flags in class Guitar′. The classes that Ruby creates automatically are marked internally as singleton classes. Singleton classes are treated slightly differently within Ruby. The most obvious difference from the outside is that they are effectively invisible: they will never appear in a list of objects returned from methods such as Module#ancestors or ObjectSpace::each_object.
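Later Ruby versions expose the metaclass directly through Object#singleton_class, which makes the picture easy to check. This is a sketch under that assumption (singleton_class? needs Ruby 2.1 or later); the return value of strings is invented.

```ruby
class Guitar
  def self.strings
    6
  end
end

meta = Guitar.singleton_class   # the class the text calls Guitar′

Guitar.singleton_methods(false)             # => [:strings]
meta.method_defined?(:strings)              # => true
meta.superclass == Object.singleton_class   # => true
meta.singleton_class?                       # => true

# The metaclass stays invisible in the ordinary ancestor list.
Guitar.ancestors.include?(meta)             # => false
```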

Object-Specific Classes

Ruby allows you to create a class tied to a particular object. In the following example, we create two String objects. We then associate an anonymous class with one of them, overriding one of the methods in the object's base class and adding a new method.

a = "hello"
b = a.dup

class <<a
  def to_s
    "The value is '#{self}'"
  end
  def twoTimes
    self + self
  end
end

a.to_s → "The value is 'hello'"
a.twoTimes → "hellohello"
b.to_s → "hello"

This example uses the “class <<obj” notation, which basically says “build me a new class just for object obj.” We could also have written it as:

a = "hello"
b = a.dup
def a.to_s
  "The value is '#{self}'"
end
def a.twoTimes
  self + self
end

a.to_s → "The value is 'hello'"
a.twoTimes → "hellohello"
b.to_s → "hello"

The effect is the same in both cases: a class is added to the object “a”. This gives us a strong hint about the Ruby implementation: a singleton class is created and inserted as a's direct class. a's original class, String, is made this singleton's superclass. The before and after pictures are shown in Figure 19.3.

Ruby performs a slight optimization with these singleton classes. If an object's klass reference already points to a singleton class, a new one will not be created. This means that the first of the two method definitions in the previous example will create a singleton class, but the second will simply add a method to it.
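The reuse of a single singleton class can be checked with a short sketch. It assumes a Ruby that provides Object#singleton_class; the method names shout and twice are hypothetical, and String.new is used only to guarantee an unfrozen string.

```ruby
a = String.new("hello")
b = a.dup

def a.shout
  upcase + "!"
end
first = a.singleton_class    # the class created for a

def a.twice
  self + self
end
second = a.singleton_class   # still the same class

first.equal?(second)         # => true
first.superclass             # => String
a.singleton_methods.sort     # => [:shout, :twice]
b.singleton_methods          # => []
```

Both methods end up in the one class that sits between a and String, and b, which still points straight at String, sees neither.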

Mixin Modules

When a class includes a module, that module's instance methods become available as instance methods of the class. It's almost as if the module becomes a superclass of the class that uses it. Not surprisingly, that's about how it works. When you include a module, Ruby creates an anonymous proxy class that references that module, and inserts that proxy as the direct superclass of the class that did the including. The proxy class contains references to the instance variables and methods of the module. This is important: the same module may be included in many different classes, and will appear in many different inheritance chains. However, thanks to the proxy class, there is still only one underlying module: change a method definition in that module, and it will change in all classes that include that module, both past and future.

module SillyModule
  def hello
    "Hello."
  end
end
class SillyClass
  include SillyModule
end
s = SillyClass.new
s.hello → "Hello."


module SillyModule
  def hello
    "Hi, there!"
  end
end
s.hello → "Hi, there!"

The relationship between classes and the modules they include is shown in Figure 19.4. If multiple modules are included, they are added to the chain in order.

If a module itself includes other modules, a chain of proxy classes will be added to any class that includes that module, one proxy for each module that is directly or indirectly included.
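The proxy classes themselves are hidden, but Module#ancestors shows where each included module sits in the chain. The sketch below uses hypothetical module and class names (Greeting, Polite, Greeter) and only illustrates the ordering.

```ruby
module Greeting
  def hello
    "Hello."
  end
end

module Polite
  include Greeting   # a module that itself includes another module

  def goodbye
    "Goodbye."
  end
end

class Greeter
  include Polite
end

# One entry per directly or indirectly included module.
Greeter.ancestors.take(3)   # => [Greeter, Polite, Greeting]

# The proxies do not show up as the superclass.
Greeter.superclass          # => Object

g = Greeter.new
g.hello                     # => "Hello."
g.goodbye                   # => "Goodbye."
```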

Extending Objects

Just as you can define an anonymous class for an object using “class <<obj”, you can mix a module into an object using Object#extend. For example:

module Humor
  def tickle
    "hee, hee!"
  end
end

a = "Grouchy"
a.extend Humor
a.tickle → "hee, hee!"

There is an interesting trick with extend. If you use it within a class definition, the module's methods become class methods.

module Humor
  def tickle
    "hee, hee!"
  end
end

class Grouchy
  include Humor
  extend  Humor
end

Grouchy.tickle → "hee, hee!"
a = Grouchy.new
a.tickle → "hee, hee!"

This is because calling extend is equivalent to self.extend, so the methods are added to self, which in a class definition is the class itself.
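One way to picture extend is as an include performed on the receiver's singleton class. The following sketch checks that view, assuming a Ruby with Object#singleton_class; String.new is used only to get an unfrozen string.

```ruby
module Humor
  def tickle
    "hee, hee!"
  end
end

a = String.new("Grouchy")
a.extend Humor

a.singleton_class.include?(Humor)   # => true
String.include?(Humor)              # => false, other strings are unaffected

class Grouchy
  extend Humor   # self is Grouchy here, so this is Grouchy.extend(Humor)
end

Grouchy.singleton_class.include?(Humor)   # => true
Grouchy.include?(Humor)                   # => false
Grouchy.tickle                            # => "hee, hee!"
```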

Class and Module Definitions

Having exhausted the combinations of classes and objects, we can (thankfully) get back to programming by looking at the nuts and bolts of class and module definitions.

In languages such as C++ and Java, class definitions are processed at compile time: the compiler loads up symbol tables, works out how much storage to allocate, constructs dispatch tables, and does all those other obscure things we'd rather not think too hard about.

Ruby is different. In Ruby, class and module definitions are executable code. Although parsed at compile time, the classes and modules are created at runtime, when the definition is encountered. (The same is also true of method definitions.) This allows you to structure your programs far more dynamically than in most conventional languages. You can make decisions once, when the class is being defined, rather than each time that objects of the class are used. The class in the following example decides as it is being defined what version of a decryption routine to create.

class MediaPlayer
  include Tracing if $DEBUGGING

  if ::EXPORT_VERSION
    def decrypt(stream)
      raise "Decryption not available"
    end
  else
    def decrypt(stream)
      # ...
    end
  end

end
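The example above depends on a Tracing module and an EXPORT_VERSION constant that are defined elsewhere. A self-contained variation, with a hypothetical EXPORT_BUILD flag and a placeholder method body, might look like this:

```ruby
EXPORT_BUILD = true   # hypothetical flag standing in for ::EXPORT_VERSION

class MediaPlayer
  # This if statement runs once, while the class is being defined.
  if EXPORT_BUILD
    def decrypt(stream)
      raise "Decryption not available"
    end
  else
    def decrypt(stream)
      stream.reverse   # placeholder body, for illustration only
    end
  end
end

player = MediaPlayer.new
begin
  player.decrypt("abc")
rescue RuntimeError => e
  message = e.message   # => "Decryption not available"
end
```

Only one of the two decrypt definitions is ever executed, so the class ends up with exactly one version of the method and no test is made at call time.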

If class definitions are executable code, this implies that they execute in the context of some object: self must reference something. Let's find out what it is.

class Test
  puts "Type of self = #{self.class}"
  puts "Name of self = #{self.name}"
end

produces:

Type of self = Class
Name of self = Test

This means that a class definition is executed with that class as the current object. Referring back to the section about metaclasses, we can see that this means that methods in the metaclass and its superclasses will be available during the execution of the class definition. We can check this out.

class Test
  def Test.sayHello
    puts "Hello from #{name}"
  end

  sayHello
end

In this example we define a class method, Test.sayHello, and then call it in the body of the class definition. Within sayHello, we call name, an instance method of class Module. Because Module is an ancestor of Class, its instance methods can be called without an explicit receiver within a class definition.
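The same points can be verified without printing anything. In this sketch the class name Probe and its constants are hypothetical; they simply capture values while the class body runs.

```ruby
class Probe
  SELF_IN_BODY = self    # self while the class body executes

  def self.say_hello
    "Hello from #{name}"
  end

  GREETING = say_hello   # implicit receiver is self, that is, Probe
end

Probe::SELF_IN_BODY.equal?(Probe)   # => true
Probe::GREETING                     # => "Hello from Probe"

# name is found because Module is an ancestor of Class.
Class.ancestors.include?(Module)    # => true
Probe.method(:name).owner           # => Module
```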

In fact, many of the directives that you use when defining a class or module, things such as alias_method, attr, and public, are simply methods in class Module. This opens up some interesting possibilities—you can extend the functionality of class and module definitions by writing Ruby code. Let's look at a couple of examples.

As a first example, let's look at adding a basic documentation facility to modules and classes. This would allow us to associate a string with modules and classes that we write, a string that is accessible as the program is running. We'll choose a simple syntax.

class Example
  doc "This is a sample documentation string"
  # .. rest of class
end

We need to make doc available to any module or class, so we need to make it an instance method of class Module.

class Module
  @@docs = Hash.new(nil)
  def doc(str)
    @@docs[self.name] = str
  end

  def Module::doc(aClass)
    # If we're passed a class or module, convert to string
    # ('<=' for classes checks for same class or subtype)
    aClass = aClass.name if aClass.class <= Module
    @@docs[aClass] || "No documentation for #{aClass}"
  end
end

class Example
  doc "This is a sample documentation string"
  # .. rest of class
end

module Another
  doc <<-edoc
    And this is a documentation string
    in a module
  edoc
  # rest of module
end

puts Module::doc(Example)
puts Module::doc("Another")

produces:

This is a sample documentation string
    And this is a documentation string
    in a module

The second example is a performance enhancement based on Tadayoshi Funaba's Date module. Say we have a class that represents some underlying quantity (in this case, a date). The class may have many attributes that present the same underlying date in different ways: as a Julian day number, as a string, as a [year, month, day] triple, and so on. Each value represents the same date and may involve a fairly complex calculation to derive. We therefore would like to calculate each attribute only once, when it is first accessed.

class ExampleDate
  def initialize(dayNumber)
    @dayNumber = dayNumber
  end

  def asDayNumber
    @dayNumber
  end

  def asString
    unless @string
      # complex calculation
      @string = result
    end
    @string
  end

  def asYMD
    unless @ymd
      # another calculation
      @ymd = [ y, m, d ]
    end
    @ymd
  end
  # ...
end

What we're aiming for is a directive that indicates that the body of a particular method should be invoked only once. The value returned by that first call should be cached. Thereafter, calling that same method should return the cached value without reevaluating the method body again. This is similar to Eiffel's once modifier for routines. We'd like to be able to write something like:

class ExampleDate
  def asDayNumber
    @dayNumber
  end

  def asString
    # complex calculation
  end

  def asYMD
    # another calculation
    [ y, m, d ]
  end

  once :asString, :asYMD
end

We can use once as a directive by writing it as a class method of ExampleDate, but what should it look like internally? The trick is to have it rewrite the methods whose names it is passed. For each method, it creates an alias for the original code, then creates a new method with the same name. This new method does two things. First, it invokes the original method (using the alias) and stores the resulting value in an instance variable. Second, it redefines itself, so that on subsequent calls it simply returns the value of the instance variable directly. Here's Tadayoshi Funaba's code, slightly reformatted.

def ExampleDate.once(*ids)
  for id in ids
    module_eval <<-"end_eval"
      alias_method :__#{id.to_i}__, #{id.inspect}
      def #{id.id2name}(*args, &block)
        def self.#{id.id2name}(*args, &block)
          @__#{id.to_i}__
        end
        @__#{id.to_i}__ = __#{id.to_i}__(*args, &block)
      end
    end_eval
  end
end

This code uses module_eval to execute a block of code in the context of the calling module (or, in this case, the calling class). The original method is renamed __nnn__, where the nnn part is the integer representation of the method name's symbol id. The code uses the same name for the caching instance variable. The bulk of the code is a method that dynamically redefines itself. Note that this redefinition uses the fact that methods may contain nested singleton method definitions, a clever trick.
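The listing relies on Symbol#to_i, which later Ruby versions no longer provide. The same caching idea can be sketched with define_method and a closure. This is a different technique from the code above, not a port of it, and the Once module, the calls counter, and the instance variable naming scheme are all assumptions made for illustration.

```ruby
module Once
  # Wrap each named method so its body runs at most once per object.
  def once(*names)
    names.each do |name|
      original = instance_method(name)   # keep the uncached implementation
      ivar = "@__once_#{name}"           # assumes a name without ? or !

      define_method(name) do |*args, &block|
        unless instance_variable_defined?(ivar)
          value = original.bind(self).call(*args, &block)
          instance_variable_set(ivar, value)
        end
        instance_variable_get(ivar)
      end
    end
  end
end

class ExampleDate
  extend Once
  attr_reader :calls

  def initialize(day_number)
    @day_number = day_number
    @calls = 0
  end

  def as_string
    @calls += 1   # counts how often the expensive body really runs
    "day #{@day_number}"
  end

  once :as_string
end

date = ExampleDate.new(42)
date.as_string   # => "day 42"
date.as_string   # => "day 42", served from the cache
date.calls       # => 1
```

Unlike the self-redefining version, this wrapper checks for the cached value on every call, which keeps the code shorter at the cost of one extra test per invocation.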

However, we can take it further. Look in the date module, and you'll see method once written slightly differently.

The interesting thing here is the inner class definition, “class << self”. This defines a class based on the object self, and self happens to be the class object for Date. The result? Every method within the inner class definition is automatically a class method of Date.
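The shape being described can be sketched as follows. This is not the date module's source: Calendar is a hypothetical stand-in for Date, and the body of once is left out because only the placement of the definition matters here.

```ruby
class Calendar
  class << self
    # Defined inside class << self, so this becomes a class method.
    def once(*ids)
      ids   # real caching logic omitted in this sketch
    end
  end

  once :as_string   # usable as a directive in the class body
end

Calendar.singleton_methods(false)   # => [:once]
Calendar.respond_to?(:once)         # => true
Calendar.new.respond_to?(:once)     # => false
```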

The once feature is generally applicable—it should work for any class. If you took once and made it a private instance method of class Module, it would be available for use in any Ruby class.

Class Names Are Constants

We've said that when you invoke a class method, all you're doing is sending a message to the class object itself. When you say something such as String.new("gumby"), you're sending the message new to the object that is class String. But how does Ruby know to do this? After all, the receiver of a message should be an object reference, which implies that there must be a constant called “String” somewhere containing a reference to the String object. (It will be a constant, not a variable, because “String” starts with an uppercase letter.) And in fact, that's exactly what happens. All the built-in classes, along with the classes you define, have a corresponding global constant with the same name as the class. This is both straightforward and subtle. The subtlety comes from the fact that there are actually two things named (for example) String in the system. There's the constant String, and there's the class object that the constant references, which is itself an object of class Class.

The fact that class names are just constants means that you can treat classes just like any other Ruby object: you can copy them, pass them to methods, and use them in expressions.

def factory(klass, *args)
  klass.new(*args)
end

factory(String, "Hello") → "Hello"
factory(Dir,    ".") → #<Dir:0x401b51bc>

flag = true
(flag ? Array : Hash)[1, 2, 3, 4] → [1, 2, 3, 4]
flag = false
(flag ? Array : Hash)[1, 2, 3, 4] → {1=>2, 3=>4}
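The distinction between the constant and the class object it references shows up clearly with an anonymous class. In this sketch, text_class and Widget are hypothetical names.

```ruby
String.class                               # => Class
Object.const_get(:String).equal?(String)   # => true

text_class = String                        # copy the reference into a variable
text_class.new("gumby")                    # => "gumby"

anonymous = Class.new                      # a class object with no constant yet
anonymous.name                             # => nil

Widget = anonymous                         # the first constant assignment names it
Widget.name                                # => "Widget"
anonymous.equal?(Widget)                   # => true
```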

Top-Level Execution Environment

Many times in this book we've claimed that everything in Ruby is an object. However, there's one thing that we've used time and time again that appears to contradict this: the top-level Ruby execution environment.

puts "Hello, World"

Not an object in sight. We may as well be writing some variant of Fortran or QW-Basic. But dig deeper, and you'll come across objects and classes lurking in even the simplest code.

We know that the literal "Hello, World" generates a Ruby String, so there's one object. We also know that the bare method call to puts is effectively the same as self.puts. But what is “self”?

At the top level, we're executing code in the context of some predefined object. When we define methods, we're actually creating (private) instance methods of class Object. Instance variables belong to this object. And because we're in the context of Object, we can use all of Object's methods (including those mixed-in from Kernel) in function form. This explains why we can call Kernel methods such as puts at the top level (and indeed throughout Ruby): these methods are part of every object.
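The predefined object can be inspected directly. This sketch assumes Ruby 2.2 or later, where Binding#receiver is available; TOPLEVEL_BINDING is used so that the result does not depend on where the code happens to run.

```ruby
main = TOPLEVEL_BINDING.receiver   # the object that is self at the top level

main.to_s    # => "main"
main.class   # => Object

# puts comes from Kernel, which Object mixes in. It is private, so it is
# normally called in function form, without an explicit receiver.
Object.include?(Kernel)                 # => true
Kernel.private_method_defined?(:puts)   # => true
Object.private_method_defined?(:puts)   # => true
```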

Inheritance and Visibility

Within a class definition, you can change the visibility of a method in an ancestor class. For example, you can do something like:

class Base
  def aMethod
    puts "Got here"
  end
  private :aMethod
end

class Derived1 < Base
  public :aMethod
end

class Derived2 < Base
end

In this example, you would be able to invoke aMethod in instances of class Derived1, but not via instances of Base or Derived2.

So how does Ruby pull off this feat of having one method with two different visibilities? Simply put, it cheats.

If a subclass changes the visibility of a method in a parent, Ruby effectively inserts a hidden proxy method in the subclass that invokes the original method using super. It then sets the visibility of that proxy to whatever you requested. This means that the definition of Derived1 shown above is effectively the same as:

class Derived1 < Base
  def aMethod(*args)
    super
  end
  public :aMethod
end

The call to super can access the parent's method regardless of its visibility, so the rewrite allows the subclass to override its parent's visibility rules. Pretty scary, eh?
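The resulting visibilities can be checked class by class. The sketch below repeats the three classes with a snake_case method name and a return value in place of puts, both changes made only so the results are easy to test.

```ruby
class Base
  def a_method
    "Got here"
  end
  private :a_method
end

class Derived1 < Base
  public :a_method   # changes visibility for Derived1 only
end

class Derived2 < Base
end

Derived1.new.a_method                         # => "Got here"
Derived1.public_method_defined?(:a_method)    # => true
Base.private_method_defined?(:a_method)       # => true
Derived2.private_method_defined?(:a_method)   # => true

begin
  Derived2.new.a_method
rescue NoMethodError
  refused = true   # private methods reject an explicit receiver
end
```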

Freezing Objects

There are times when you've worked hard to make your object exactly right, and you'll be damned if you'll let anyone just change it. Perhaps you need to pass some kind of opaque object between two of your classes via some third-party object, and you want to make sure it arrives unmodified. Perhaps you want to use an object as a hash key, and need to make sure that no one modifies it while it's being used. Perhaps something is corrupting one of your objects, and you'd like Ruby to raise an exception as soon as the change occurs.

Ruby provides a very simple mechanism to help with this. Any object can be frozen by invoking Object#freeze. A frozen object may not be modified: you can't change its instance variables (directly or indirectly), you can't associate singleton methods with it, and, if it is a class or module, you can't add, delete, or modify its methods. Once frozen, an object stays frozen: there is no Object#thaw. You can test to see if an object is frozen using Object#frozen?.
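What "may not be modified" means in practice is that the offending operation raises an exception. The sketch below assumes a recent Ruby, where the exception is FrozenError, a subclass of RuntimeError; older versions raise other error classes, so the rescue clauses are kept deliberately broad. The method name shout is hypothetical.

```ruby
str = String.new("hello").freeze

begin
  str << " world"          # attempt to change the contents
rescue RuntimeError => e   # FrozenError in Ruby 2.5 and later
  mutation_error = e
end

begin
  def str.shout            # attempt to attach a singleton method
    upcase
  end
rescue StandardError => e
  singleton_error = e
end

str                        # => "hello", unchanged
str.frozen?                # => true
str.respond_to?(:shout)    # => false
```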

What happens when you copy a frozen object? That depends on the method you use. If you call an object's clone method, the entire object state (including whether it is frozen) is copied to the new object. On the other hand, dup typically copies only the object's contents—the new copy will not inherit the frozen status.

str1 = "hello"
str1.freeze → "hello"
str1.frozen? → true
str2 = str1.clone
str2.frozen? → true
str3 = str1.dup
str3.frozen? → false

Although freezing objects may initially seem like a good idea, you might want to hold off doing it until you come across a real need. Freezing is one of those ideas that looks essential on paper but isn't used much in practice.

Extracted from the book "Programming Ruby - The Pragmatic Programmer's Guide"

Copyright © 2001 by Addison Wesley Longman, Inc. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later (the latest version is presently available at http://www.opencontent.org/openpub/).

Distribution of substantively modified versions of this document is prohibited without the explicit permission of the copyright holder.

Distribution of the work or derivative of the work in any standard (paper) book form is prohibited unless prior permission is obtained from the copyright holder.