What happens when you call Object.new?

by Paweł Świątkowski
15 May 2015

While reading the RubyGems source code for the second or third time this week, I encountered something weird, which led me to this little investigation. Namely, in one class there was a constructor defined with def initialize, but just above it there was a static method definition, def self.new. How did that work? I soon found out.

MyClass.new obviously looks like a call to a static (class) method, though I always thought there was some magic involved that chose to call initialize instead. It turned out I was wrong. Luckily, MRI’s code is readable enough to track it down. So, let’s have a look:

VALUE
rb_class_new_instance(int argc, VALUE *argv, VALUE klass)
{
    VALUE obj;

    obj = rb_obj_alloc(klass);
    rb_obj_call_init(obj, argc, argv);

    return obj;
}

This is what the new method looks like in C. Strictly speaking it is an instance method of Class, which is why every class, Object included, responds to it. We can see some memory allocation there and then rb_obj_call_init(obj, argc, argv) which, as one could guess, calls initialize on the newly allocated object. And one would be right:

void
rb_obj_call_init(VALUE obj, int argc, VALUE *argv)
{
    PASS_PASSED_BLOCK();
    rb_funcall2(obj, idInitialize, argc, argv);
}

Even though PASS_PASSED_BLOCK is some kind of weird macro I don’t really understand (yes, my C-fu is weak), it is followed by something that clearly looks like calling idInitialize (the ID registered for the name “initialize” with REGISTER_SYMID(idInitialize, "initialize");) on obj with the given arguments.
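Judging by its name, PASS_PASSED_BLOCK most likely forwards the block given to new on to initialize. Whatever the macro does internally, the visible behaviour is easy to check from Ruby. The following is only a small sketch; the Greeter class and its attribute names are made up for illustration:

```ruby
# Hypothetical class, used only to show that a block given to .new
# is visible inside #initialize.
class Greeter
  attr_reader :greeting

  def initialize(name)
    # block_given? is true here only if the caller passed a block to .new
    @greeting = block_given? ? yield(name) : "Hello, #{name}"
  end
end

plain  = Greeter.new("Matz")
custom = Greeter.new("Matz") { |n| "Konnichiwa, #{n}" }

puts plain.greeting   # Hello, Matz
puts custom.greeting  # Konnichiwa, Matz
```

The block never appears in the call to initialize explicitly, yet it arrives there, which matches what the C code above suggests.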

For reference, this is how Rubinius implements it:

def new(*args)
  obj = allocate()

  Rubinius.asm(args, obj) do |args, obj|
    run obj
    run args
    push_block
    send_with_splat :initialize, 0, true
    # no pop here, as .asm blocks imply a pop as they're not
    # allowed to leak a stack value
  end

  obj
end
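Both implementations boil down to the same two steps: allocate, then call initialize. A rough plain-Ruby equivalent could look like the sketch below. The module HandRolledNew, the method my_new and the Point class are hypothetical names invented for this example, and keyword arguments are deliberately left out:

```ruby
# Hypothetical helper: a hand-written stand-in for Class#new.
module HandRolledNew
  def my_new(*args, &block)
    obj = allocate                        # step 1: create an uninitialized instance
    obj.send(:initialize, *args, &block)  # step 2: initialize is private, hence send
    obj                                   # return the object, not initialize's result
  end
end

class Point
  extend HandRolledNew
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end

a = Point.new(1, 2)
b = Point.my_new(1, 2)

puts b.class                   # Point
puts [a.x, a.y] == [b.x, b.y]  # true
```

Note that the return value of initialize is ignored in both the C and the Rubinius version, which is why the sketch returns obj explicitly.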

So, what can we use this knowledge for? Let’s jump back to the RubyGems source. The overridden self.new there looks like this:

class Gem::Package
  def self.new gem, security_policy = nil
    gem = if gem.is_a?(Gem::Package::Source)
            gem
          elsif gem.respond_to? :read
            Gem::Package::IOSource.new gem
          else
            Gem::Package::FileSource.new gem
          end

    return super unless Gem::Package == self
    return super unless gem.present?

    return super unless gem.start
    return super unless gem.start.include? 'MD5SUM ='

    Gem::Package::Old.new gem
  end
end

Basically, there are some checks on the argument and then, if those indicate the old gem format, a different class is instantiated and returned. Familiar? Well, yes, this sounds more or less like the Factory design pattern. And we can use it for that. Consider the following code:

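class Vehicle
  # Factory-style override: hand "car" types over to Car.
  def self.new(type)
    if type =~ /car/i
      # Strip the word "car" with the same case-insensitive match,
      # plus any leftover whitespace.
      Car.new(type.gsub(/car/i, '').strip)
    else
      super # falls through to the regular new: allocate + initialize
    end
  end

  def initialize(type)
    puts "vehicle of type: #{type}"
  end
end

class Car
  def initialize(type)
    puts "car of type: #{type}"
  end
end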

Vehicle.new("snowcat")
#=> vehicle of type: snowcat
Car.new("hatchback")
#=> car of type: hatchback
Vehicle.new("race car")
#=> car of type: race

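Pretty neat, huh? There is, however, one big caveat here: if the class you are instantiating inherits from the class being called, it inherits the overridden new as well. Car.new then runs the same check again, and whenever the argument still matches, you end up with stack level too deep, which is completely understandable. Defining self.new in the child classes so that it only calls super does not help, because super lands right back in the parent’s overridden new. One way around it is to skip the parent in the child classes and do by hand what Class#new does:

def self.new(*args, &block)
  obj = allocate
  obj.send(:initialize, *args, &block)
  obj
end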
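An alternative that keeps everything in the parent is the guard RubyGems itself uses in the snippet quoted earlier (return super unless Gem::Package == self): dispatch only when new is called on the base class. The following is a sketch of that idea applied to the Vehicle example, with Car now inheriting from Vehicle; the attr_reader is an addition made only so the result can be inspected:

```ruby
class Vehicle
  attr_reader :type

  def self.new(type)
    # Dispatch only when called on Vehicle itself; for subclasses
    # fall straight through to the regular Class#new.
    return super unless self == Vehicle
    return super unless type =~ /car/i

    Car.new(type)
  end

  def initialize(type)
    @type = type
  end
end

class Car < Vehicle
end

car = Vehicle.new("race car")

puts car.class                     # Car
puts car.type                      # race car
puts car.is_a?(Vehicle)            # true
puts Vehicle.new("snowcat").class  # Vehicle
```

Here the argument is passed on unchanged, so without the self == Vehicle check Car.new would keep matching /car/i and recurse until the stack overflows.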

This might not be the most useful trick in the world, but experimenting with it brought me closer to understanding how Ruby works.
