
Tracking Object Allocation in Ruby

Written by: Jesus Castello


Whenever you do something like MyClass.new, Ruby creates a new object, which uses a little bit of memory. But that's not the only way you are creating objects. Many actions will create objects, including strings and arrays. Even if you don't say String.new or Array.new, it's still a new object that is being created for you.
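As a quick illustration, here is a minimal sketch using only core Ruby. The method names (numbers, settings, greeting) are made up for this example. Every time an array literal, a hash literal, or an interpolated string is evaluated, you get a distinct object back:

```ruby
# Each call evaluates a literal, so each call returns a brand-new object.
def numbers
  [1, 2, 3]
end

def settings
  { retries: 3 }
end

def greeting(name)
  "Hello, #{name}"
end

first  = numbers
second = numbers

puts first == second                            # compares contents
puts first.equal?(second)                       # compares object identity
puts settings.equal?(settings)
puts greeting("Ruby").equal?(greeting("Ruby"))
```

The == check compares contents, while equal? compares identity, so the first line prints true and the rest print false.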

Because memory is not unlimited and memory allocation has an impact on performance, it's important to understand the answer to the following questions:

  • When exactly is a new object created, and why?

  • Are there tools to help you see when your code is creating new objects?

In this article, I want to answer these questions for you, starting with the ObjectSpace module, which comes equipped with a method to help you track down new object allocations.

Let's Start Tracking!

Tracking object allocations relies on a feature introduced in Ruby 2.1: the ObjectSpace.trace_object_allocations method.

You can use it like this:

require 'objspace'

ObjectSpace.trace_object_allocations do
  obj = Object.new
  puts "File: #{ObjectSpace.allocation_sourcefile(obj)}"
  puts "Line: #{ObjectSpace.allocation_sourceline(obj)}"
end
# File: trace.rb
# Line: 4

That's good, but let's take a look at the allocation_stats gem. With this gem, you can print all sorts of reports about your object allocation.

Let's see an example:

require 'allocation_stats'
class Foo
  def bar
    @hash = {
      1 => "foo",
      2 => "bar"
    }
  end
end
stats = AllocationStats.trace { Foo.new.bar }
puts stats.allocations(alias_paths: true).to_text

In this example, we have a class (Foo) with one method (bar). The method creates a new hash (@hash) with two strings as values. Then we use AllocationStats.trace to capture the object allocations, and we call the allocations method to generate this table:

sourcefile  sourceline  class_path  method_id  memsize   class
----------  ----------  ----------  ---------  -------  -------
(pry)                5  Foo         bar        116      Hash
(pry)                5  Foo         bar        20       String
(pry)                5  Foo         bar        20       String
(pry)                8  Class       new        20       Foo
The table has several columns. The most notable are method_id, the name of the method that triggered the allocation, and memsize, the size of the object in bytes.
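If you are curious where this kind of information can come from, the objspace library exposes similar details directly. This is only a sketch of the standard-library calls, not a description of how the gem is implemented, and the exact values you see will depend on your Ruby version:

```ruby
require 'objspace'

class Foo
  def bar
    @hash = { 1 => "foo", 2 => "bar" }
  end
end

info = {}

ObjectSpace.trace_object_allocations do
  hash = Foo.new.bar

  # Allocation details are only recorded for objects created while tracing.
  info[:class_path] = ObjectSpace.allocation_class_path(hash)
  info[:method_id]  = ObjectSpace.allocation_method_id(hash)
  info[:sourceline] = ObjectSpace.allocation_sourceline(hash)
  info[:memsize]    = ObjectSpace.memsize_of(hash) # size in bytes
end

info.each { |name, value| puts "#{name}: #{value.inspect}" }
```

Here the details are read inside the traced block, the same way the first example in this article reads the source file and line.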

Let's see what else you can do with this gem!

Grouping Data by Class

You can use the group_by method to group data. This allows you to count how many objects for every class were created.

puts stats
      .allocations
      .group_by(:sourcefile, :sourceline, :class)
      .to_text

Output:

sourcefile  sourceline   class   count
----------  ----------  -------  -----
(pry)               19  Hash         1
(pry)               19  String       2
(pry)               26  Class        1

In this example, we are creating two String objects, one Hash object, and one Class object.
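If you can't install the gem, you can get a rougher version of these counts from core Ruby. The sketch below assumes that comparing ObjectSpace.count_objects before and after a block is good enough for a quick look. The helper name allocated_types is made up, and the counts are grouped by Ruby's internal types (T_STRING, T_HASH, and so on) rather than by class, so a few bookkeeping objects may show up too:

```ruby
class Foo
  def bar
    @hash = { 1 => "foo", 2 => "bar" }
  end
end

# Returns how many objects of each internal type were added while the block ran.
def allocated_types
  GC.start
  GC.disable # keep the GC from freeing objects during the measurement
  before = ObjectSpace.count_objects
  yield
  after = ObjectSpace.count_objects

  after.each_with_object({}) do |(type, count), diff|
    delta = count - before.fetch(type, 0)
    diff[type] = delta if delta > 0 && type.to_s.start_with?("T_")
  end
ensure
  GC.enable
end

counts = allocated_types { Foo.new.bar }
counts.each { |type, count| puts "#{type}: #{count}" }
```

This is much coarser than group_by(:sourcefile, :sourceline, :class), but it needs no dependencies.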

Filtering Data

Using the where method, you can filter using any of the columns available. This will help you focus on what you want to see.

Example:

puts stats
      .allocations
      .group_by(:class)
      .where(class: String)
      .to_text

Output:

class   count
------  -----
String      2

If you have used ActiveRecord a few times, you are probably very familiar with this kind of syntax where you chain a set of methods to build your final result.

This is the Builder Design Pattern in action!
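To make the chaining idea concrete, here is a tiny sketch. The Report class is hypothetical and is not how allocation_stats is implemented. The point is only that each method returns a new object, which is what lets you keep adding calls:

```ruby
# A made-up report class: where and sort_by_count return a new Report,
# so calls can be chained until to_text builds the final string.
class Report
  attr_reader :rows

  def initialize(rows)
    @rows = rows
  end

  def where(conditions)
    matching = rows.select do |row|
      conditions.all? { |key, value| row[key] == value }
    end
    Report.new(matching)
  end

  def sort_by_count
    Report.new(rows.sort_by { |row| -row[:count] })
  end

  def to_text
    rows.map { |row| "#{row[:class]}  #{row[:count]}" }.join("\n")
  end
end

report = Report.new([
  { class: "Array",  count: 2 },
  { class: "String", count: 80 },
  { class: "Range",  count: 1 }
])

puts report.sort_by_count.to_text
puts report.where(class: "String").to_text
```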

Tracking Array Allocations

Here's another example where I create a class with an array of strings.

stats = AllocationStats.trace {
  class Abc
    @test = %w(a b c d)
  end
}
puts stats.allocations
  .group_by(:sourcefile, :sourceline, :class_plus)
  .where(sourcefile: "(pry)")
  .sort_by_count
  .to_text

Notice that using class_plus instead of class will give you some extra info. In this case, Array<String> means that the elements inside the array are strings.

sourcefile  sourceline   class_plus    count
----------  ----------  -------------  -----
(pry)              184  String             4
(pry)              184  Array<String>      1

And as expected, this example is creating four strings and one array. No surprises here.

Frozen String Versus Non-frozen String

Freezing a string literal reduces object allocations because Ruby reuses a single frozen object instead of creating a new string with the same value every time the literal is evaluated.

Here's an example:

stats = AllocationStats.trace {
  Array('a'..'z').map { |ch| ch + 'aa' }
}
puts stats.allocations
  .group_by(:sourcefile, :sourceline, :class)
  .sort_by_count
  .to_text

Output:

sourcefile  sourceline  class   count
----------  ----------  ------  -----
(pry)              123  String     80
(pry)              123  Array       2
(pry)              123  Range       1

Let's try it with freezing now:

stats = AllocationStats.trace {
  Array('a'..'z').map { |ch| ch + 'aa'.freeze }
}
puts stats.allocations
  .group_by(:sourcefile, :sourceline, :class)
  .sort_by_count
  .to_text

Output:

sourcefile  sourceline  class   count
----------  ----------  ------  -----
(pry)              131  String     54
(pry)              131  Array       2
(pry)              131  Range       1

We went from 80 string allocations to 54. Not a big change, but the allocation stats allow us to study the impact of freezing these strings.

If you are wondering why it went from 80 to 54, it is because the block runs 26 times and each run no longer allocates a new 'aa' string (80 - 26 = 54). You can check the number of iterations here:

Array('a'..'z').size
# 26
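You can double-check this without the gem. The following sketch assumes that GC.stat(:total_allocated_objects) is a good enough counter for a quick comparison, and that the file does not use the frozen_string_literal magic comment. The count_allocations helper is made up for this example:

```ruby
# Counts objects allocated while the block runs, using GC.stat from core Ruby.
def count_allocations
  yield # warm-up run, so one-time setup work is not counted
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

letters = Array('a'..'z')

unfrozen = count_allocations { letters.each { |ch| ch + 'aa' } }
frozen   = count_allocations { letters.each { |ch| ch + 'aa'.freeze } }

puts "unfrozen literal: #{unfrozen}"
puts "frozen literal:   #{frozen}"
puts "difference:       #{unfrozen - frozen}"
```

The absolute numbers differ from the table because the array is built outside the measured block, but the difference should be close to the 26 iterations.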

Lazy Enumerators

Lazy enumerators were introduced in Ruby 2.0 and let you write more efficient chains of Enumerable methods.

Let's see how it affects object allocations.

Example:

stats = AllocationStats.trace {
  range = ('a'..'z').to_a
  random_letters = Array.new(500) { range.sample }
  random_letters.map { |ch| ch * 2 }.select { |ch| ch.size > 1 }.first(5)
}
puts stats.allocations
  .group_by(:sourcefile, :sourceline, :class, :method_id)
  .sort_by_count
  .to_text

Output:

sourcefile  sourceline  class   method_id  count
----------  ----------  ------  ---------  -----
(pry)             1340  String  *            500
(pry)             1337  String  upto          26
(pry)             1337  String                 2
(pry)             1340  Array   map            1
(pry)             1338  Array   new            1
(pry)             1337  Array   to_a           1
(pry)             1337  Range                  1
(pry)             1340  Array   first          1
(pry)             1340  Array   select         1

What's happening inside map and select is not very interesting, but notice that we allocate 500 strings even though first(5) only needs five of them.

Now if we add lazy before map, most of those String allocations go away!

Example:

stats = AllocationStats.trace {
  range          = ('a'..'z').to_a
  random_letters = Array.new(500) { range.sample }
  random_letters.lazy.map { |ch| ch * 2 }.select { |ch| ch.size > 1 }.first(5)
}
puts stats
      .allocations
      .group_by(:sourcefile, :sourceline, :class, :method_id)
      .sort_by_count
      .to_text

Output:

sourcefile  sourceline          class          method_id   count
----------  ----------  ---------------------  ----------  -----
(pry)             1520  String                 upto           26
(pry)             1523  Array                  yield          10
(pry)             1523  Proc                   initialize      6
(pry)             1523  String                 *               5
(pry)             1523  RubyVM::Env            initialize      5
(pry)             1523  Proc                   proc            4
(pry)             1523  RubyVM::Env            proc            4
(pry)             1523  Enumerator::Yielder    each            2
(pry)             1523  Array                  each            2
(pry)             1523  Enumerator::Generator  initialize      2
(pry)             1523  Enumerator::Lazy       new             2
(pry)             1520  String                                 2
(pry)             1523  Array                  first           1

This also creates other objects, so it may not be worth it for small arrays. When in doubt, always run a benchmark between your options.
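To see why the lazy version allocates so few strings, you can count how often the map block actually runs. This is a small sketch with a fixed list of letters instead of random ones, so the result is repeatable:

```ruby
letters = Array.new(500) { |i| ('a'.ord + i % 26).chr }

eager_calls = 0
eager = letters.map { |ch| eager_calls += 1; ch * 2 }
               .select { |ch| ch.size > 1 }
               .first(5)

lazy_calls = 0
lazy = letters.lazy
              .map { |ch| lazy_calls += 1; ch * 2 }
              .select { |ch| ch.size > 1 }
              .first(5)

puts "eager map calls: #{eager_calls}" # 500
puts "lazy map calls:  #{lazy_calls}"  # 5
puts eager == lazy                     # true
```

The eager chain transforms every element before first(5) gets a chance to stop it, while the lazy chain pulls elements through one at a time and stops as soon as it has five.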

Here is a relevant quote for you:

"Premature optimization is the root of all evil." - Donald Knuth

Careful With Loops!

Loops can create lots of objects if you don't pay attention. In the following example, you want to pull the inner array (Array('a'..'z')) outside of the block so it's only created once.

stats = AllocationStats.trace {
  Array.new(500) { Array('a'..'z').sample }
}
puts stats.allocations
  .group_by(:sourcefile, :sourceline, :class)
  .sort_by_count
  .to_text

Output:

sourcefile  sourceline  class   count
----------  ----------  ------  -----
(pry)              993  String  14000
(pry)              993  Array     501
(pry)              993  Range     500

Yes, most of these strings will be garbage collected, but there is an impact on performance because of that.
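Here is a sketch of the fix, again using GC.stat(:total_allocated_objects) as a rough counter instead of the gem. The count_allocations helper is made up for this example:

```ruby
# Counts objects allocated while the block runs, using GC.stat from core Ruby.
def count_allocations
  yield # warm-up run, so one-time setup work is not counted
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# The array of letters is rebuilt on every iteration.
inside = count_allocations do
  Array.new(500) { Array('a'..'z').sample }
end

# The array of letters is built once and reused.
outside = count_allocations do
  letters = Array('a'..'z')
  Array.new(500) { letters.sample }
end

puts "array built inside the block:  #{inside}"
puts "array built outside the block: #{outside}"
```

The exact numbers depend on your Ruby version, but the second count should be a small fraction of the first.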

Other Memory-Inspection Tools

Let's finish this article with a quick overview of other memory-related tools that you can use:

With memory_profiler and derailed, you will get similar results to allocation_stats but presented in a different way.

If you are looking to reduce the memory footprint of your Rails app, a good command to run is bundle exec derailed bundle:mem. This will show you a report of how much memory your gems need.

If you suspect a memory leak, you could try bundle exec derailed exec perf:mem_over_time. This hits your app with requests and reports the memory usage at regular intervals.

Finally, we have heap analyzers.

The heap is a section of memory where Ruby stores your objects. Since Ruby 2.1, you have the ability to dump the heap into a file (using require 'objspace' and ObjectSpace.dump_all) so you can inspect it.
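As a rough sketch of what that looks like, the code below dumps the heap into a temporary file and counts the objects by type. The file name heap.json is just an example, and the type tally is my own quick summary, not something the analyzer tools require:

```ruby
require 'objspace'
require 'json'
require 'tmpdir'

type_counts = Hash.new(0)

Dir.mktmpdir do |dir|
  path = File.join(dir, "heap.json") # example file name

  # The dump contains one JSON document per line.
  File.open(path, "w") { |file| ObjectSpace.dump_all(output: file) }

  File.foreach(path) do |line|
    begin
      type_counts[JSON.parse(line)["type"]] += 1
    rescue JSON::ParserError
      next # skip any line that can't be parsed
    end
  end
end

type_counts.sort_by { |_, count| -count }.first(5).each do |type, count|
  puts "#{type}: #{count}"
end
```

In a real app you would keep the file around and hand it to one of the analyzers instead of deleting it right away.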

You can use heap-analyzer and heapy to help you analyze your heap dump.

Conclusion

You have learned how to track object allocations and how to visualize them using the allocation_stats gem. You have also learned about other memory-related tools like memory_profiler and heapy.
