Next performance fix: Builder::XChar

Posted by Nick Sieger Thu, 17 Jan 2008 23:48:00 GMT

Next up in our performance series: Builder::XChar. (Another fine Sam Ruby production!) While this piece of code in the Builder library strikes me as perfectly fine, it also tends to slow down quite a bit with larger documents or chunks of text.

Our path to the bottleneck is as follows: ActiveRecord::Base#to_xml => Builder::XMLMarkup#text! => String#to_xs => Fixnum#xchr. Consider:

require 'rubygems'
gem 'activesupport'
require 'active_support'
require 'benchmark'

module Benchmark
  class << self
    def report(&block)
      n = 10
      times = (1..10).map do
        bm = measure(&block)
        puts bm
        bm
      end
      sum = times.inject(0) {|s,t| s + t.real}
      mean = sum / n
      sumsq = times.inject(0) {|s,t| s + t.real * t.real}
      sd = Math.sqrt((sumsq - (sum * sum / n)) / (n - 1))
      puts("Mean: %0.6f SDev: %0.6f" % [mean, sd])
    end
  end
end

# http://blog.nicksieger.com/files/page.xml
page = File.open("page.xml") {|f| f.read }

Benchmark.report do
  20.times { page.to_xs }
end

On Ruby and JRuby, this produces:

$ ruby to_xs.rb 
 21.430000   0.400000  21.830000 ( 22.022769)
 21.530000   0.360000  21.890000 ( 22.005737)
 21.540000   0.370000  21.910000 ( 22.065165)
 21.530000   0.370000  21.900000 ( 22.028591)
 21.500000   0.350000  21.850000 ( 21.990395)
 21.550000   0.370000  21.920000 ( 22.033164)
 21.520000   0.360000  21.880000 ( 21.984129)
 21.550000   0.370000  21.920000 ( 22.116802)
 21.550000   0.370000  21.920000 ( 22.051421)
 21.520000   0.380000  21.900000 ( 22.084736)
Mean: 22.038291 SDev: 0.041985

$ jruby -J-server to_xs.rb
 79.112000   0.000000  79.112000 ( 79.112000)
 81.480000   0.000000  81.480000 ( 81.481000)
 84.745000   0.000000  84.745000 ( 84.745000)
 84.384000   0.000000  84.384000 ( 84.384000)
121.933000   0.000000 121.933000 (121.933000)
 85.533000   0.000000  85.533000 ( 85.532000)
 82.762000   0.000000  82.762000 ( 82.763000)
 82.090000   0.000000  82.090000 ( 82.090000)
 81.298000   0.000000  81.298000 ( 81.299000)
 80.774000   0.000000  80.774000 ( 80.773000)
Mean: 86.411200 SDev: 12.635700

(Hmm, I must have accidentally swapped in some large program in the middle of that JRuby run. The perils of benchmarking on a desktop machine. I don’t claim that the numbers are scientific, just illustrative!)

Fortunately, the fix again is very simple, and has previously been acknowledged. The latest (unreleased?) Hpricot has a new native extension, fast_xs, which is an almost drop-in replacement for the pure-ruby String#to_xs. (Almost, because it creates the method String#fast_xs instead of String#to_xs. ActiveSupport 2.0.2 and later take care of aliasing it for you). Unbeknownst to me, I ported fast_xs recently as part of upgrading JRuby extensions that have Java code in them. And so it happens to come in handy at this time. The patch for that is here.

I have the latest Hpricot gems on my server, so you can install it yourself (for either Ruby or JRuby):

gem install hpricot --source http://caldersphere.net

or

jruby -S gem install hpricot --source http://caldersphere.net

With that installed, the script now produces these results:

$ ruby to_xs.rb
  0.460000   0.080000   0.540000 (  0.537793)
  0.420000   0.070000   0.490000 (  0.501965)
  0.430000   0.070000   0.500000 (  0.501359)
  0.400000   0.070000   0.470000 (  0.484495)
  0.400000   0.070000   0.470000 (  0.479995)
  0.400000   0.070000   0.470000 (  0.469118)
  0.390000   0.070000   0.460000 (  0.468864)
  0.390000   0.070000   0.460000 (  0.465009)
  0.390000   0.060000   0.450000 (  0.452902)
  0.390000   0.070000   0.460000 (  0.466881)
Mean: 0.482838 SDev: 0.024926

$ jruby -J-server to_xs.rb 
  0.882000   0.000000   0.882000 (  0.883000)
  0.832000   0.000000   0.832000 (  0.832000)
  0.851000   0.000000   0.851000 (  0.850000)
  0.837000   0.000000   0.837000 (  0.837000)
  0.846000   0.000000   0.846000 (  0.846000)
  0.843000   0.000000   0.843000 (  0.843000)
  0.835000   0.000000   0.835000 (  0.835000)
  0.825000   0.000000   0.825000 (  0.826000)
  0.830000   0.000000   0.830000 (  0.830000)
  0.834000   0.000000   0.834000 (  0.833000)
Mean: 0.841500 SDev: 0.016379

Tags , , ,  | 3 comments

Comments

  1. Avatar Sam Ruby said about 1 hour later:

    Chuckle. :-)

    You might take a look at how I completely eliminated to_xs in my patch for Ruby 1.9. Perhaps similar techniques could be used for JRuby, as the root cause is that 1.8 MRI doesn’t grok Unicode?

    http://intertwingly.net/blog/2008/01/04/Builder-on-1-9

  2. Avatar Nick said about 2 hours later:

    Indeed, you’re presciently a couple steps ahead of me, making the work a lot easier. Thanks again!

  3. Avatar Mike said 5 days later:

    Thank you for this interesting article, which was published here.