Next performance fix: Builder::XChar

Posted by Nick Sieger Thu, 17 Jan 2008 23:48:00 GMT

Next up in our performance series: Builder::XChar. (Another fine Sam Ruby production!) While this piece of code in the Builder library strikes me as perfectly fine, it also tends to slow down quite a bit with larger documents or chunks of text.

Our path to the bottleneck is as follows: ActiveRecord::Base#to_xml => Builder::XmlMarkup#text! => String#to_xs => Fixnum#xchr. Consider:

require 'rubygems'
gem 'activesupport'
require 'active_support'
require 'benchmark'

module Benchmark
  class << self
    def report(&block)
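      n = 10
      # Run the block n times, printing each measurement as it completes.
      times = (1..n).map do
        bm = measure(&block)
        puts bm
        bm
      end
      sum = times.inject(0) {|s,t| s + t.real}
      mean = sum / n
      sumsq = times.inject(0) {|s,t| s + t.real * t.real}
      # Sample standard deviation of the wall-clock times.
      sd = Math.sqrt((sumsq - (sum * sum / n)) / (n - 1))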
      puts("Mean: %0.6f SDev: %0.6f" % [mean, sd])
    end
  end
end

# http://blog.nicksieger.com/files/page.xml
page = File.open("page.xml") {|f| f.read }

Benchmark.report do
  20.times { page.to_xs }
end

On Ruby and JRuby, this produces:

$ ruby to_xs.rb 
 21.430000   0.400000  21.830000 ( 22.022769)
 21.530000   0.360000  21.890000 ( 22.005737)
 21.540000   0.370000  21.910000 ( 22.065165)
 21.530000   0.370000  21.900000 ( 22.028591)
 21.500000   0.350000  21.850000 ( 21.990395)
 21.550000   0.370000  21.920000 ( 22.033164)
 21.520000   0.360000  21.880000 ( 21.984129)
 21.550000   0.370000  21.920000 ( 22.116802)
 21.550000   0.370000  21.920000 ( 22.051421)
 21.520000   0.380000  21.900000 ( 22.084736)
Mean: 22.038291 SDev: 0.041985

$ jruby -J-server to_xs.rb
 79.112000   0.000000  79.112000 ( 79.112000)
 81.480000   0.000000  81.480000 ( 81.481000)
 84.745000   0.000000  84.745000 ( 84.745000)
 84.384000   0.000000  84.384000 ( 84.384000)
121.933000   0.000000 121.933000 (121.933000)
 85.533000   0.000000  85.533000 ( 85.532000)
 82.762000   0.000000  82.762000 ( 82.763000)
 82.090000   0.000000  82.090000 ( 82.090000)
 81.298000   0.000000  81.298000 ( 81.299000)
 80.774000   0.000000  80.774000 ( 80.773000)
Mean: 86.411200 SDev: 12.635700

(Hmm, I must have accidentally swapped in some large program in the middle of that JRuby run. The perils of benchmarking on a desktop machine. I don’t claim that the numbers are scientific, just illustrative!)
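
A single disturbed run like that one pulls both the mean and the standard deviation upward. As a rough check, the sketch below recomputes the statistics from the ten JRuby wall-clock times listed above, using the same formulas as the report method. The summarize helper and the median are additions for illustration only (they are not part of the original script); the median is shown because it is less sensitive to one outlier.

```ruby
# Wall-clock (real) times from the JRuby run above.
JRUBY_REAL_TIMES = [79.112, 81.481, 84.745, 84.384, 121.933,
                    85.532, 82.763, 82.090, 81.299, 80.773].freeze

# Hypothetical helper: mean, sample standard deviation, and median.
def summarize(times)
  n     = times.size
  sum   = times.inject(0.0) { |s, t| s + t }
  mean  = sum / n
  sumsq = times.inject(0.0) { |s, t| s + t * t }
  # Same shortcut formula for the sample standard deviation as in the script.
  sd    = Math.sqrt((sumsq - (sum * sum / n)) / (n - 1))

  # Median: middle value of the sorted times (average of the two middle
  # values when the count is even).
  sorted = times.sort
  mid    = n / 2
  median = n.even? ? (sorted[mid - 1] + sorted[mid]) / 2.0 : sorted[mid]

  { :mean => mean, :sd => sd, :median => median }
end

stats = summarize(JRUBY_REAL_TIMES)
puts("Mean: %0.6f SDev: %0.6f Median: %0.6f" %
     [stats[:mean], stats[:sd], stats[:median]])
```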

Fortunately, the fix again is very simple, and has previously been acknowledged. The latest (unreleased?) Hpricot has a new native extension, fast_xs, which is an almost drop-in replacement for the pure-ruby String#to_xs. (Almost, because it creates the method String#fast_xs instead of String#to_xs. ActiveSupport 2.0.2 and later take care of aliasing it for you). Unbeknownst to me, I ported fast_xs recently as part of upgrading JRuby extensions that have Java code in them. And so it happens to come in handy at this time. The patch for that is here.
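
To give a feel for why the per-character path is expensive, here is a minimal sketch, written for a current Ruby, that contrasts escaping one code point at a time with a single pass over the string. It is not the code from Builder or fast_xs: the method and constant names are made up for illustration, and it handles only &, <, > and non-ASCII characters, leaving out the other cases the real library deals with.

```ruby
# Hypothetical names throughout; a simplified stand-in for String#to_xs.
XML_CODEPOINT_ESCAPES = { 38 => "&amp;", 60 => "&lt;", 62 => "&gt;" }.freeze
XML_CHAR_ESCAPES      = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

# Slow path: unpack every code point, build one small string per
# character, then join them all back together.
def escape_per_char(str)
  str.unpack("U*").map do |n|
    XML_CODEPOINT_ESCAPES[n] || (n < 128 ? n.chr : "&##{n};")
  end.join
end

# Faster path: one gsub that only visits characters needing an escape.
def escape_single_pass(str)
  str.gsub(/[&<>]|[^\x00-\x7F]/) do |c|
    XML_CHAR_ESCAPES[c] || "&##{c.ord};"
  end
end

sample = "a<b & café"
puts escape_per_char(sample)
puts escape_single_pass(sample)
```

Both methods produce the same output; the difference is how much work happens per character, which is what a native extension removes almost entirely.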

I have the latest Hpricot gems on my server, so you can install it yourself (for either Ruby or JRuby):

gem install hpricot --source http://caldersphere.net

or

jruby -S gem install hpricot --source http://caldersphere.net

With that installed, the script now produces these results:

$ ruby to_xs.rb
  0.460000   0.080000   0.540000 (  0.537793)
  0.420000   0.070000   0.490000 (  0.501965)
  0.430000   0.070000   0.500000 (  0.501359)
  0.400000   0.070000   0.470000 (  0.484495)
  0.400000   0.070000   0.470000 (  0.479995)
  0.400000   0.070000   0.470000 (  0.469118)
  0.390000   0.070000   0.460000 (  0.468864)
  0.390000   0.070000   0.460000 (  0.465009)
  0.390000   0.060000   0.450000 (  0.452902)
  0.390000   0.070000   0.460000 (  0.466881)
Mean: 0.482838 SDev: 0.024926

$ jruby -J-server to_xs.rb 
  0.882000   0.000000   0.882000 (  0.883000)
  0.832000   0.000000   0.832000 (  0.832000)
  0.851000   0.000000   0.851000 (  0.850000)
  0.837000   0.000000   0.837000 (  0.837000)
  0.846000   0.000000   0.846000 (  0.846000)
  0.843000   0.000000   0.843000 (  0.843000)
  0.835000   0.000000   0.835000 (  0.835000)
  0.825000   0.000000   0.825000 (  0.826000)
  0.830000   0.000000   0.830000 (  0.830000)
  0.834000   0.000000   0.834000 (  0.833000)
Mean: 0.841500 SDev: 0.016379


REXML a Drag...Again

Posted by Nick Sieger Thu, 17 Jan 2008 04:07:00 GMT

We’ve been here before. So here’s the scenario: You’re feeding medium-to-large chunks of XML out of one Rails app, to be consumed by another via ActiveResource. Maybe those chunks have embedded HTML, or maybe they’re an Atom feed containing several pieces of HTML with all the entities escaped. Maybe they contain entire Wikipedia pages in them. Lots of entities that need expansion when the file is parsed.

So what does ActiveResource do with this? Hash.from_xml. Which uses xml-simple. Which constructs a REXML::Document, and proceeds to navigate the entire DOM, scraping the text nodes out of it so they can be stuffed in a hash to be handed back to ActiveResource. And how does REXML expand all the entities it runs across? With this little lovely:

# Unescapes all possible entities
def Text::unnormalize( string, doctype=nil, filter=nil, illegal=nil )
  rv = string.clone
  rv.gsub!( /\r\n?/, "\n" )
  matches = rv.scan( REFERENCE )
  return rv if matches.size == 0
  rv.gsub!( NUMERICENTITY ) {|m|
    m=$1
    m = "0#{m}" if m[0] == ?x
    [Integer(m)].pack('U*')
  }
  matches.collect!{|x|x[0]}.compact!
  if matches.size > 0
    if doctype
      matches.each do |entity_reference|
        unless filter and filter.include?(entity_reference)
          entity_value = doctype.entity( entity_reference )
          re = /&#{entity_reference};/
          rv.gsub!( re, entity_value ) if entity_value
        end
      end
    else
      matches.each do |entity_reference|
        unless filter and filter.include?(entity_reference)
          entity_value = DocType::DEFAULT_ENTITIES[ entity_reference ]
          re = /&#{entity_reference};/
          rv.gsub!( re, entity_value.value ) if entity_value
        end
      end
    end
    rv.gsub!( /&amp;/, '&' )
  end
  rv
end

Now, when you look at this, your first impression is that it just screams fast, right? Let’s run Hash.from_xml on the file I mentioned above.

# unnormalize.rb
require 'rubygems'
gem 'activesupport'
require 'active_support'

File.open("page.xml") do |f|
  Hash.from_xml(f.read)
end

$ time ruby unnormalize.rb

real    0m16.221s
user    0m14.447s
sys     0m0.346s

Whoa! Knock me over with a feather! It blows chunks! You mean calling #gsub! repeatedly in a loop with dregexps (regexp literals with interpolated strings) doesn’t go fast? It’s doubly worse on JRuby, too:

$ time jruby unnormalize.rb

real    0m33.637s
user    0m32.897s
sys     0m0.573s

All this on a paltry 393K xml file. Makes me wonder how anyone ever does any serious XML processing in Ruby.

I know, this is open source, I should be whipping up a patch for this and submitting it. Well, I did cook up a solution, but it unfortunately is only available for JRuby at the moment. (I also have much more faith in Sam Ruby than myself to get the semantics of a rewritten REXML::Text::unnormalize correct.)

A while back I cooked up JREXML because Regexp processing in JRuby was slow at the time, and the guts of REXML are driven by a Regexp-based parser. JREXML swaps out that regexp parser with a Java pull parser library, and at the time it provided a modest speedup.

So, in the context of JREXML, the solution now becomes simple, by taking advantage of the fact that Java XML parsers typically expand entities for you. The just-released JREXML 0.5.3 circumvents REXML::Text::unnormalize when constructing a document from the Java-based parser. And the results again don’t disappoint:

$ time jruby unnormalize_jrexml.rb

real    0m5.802s
user    0m5.315s
sys     0m0.345s

Update: At Sam’s request, I ran the numbers again with REXML trunk, which condenses entity expansion into a single gsub. Speed is more in line for MRI, but didn’t move much for JRuby (probably more a datapoint for JRuby developers than REXML developers).

$ time ruby -I~/Projects/ruby/rexml/src unnormalize.rb 

real    0m6.592s
user    0m0.845s
sys     0m0.345s

$ time jruby -I~/Projects/ruby/rexml/src unnormalize.rb

real    0m34.353s
user    0m33.023s
sys     0m0.714s
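
The difference between the looped approach and the single gsub mentioned in the update can be sketched roughly as follows. This is not REXML's code, in either the released or trunk version: the method names and the entity table are assumptions for illustration, and numeric references, doctypes and filters are ignored.

```ruby
# Hypothetical table of the five predefined XML entities.
SKETCH_ENTITIES = {
  "lt" => "<", "gt" => ">", "quot" => '"', "apos" => "'", "amp" => "&"
}.freeze

# Loop version: builds a new regexp and rescans the whole string once
# for every distinct entity name found in the text.
def expand_in_loop(text)
  rv = text.dup
  names = rv.scan(/&(\w+);/).flatten.uniq
  names.each do |name|
    next if name == "amp" # handled last so "&amp;lt;" is not expanded twice
    value = SKETCH_ENTITIES[name]
    rv.gsub!(/&#{name};/, value) if value
  end
  rv.gsub(/&amp;/, "&")
end

# Single-pass version: one scan, with a hash lookup per match.
# Unknown entities are left untouched.
def expand_single_pass(text)
  text.gsub(/&(\w+);/) { |match| SKETCH_ENTITIES.fetch($1, match) }
end

input = "a &lt; b &amp;&amp; c &gt; d &unknown; &amp;lt;"
puts expand_in_loop(input)
puts expand_single_pass(input)
```

The loop version does work proportional to the document size times the number of distinct entities, which is where the time goes on a large, entity-heavy file.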


JRuby 1.0.3: No Java-based extension library backward compatibility

Posted by Nick Sieger Wed, 19 Dec 2007 04:14:00 GMT

JRuby 1.0.3 just came out a couple of days ago. It was a decent point release; a handful of good bugs fixed. Normally a 1.0.3 release would not be all that exciting, but during this cycle, trunk’s internal API (upon which several JRuby extensions depend) started to diverge. Unfortunately, this forced us to face a decision: either fork and maintain two versions of every extension (one for 1.0.x and one for 1.1 and beyond), or break backwards compatibility.

We ended up choosing the latter, preferring a single schism to parallel version hell. It’s probably going to cause some pain for us (in number of support inquiries), and especially for those who might be looking casually at JRuby and trying it for the first time, for example via NetBeans. NetBeans 6.0 recently shipped with JRuby 1.0.2, which is now incompatible with the latest versions of several high-demand gems. Look for the 6.1 nightly builds to be fixed soon, and hopefully the 6.0.1 update can include the new release as well. (If you’re using NetBeans 6 and have run into this problem, you can download and unpack JRuby 1.0.3 and show NetBeans where it is.)

So when in doubt, grab the most recent JRuby release possible to minimize compatibility issues. To attempt to be as clear as possible about which versions work with what, I’ve included a table below. I’ll fill in with updates as I receive them, and let me know if a piece of software you use isn’t mentioned, but should be.

                     JRuby Version
Library              1.0 - 1.0.2, 1.1b1       1.0.3, 1.1b2
rubygems             <= 0.9.4                 <= 0.9.4, = 1.0 *
rails                <= 1.2.6, >= 2.0.x †     any
activerecord-jdbc    <= 0.6                   >= 0.7
jruby-openssl        <= 0.0.5                 >= 0.1
goldspike            1.3                      1.4
mongrel              any ‡                    1.1.2

* Rubygems 0.9.5 may not be compatible with any JRuby version; we won’t ship it with a release
† requires jruby-openssl (0.0.5 or earlier) to be installed
‡ combination needs testing with JRuby 1.0.2 and Mongrel 1.1.2

Other libraries not mentioned here, such as javasand (JRuby version of freaky freaky sandbox) or jparsetree (JRuby version of ParseTree) will also likely need updating for 1.0.3 and 1.1. For library authors needing a hint for which way to go, here are some pointers to our temporary bridge API.

Lessons learned? An extension API and migration strategy would normally be a good thing to nail down before a 1.0 release. Hopefully, you’ll forgive us that blunder this one time. We’ll make sure to get this mess cleaned up in a future 1.x release, and we hope any pains you had to go through with version incompatibilities will be soothed by the continual high-quality releases we’ve been able to craft.


ActiveRecord-JDBC 0.6 Released!

Posted by Nick Sieger Tue, 06 Nov 2007 15:00:00 GMT

Just out is ActiveRecord-JDBC 0.6, the post-RubyConf release.

The sparkly new feature is Rails 2.0 support. In the soon-to-be-released Rails 2.0 (edge), Rails will automatically look for and load an adapter gem based on the name of the adapter you specify in database.yml. Example:

development:
  adapter: funkdb
  ...

With this database configuration, Rails will attempt to load the activerecord-funkdb-adapter gem, require the active_record/connection_adapters/funkdb_adapter library, and call the method ActiveRecord::Base.funkdb_connection in order to obtain a connection to the database. (This is the mechanism used to off-load non-core adapters out of the Rails codebase.)
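
The naming convention described above amounts to three string templates keyed on the adapter name. A small sketch, where adapter_conventions is a hypothetical helper written for illustration and not anything Rails itself provides:

```ruby
# Hypothetical helper: derive the names Rails is described as looking
# for, given the adapter name from database.yml.
def adapter_conventions(adapter)
  {
    :gem               => "activerecord-#{adapter}-adapter",
    :require_path      => "active_record/connection_adapters/#{adapter}_adapter",
    :connection_method => "#{adapter}_connection"
  }
end

adapter_conventions("funkdb").each do |kind, name|
  puts "#{kind}: #{name}"
end
```

Applied to an adapter name of jdbcderby, the same templates give the gem name activerecord-jdbcderby-adapter.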

We can leverage this convention to make it easier than ever to get started using JRuby with your Rails application. So, the first thing new in the 0.6 release is the name. You now install activerecord-jdbc-adapter:

jruby -S gem install activerecord-jdbc-adapter

But wait, there’s more! We also have adapters for four open-source databases, including MySQL, PostgreSQL, and two embedded Java databases, Derby and HSQLDB. And, for your convenience, we’ve bundled the JDBC drivers in dependent gems, so you don’t have to go hunting them down if you don’t have them handy.

Check this out. Get a fresh copy of JRuby 1.0.2, unpack it, and add the bin directory to your path. Install the adapter:

$ jruby -S gem install activerecord-jdbcderby-adapter --include-dependencies
Successfully installed activerecord-jdbcderby-adapter-0.6
Successfully installed activerecord-jdbc-adapter-0.6
Successfully installed jdbc-derby-10.2.2.0

In your Rails application, freeze to edge Rails (soon to be Rails 2.0).

rake rails:freeze:edge

Re-run the Rails command, regenerating configuration files.

jruby ./vendor/rails/railties/bin/rails .

Currently, Rails 2.0 uses openssl for the HMAC digest used in the new cookie session store, so we have to install the jruby-openssl gem:

jruby -S gem install jruby-openssl

Now, update your config/database.yml as follows:

development:
  adapter: jdbcderby
  database: db/development

Re-run your migrations, and you should now see a Derby database footprint in the db/development directory.

$ ls -l db/development
total 24
-rw-r--r--    1 nicksieg  nicksieg    38 Nov  6 08:24 db.lck
-rw-r--r--    1 nicksieg  nicksieg     4 Nov  6 08:24 dbex.lck
drwxr-xr-x    5 nicksieg  nicksieg   170 Nov  6 08:24 log/
drwxr-xr-x   65 nicksieg  nicksieg  2210 Nov  6 08:24 seg0/
-rw-r--r--    1 nicksieg  nicksieg   882 Nov  6 08:24 service.properties
drwxr-xr-x    2 nicksieg  nicksieg    68 Nov  6 08:24 tmp/

That’s it! To re-emphasize, to make your application run under JRuby, no longer will you need to a) find and download appropriate JDBC drivers, b) wonder where they should be placed so that JRuby will find them, or c) make custom changes to config/environment.rb. All that’s taken care of for you if you use one of the following adapters:

  • activerecord-jdbcmysql-adapter (MySQL)
  • activerecord-jdbcpostgresql-adapter (PostgreSQL)
  • activerecord-jdbcderby-adapter (Derby)
  • activerecord-jdbchsqldb-adapter (HSQLDB)

If you need to connect to a different database, you’ll still need to place your database’s JDBC driver jar file in the appropriate place and use the straight activerecord-jdbc-adapter. Also note that in this case, and for Rails 1.2.x in general, you’ll still need to add that pesky require statement to config/environment.rb.

As always, there are bug fixes too (though we haven’t been tracking exactly which ones are fixed). We’re starting to file ActiveRecord-JDBC bugs in the JRuby JIRA now, and will soon add future AR-JDBC versions there as targets too. So, please file new bugs in JIRA (and select component “ActiveRecord-JDBC”) rather than in the antiquated Rubyforge tracker.


RubyConf: Parting Thoughts

Posted by Nick Sieger Mon, 05 Nov 2007 17:57:34 GMT

RubyConf once again was thoroughly enjoyable. I highly recommend that any Rubyist who is on the fence about attending make it a priority to go next year. Here are some quick, random notes that didn’t quite fit into a full post.

  • For those of you who stopped by expecting to see the blow-by-blow of every minute of the conference like last year, my apologies. I think I set the bar a little too high for myself. It takes a lot of energy to stay focused on the sessions for the whole day. Perhaps it’s appropriate to pass the baton on to James Avery or Eric Mill for their 2007 coverage.
  • Venue (Omni Hotel Charlotte): Generally speaking, thumbs up. There were a couple of annoyances, though. 1. No non-emergency staircase to get to your room, causing huge lines for the elevators at the end of the afternoon. 2. Coffee was removed from the scene before 10 am, raising speculation that it was a conspiracy to drive business to the Starbucks in the mall below. 3. Toasters blew out the sound system on Sunday morning, forcing a PA system to be brought out and throwing a wrench in the rhythm of the morning talks.
  • I have to give props to Dr. Nic for avoiding getting burnt by the toaster incident and handling it really well. To boot, he gave one of the most entertaining talks at the conference, as the RubiGen video is sure to become an instant conference classic much like Adam Keys’ one-man-one-act event from last year.
  • Werewolf: I played one game, miserably. I was a werewolf, and when cornered by another in the game, mustered up the quote “I’m not an aggressive player, I prefer to feed off of other people.” Wow, what a Freudian slip. While I can sympathize with Charlie’s comments about the game (and I do really enjoy late-night hackfests), I also have to agree with Chad and the other commenters that the two are not mutually exclusive, and the Werewolf games are wonderfully inclusive of RubyConf newbies and veterans alike.
  • The two-track approach in the afternoon this year seemed to go well, despite making it impossible to see all the talks. I would have liked to have seen Erik Hatcher’s Solr talk, but instead decided to give moral support to Kyle Maxwell’s JRuby in the Wild talk. I also missed the Saturday afternoon tracks to hang out in Stu’s Refactotum session.
  • Lots of good quotables: check out Nihilist and Twitter for some of the back-channel chatter.

See you next year!


RubyConf Day 3: Behaviour-Driven Development with RSpec

Posted by Nick Sieger Sun, 04 Nov 2007 16:26:00 GMT

David Chelimsky and Dave Astels: RSpec

describe TestDrivenDevelopment do
  it "is an incremental process"
  it "drives the implementation" 
  it "results in an exhaustive test suite"
  # but also...
  it "should focus on design"
  it "should focus on documentation"
  it "should focus on behaviour"
end

class BehaviourDrivenDevelopment < TestDrivenDevelopment
  include FocusOnDesign
  include FocusOnDocumentation
  include FocusOnBehavior
end

When doing test-driven development:

  • Write your intent first. The smallest test you can write that fails.
  • Next, write the implementation. The simplest thing that could possibly work.
  • Even though you may be tempted to think about additional edge cases, multiple requirements, etc., you should try to be disciplined and focus only on the immediate tests. Only after you’ve made one test fail, then pass, can you continue on to other tests.

RSpec history

Initially BDD was just a discussion among Aslak Hellesoy and Dan North in the ThoughtWorks London office. Dave Astels joined the conversation with a blog post stating that he thought these ideas could be easily implemented in Smalltalk or Ruby. Steven Baker jumped in with an initial implementation, and released RSpec 0.1. Later in 2006, maintenance was handed over to David Chelimsky. RSpec has evolved through a dog-fooding phase up to the present 1.0 product.

BDD is no longer just about “should instead of assert”; it’s evolving into a process. Emphasizing central concepts from extreme programming and domain-driven design, it’s moving toward focusing on customer stories and acceptance testing. It’s outside-in, starting at a high level rather than at the low level of RSpec or Test::Unit.

Story Runner

Story Runner is a new feature intended for RSpec 1.1. Each story is supposed to capture a customer requirement in the following general template:

As a (role) ... I want to (some function) ... so that (some business value).

It uses a “Scenario ... Given ... When ... Then ...” format to express the high level stories. Scenarios are a series of given items, steps, and behaviour validations. Once the basic steps are established, they can be re-used. David even demonstrated a preview of an in-browser story runner that would allow the customer to play with the implementation and create new scenarios.

Pending

Pending is a nice way to mark specs as “in-progress”. You can either omit a block for your spec, or use pending inside the block to leave a placeholder to come back to.

describe Pending do
  it "doesn't need a block to be pending"
  it "could also be specified inside the block" do
    pending("TODO")
    this.should_not be_a_failure
  end
  it "could also use a block with pending, and you will be notified when it starts to succeed" do
    pending("TODO") do
      this.should_not be_a_failure
    end
  end
end

Behaviour-Driven Development in Ruby with RSpec is a new book David and Aslak are working on, due out early next year.

Update: David has posted his slides.


RubyConf Day 2: Morning Sessions

Posted by Nick Sieger Sun, 04 Nov 2007 02:12:00 GMT

John Lam: IronRuby

Why IronRuby? John started with RubyCLR, which was a bridge between two languages/environments (.NET CLR and Ruby). Last year he didn’t know he’d be uprooting his family from Toronto and moving to Seattle. Now he finds himself in Microsoft trying to make sense of his new position. He describes a number of higher level goals for himself and IronRuby at Microsoft.

Change or die. Involvement in open source can only go up, right? The challenge is that the company is already doing well, so it’s hard to convince middle management that anything should change.

Open source. To their credit, the IronRuby team appears to be on the leading edge of open source at Microsoft (cf. the Microsoft Public License). They also had planned all along to take external contributions, and have in fact started to receive them.

Rails. One of the key goals is to be true to the language, and that includes being able to run Rails.

Performance. Use IronRuby as a testbed for DLR performance testing.

John is showing the REPL now (running under Mono actually), pointing out that “integer math is now supported” (apparently early on someone pointed out that subtraction didn’t work) and that CLR list types automatically appear like Ruby arrays.

Heavy DLR pitch ahead. Performance history, how the CLR used to be slow for dynamic languages, and how it’s better now.

John is running the Rubinius specs now, and showing only 373 out of 1030 failing. (It looked like he was running the core specs only.) Praise for the Rubinius team!

It’s possible to bind C# types to Ruby using annotations. Lots of C# code being shown, including a mess of generated code.

John also showed a XAML/Silverlight demo that was scripted by Ruby.

Charles Nutter and Thomas Enebo: JRuby

JRuby: “Not Just” JRuby for the JVM. I found it hard to take notes for this talk since I’m so close to it. Fortunately, their slides were pretty verbose and comprehensive, and hopefully will be posted shortly.

Evan Phoenix: Rubinius

Rubinius talk in roller derby mode. Ask questions early and often.

What is the end game of Rubinius (or JRuby, or IronRuby)? Total. World. Domination. For Ruby!

Rubinius is 3 things: form, function, and elbow grease. Ruby::Syntax, Ruby::Behavior, and Google.search("crazy cs papers").

Rapid fire CS Nerd attack mode coming. Generational collection, bytecode execution, stackless, bytecode representation, .rba archives.

Who would rather program C than Ruby? Java? C#? (Only one guy raised his hand that he’d rather code C.)

Hard-hitting portion of the talk. The kernel, broken down.

  • 1.8
    • 84,516 lines of C
    • 0 lines of Ruby
  • 1.9
    • 128,786 lines of C
    • 0 lines of Ruby
  • IronRuby
    • 48,282 lines of C#
    • 0 lines of Ruby
  • JRuby
    • 114,507 lines of Java
    • 0 lines of Ruby*

(*Even though I got heckled for saying it, JRuby does actually have some code written in Ruby that’s not the standard library.)

  • Rubinius
    • 25,398 lines of C
    • 13,946 lines of Ruby

1.8 and 1.9 are really Ruby for C programmers. JRuby is Ruby for Java programmers. IronRuby is Ruby for C# programmers. But Rubinius is Ruby for Ruby programmers.

Dogfooding. Gives feedback, which enables tighter loops, improves the kernel, makes life better for everyone on the platform.

Road, rubber, all that jazz. Evan mentions that Rubinius runs 24 of 31 benchmarks faster than Ruby 1.8, but the numbers are shifting rapidly. Evan wanted a 1.0 for RubyConf, but he has come to realize that several things are more important than a milestone. Design, and the technical challenges, certainly. But more importantly, the community.

Taking a cue from the Perl 6 community, -Ofun. The free-flowing commit bit, where patch submitters whose patches are accepted are immediately granted commit rights, has given rise to 57 committers. 17 of these have changed more than 400 lines of code.


RubyConf Day 1: Morning Sessions

Posted by Nick Sieger Fri, 02 Nov 2007 15:35:00 GMT

Marcel Molina: What Makes Code Beautiful?

What is beauty? Marcel explores this topic, starting with posing the question to the audience. “My wife!” Marcel: Why is she beautiful? “Longer answer than you want!”

Marcel comes from a literature/linguistic background, and is interested in how meaning is conveyed, not just by the basic words themselves but by context and expressivity as well.

Note: Marcel has given this talk before.

History of beauty

Pythagoras: was out in the street, heard the blacksmith’s clanging hammer, and was drawn to the noise. He recognized, through closer inspection, that the different sounds that came from the different hammers had relationships, and eventually saw similar relationships in other parts of nature, architecture, and so on.

Thomas Aquinas: Three things that define beauty: 1. Proportion. The economy of size and ratio of parts. The smallest thing that works. 2. Integrity. Well-suited for the purpose. 3. Clarity. Clear and simple.

Each of the qualities is necessary, but none is sufficient. For example, proportion (economy) will often clash with clarity. This is especially true in code.

Applied to software

Case study: coercion. Converting XML strings into rich Ruby equivalents. Marcel’s initial solution was a CoercibleString < String, which used a generator to iteratively try to coerce XML attributes to a number of types, and return the results. ~20 lines of code to convert to 4 types. His second version was a simple class method on String with a case statement.
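
A rough idea of what the second, case-statement version might look like is sketched below. The notes don't record which four types Marcel converted to, so the ones here (integer, float, boolean, with a fallback to the plain string) are assumptions, and the method lives in a hypothetical CoercionSketch module rather than on String to keep the example self-contained.

```ruby
# Hypothetical module; the types handled are guesses for illustration.
module CoercionSketch
  # Turn an XML attribute string into a richer Ruby value where possible.
  def self.coerce(value)
    case value
    when /\A-?\d+\z/      then Integer(value, 10)
    when /\A-?\d+\.\d+\z/ then Float(value)
    when "true"           then true
    when "false"          then false
    else value
    end
  end
end

%w[42 3.14 true hello].each do |raw|
  puts "#{raw.inspect} => #{CoercionSketch.coerce(raw).inspect}"
end
```

Each branch reads on its own, which is the clarity-over-cleverness trade-off the talk was pointing at.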

Kent Beck’s Smalltalk Best Practice Patterns is a book about writing good software, but in Marcel’s opinion it arrives at a definition of beauty by describing aspects of code that reflect proportion, integrity, and clarity.

Niels Bohr: “An expert is a person who has made all the mistakes that can be made in a very narrow field.” Marcel calls his CoercibleString a mistake, but one that helped him learn more about coding.

Luckily for us, Ruby is optimized for beauty.

Jim Weirich: Advanced Ruby Class Design

Emphasizing “Ruby” more so than “Advanced”, through three examples that illustrate techniques not commonly found in statically-typed OO languages (Java/C++/Eiffel).

Rake::FileList

FileList['lib/**/*.rb']

FileList sports globbing, a specialized to_s, and lazy evaluation. First version: class FileList < Array; end. Good idea, right? Well, with lazy evaluation, resolution of filenames happens only when the list is accessed, not created, so a lot of methods need to be overridden:

def [](index)
  resolve unless @resolved
  super
end

The problem becomes that FileList too closely mimics Array, and cannot distinguish itself in the case that matters. So it was changed to delegate to array rather than inherit.

Moral: when you want to mimic built-in classes, it might be better to implement #to_ary or #to_str rather than inherit.
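
The delegate-and-convert approach can be sketched like this. LazyList is a made-up class for illustration, not Rake's FileList; it takes a block in place of a glob pattern so the example needs no files on disk.

```ruby
# Hypothetical lazy list: wraps an array instead of inheriting from Array.
class LazyList
  def initialize(&loader)
    @loader = loader
    @items  = nil
  end

  # Run the loader the first time the contents are needed, then cache.
  def resolve
    @items ||= @loader.call
  end

  # Implicit conversion hook: lets Array#+ and multiple assignment
  # treat a LazyList as an array.
  def to_ary
    resolve
  end
  alias_method :to_a, :to_ary

  # Delegate the few methods we care about to the resolved array.
  def [](index)
    resolve[index]
  end

  def size
    resolve.size
  end
end

list = LazyList.new { %w[lib/a.rb lib/b.rb] }
puts((["Rakefile"] + list).inspect)
```

Only the methods that are explicitly delegated need to know about resolution, so there is no long tail of inherited Array methods that can bypass it.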

Builder::XmlMarkup

What’s the problem here?

b = Builder::XmlMarkup.new
b.student do
  b.name "Jim"
  b.phone_number "555-1234"
  b.class "Intro to Ruby"
end

class is already a method on Object. This begat BlankSlate, which removes unnecessary methods from Object. Several techniques were applied to eventually arrive at the latest version:

  • Use undef_method to hide methods that we don’t want. Except, leave methods beginning with double-underscore alone (__id__ and __send__).
  • Catch new methods added via a method_added hook on Kernel, and an append_features hook on Object, to deal with methods defined and modules included after BlankSlate was created
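
The first technique can be sketched as below. This is a simplified illustration rather than Builder's BlankSlate: the class names are hypothetical, object_id is kept in addition to the double-underscore methods to avoid an interpreter warning, and the hooks for methods added later are left out.

```ruby
# Hypothetical stripped-down base class.
class BlankSlateSketch
  instance_methods.each do |name|
    # Keep __id__ and __send__ (and object_id); hide everything else.
    undef_method(name) unless name.to_s.start_with?("__") || name == :object_id
  end
end

# With the inherited methods gone, calls such as #class reach method_missing.
class TagRecorder < BlankSlateSketch
  def initialize
    @tags = []
  end

  # Record every call as [tag_name, *arguments] and return the list so far.
  def method_missing(name, *args)
    @tags << [name, *args]
    @tags
  end
end

b = TagRecorder.new
b.name("Jim")
recorded = b.class("Intro to Ruby")
puts recorded.map { |tag, text| "<#{tag}>#{text}</#{tag}>" }.join
```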

TableNode

Problem: magic conversion of Rails conditions to SQL. An example: User.find(:all).select{|u| u.name == "jim"}. We don’t really want to load the entire database to do this, but we don’t like writing SQL either.

Solution: Record the actions in the select block by yielding a special TableNode object that captures the method calls and translates to SQL on the fly. Now we can write User.select {|u| u.name == "Jim"} and have it still execute SQL.

  • Capture methods called and wrap in a MethodNode to convert to SQL column references
  • Capture operators and wrap in a BinaryOpNode to handle ==, <, etc.

Clever! Will this work? Here are some issues:

  • Small issue -- ordering: User.select {|u| "Jim" == u.name} will not work without messing with String#==.
  • Bigger issues: && and || are not override-able in Ruby. What’s worse, ! has pre-defined semantics (in the parser) and cannot be captured.
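
A minimal sketch of the capture idea and of the ordering issue follows. The class names echo the ones from the talk, but the implementation, the select_sql helper and the quoting rules are assumptions made for illustration, not Jim's code.

```ruby
# Captured comparison, e.g. users.name = 'Jim'.
class BinaryOpNode
  def initialize(left, op, right)
    @left, @op, @right = left, op, right
  end

  def to_sql
    "#{@left.to_sql} #{@op} #{quote(@right)}"
  end

  private

  # Naive literal quoting, for the sketch only.
  def quote(value)
    return value.to_s unless value.is_a?(String)
    escaped = value.gsub("'", "''")
    "'#{escaped}'"
  end
end

# Captured column reference; its operators build nodes instead of comparing.
class MethodNode
  def initialize(table, column)
    @table, @column = table, column
  end

  def to_sql
    "#{@table}.#{@column}"
  end

  { :== => "=", :< => "<", :> => ">" }.each do |ruby_op, sql_op|
    define_method(ruby_op) { |other| BinaryOpNode.new(self, sql_op, other) }
  end
end

# Yielded to the block; any unknown method call becomes a column reference.
class TableNode
  def initialize(table)
    @table = table
  end

  def method_missing(name, *args)
    MethodNode.new(@table, name)
  end

  def respond_to_missing?(name, include_private = false)
    true
  end
end

# Hypothetical entry point standing in for User.select.
def select_sql(table)
  condition = yield TableNode.new(table)
  "SELECT * FROM #{table} WHERE #{condition.to_sql}"
end

puts select_sql("users") { |u| u.name == "Jim" }
# With the operands reversed, String#== runs instead and returns false,
# so there is nothing to translate:
puts(("Jim" == TableNode.new("users").name).inspect)
```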

Lessons learned

  • Don’t be afraid to think beyond prior experiences to come up with new ways of solving problems in code.


JRuby Performance Tweaks

Posted by Nick Sieger Thu, 25 Oct 2007 16:04:17 GMT

The back-story for my post on JRuby performance is that I was actually doing some benchmarking while applying successive tweaks to the JRuby environment to see how it affected our application performance. Only after running the final set of tweaks for longer numbers of requests did I notice that JRuby was catching up to MRI!

I thought it would be interesting to point out the tweaks themselves, to reiterate the point that JRuby and the JVM give you lots of knobs to tune your application. The notepad of tweaks, numbers and the script used to run them is here.

  1. I started by running the application with the “out of the box” setup: JRuby 1.0.1. I then started to apply tweaks to see how the numbers changed.
  2. The first tweak was to turn off ObjectSpace, which is pure overhead for JRuby.
  3. Next, I enabled JREXML since our application uses ActiveResource.
  4. Next, I tried enabling Charlie’s discovery in the rexml/source. It appears that the benefit was negated by JREXML, so I went back and did another run with the rexml/source patch but without JREXML, recorded as 2a. 2a gave almost as much benefit by itself as JREXML, so that’s an option if you don’t want the additional dependency, but you should measure for yourself since the performance profiles of those two fixes for XML may differ depending on the amount of XML consumed. In our case it’s relatively small, a few kilobytes at most.
  5. Next, I switched to JRuby trunk instead of 1.0.1. Trunk has, among other improvements, a complete compiler, which should allow more of the application to be translated to Java bytecode.
  6. The last tweak for this study was to change to the “server” VM. The server VM is known to take longer to warm up, but is more aggressive in the optimizations it performs.

The beauty of this exercise is that all the changes made provided small performance boosts for the application. Going forward, we hope to make more of this baked-in behavior (ObjectSpace off by default, possibly the rexml/source fix), but it will still help to have knowledge of and play around with the hotspot settings for the JVM.

There are a few more things I’d like to try in the future: JDK 6 is reportedly a lot faster all by itself, and the standalone Glassfish gem may give Mongrel a run for his money. There is still plenty of work left for the impending JRuby 1.1 release, so we should see the performance improvements for Rails applications running on JRuby continue to roll in.


JRuby on Rails: Fast Enough

Posted by Nick Sieger Thu, 25 Oct 2007 03:36:00 GMT

People have been asking for a while how fast JRuby runs Rails. (Of course, “fast” has always been a relative term.) We haven’t been quick to answer the question, because frankly we didn’t know. We hadn’t been building real Rails applications on JRuby ourselves yet, and there was no definitive word from the crowd either.

Recently, several guys from ThoughtWorks have been working on a Rails petstore application and benchmark to get to the heart of the matter. Discussion has been heated on the JRuby mailing list, but results have not been conclusive yet.

In the project I’m working on, we’ve committed to using and deploying on JRuby. Eventually we were going to reach the point where we’d need to find out how well our application runs. So today I began running a simple single request benchmark on a relatively busy page. The numbers turned out to be rather surprising:

[Chart: Average plotted against Requests]

(The raw data is available here.)

Now, MRI (C Ruby) will always run about the same speed no matter how many runs you give it, but it’s well known that the JVM needs time to warm up. And indeed it does; after 250 iterations, Mongrel running on JRuby finally surpasses MRI. The JRuby/Goldspike/Glassfish combo comes close as well.

Some details about the setup:

  • I ran the tests on my MacBook Pro Core 2 Duo 2.4 GHz. I didn’t disable one of the cores for the tests, which means that JRuby has an advantage over MRI because it can use both (native threads at work). However, the test script ran the requests serially, which means that the advantage was minimal.
  • The application is indeed of the “hydra” variety; the setup is nearly identical to the second diagram on that page. So a single request is passing through not one, but two Rails applications in addition to touching the database. It rendered an HTML ERb view with data from an ActiveResource-accessed RESTful service. The applications are based on Rails 1.2.3.
  • MRI version is using Ruby 1.8.6 and Mongrel 1.0.1.
  • JRuby Mongrel is also version 1.0.1 (details on installing it here)
  • JRuby on Glassfish used Glassfish 2 and Goldspike 1.4, deployed in war files via Warbler.
  • The two JRuby setups used JDK 1.5 and were tweaked to disable ObjectSpace and use the “server” VM (-server argument to the JVM).
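
For anyone wanting to reproduce this kind of warm-up measurement, the general shape of a serial test loop is sketched below. It is not the script used for these numbers: average_times and the batch sizes are hypothetical, and the block stands in for whatever issues one request to the application.

```ruby
require "benchmark"

# Hypothetical harness: for each batch size, run the request serially
# that many times and report the average wall-clock time per request.
def average_times(batch_sizes, &request)
  batch_sizes.map do |n|
    total = Benchmark.realtime { n.times(&request) }
    [n, total / n]
  end
end

# Stand-in workload; a real run would make an HTTP request here.
results = average_times([10, 50, 250]) { (1..1000).inject(:+) }
results.each do |n, avg|
  puts("%4d requests: %0.6f s average" % [n, avg])
end
```

Because later batches run in the same process as earlier ones, a JVM-based server keeps whatever it has compiled so far, which is what lets the averages drop as the request count grows.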

The main point I wish to make with these numbers is that JRuby performance is there today, and still has room to grow. There’s no longer any doubt in my mind. Yes, this is a simplistic application benchmark run on a developer’s machine, but it’s a real application. The test may not be exacting in precision, but I see enough in the numbers to believe that this will be replicable to production environments. The plot thickens!

