Screencast: RSpec and NetBeans

Posted by Nick Sieger Fri, 08 Feb 2008 15:13:16 GMT

A new screencast is up with yours truly showing off NetBeans’ RSpec support. I also tried to make it interesting to a wider audience by showcasing RSpec’s strengths and trying to capture some of the red-green-refactor rhythm. NetBeans does work really well for this, but in my mind, the star of the show is RSpec.

I’m pleased with how it turned out considering I hadn’t done this sort of thing before. Special thanks to Cindy Church for putting it all together, including all the production: setup, recording, editing, even the music!

A QuickTime movie version is available as well. Check it out and let me know what you think.


Why DTrace Makes Leopard a Must-Have Upgrade

Posted by Nick Sieger Tue, 05 Feb 2008 21:12:00 GMT

I feel like I’m actually a relative late-comer to Leopard, at least in my social circle. A lot of the folks in the Ruby community already had it installed the week after it was out, and were showing it off at RubyConf back in November. I just didn’t have a compelling reason to upgrade and disrupt my workflow at the time. Plus, mixed reports were coming out about data loss, UI nits, and other instabilities.

By the time I went out to purchase, 10.5.1 was already the version boxed in the stores, and in retrospect, it seemed worth the wait. I haven’t had a single complaint or major issue with the upgrade so far, and have been enjoying the noticeable zippiness of a freshly-installed system.

Time Machine has been a widely-publicized feature, and has been touted as one of the top reasons to upgrade. So I bought a small portable drive with some leftover holiday gift cards and set out to try it. Initially it seemed promising, except after a day or two of backups the process would stall out during the “preparing” stage. Eventually I noticed that the TM background process, backupd, was eating up 0.5GB of memory and up to 100% of one of the CPUs.

If I wasn’t a nerd making my living having my way with computers, I probably would have given up on Time Machine at this point, after a couple of hours scouring Google and the Apple discussion boards searching for similar problems. But I knew that backupd had to be doing something pathological, and I was compelled to find out what.

On Solaris systems, truss is usually the order of the day for problems like this. It literally vomits an endless listing of system calls invoked by a process into your terminal window. Except there’s no truss on OS X. Is there a replacement? Google mentioned ktrace, present on Tiger systems and earlier, but it’s gone in Leopard. Replaced by? DTrace.

Ahhh, DTrace! Another geeky Leopard-only feature. Certainly DTrace will be able to trace system calls in the same manner as truss. But being a complete DTrace newb, I had no idea where to start. So, like any lazy programmer does, I started shopping around for examples to get me started. Looking around, this article on MacTech looked promising, but didn’t have what I needed. Eventually, I ended up finding the DTrace Toolkit on the OpenSolaris site.

The DTrace Toolkit appears to be your one-stop shop for all things DTrace. If you need a kick-start reason to take a look at DTrace and get you going, this is it. In my case, lo and behold, one of the scripts included in the toolkit is called dtruss!

Many of the scripts in the toolkit are tailored towards a Solaris system, and dtruss is a prime example. It won’t quite work out of the box on Leopard, because a few of the system calls mentioned in the script don’t exist there. Changing the shebang line at the top of the script to #!/bin/bash and running it a few times with sudo ./dtruss -p <pid> will show you which ones; I simply commented these out until I could successfully trace a process.

Now, finally we can pop the stack back to my original problem with Time Machine and backupd. I launched a backup run and waited for the process to start consuming large amounts of CPU and memory. I located the PID of the process in Activity Monitor, and started tracing it with my modified dtruss script. And, sure enough, I saw the following output scrolling by endlessly in my terminal:

mmap(0x0, 0x5000, 0x3)     = 958464 0
getdirentriesattr(..., ..., ...)    = ....
munmap(0xEA000, 0x5000)    = 0 0
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93

(The ellipses stand in for actual memory addresses; I didn’t save the output.) What was interesting is that the same chunk of memory (the first argument to getxattr) was floating by repeatedly. Looking at the man page for getxattr, the signature is:

ssize_t
getxattr(const char *path, const char *name, void *value,
         size_t size, u_int32_t position, int options);

So, the first argument contains the path. Now, how can I get the contents of that memory address? The answer is inside dtruss. Closer to the top of the script is this DTrace code:

/* print 3 args, arg0 as a string */
syscall::stat*:return, 
syscall::lstat*:return, 
syscall::open*:return
/* not on leopard -- syscall::resolvepath:return */
/self->start/
{
    /* calculate elapsed time */
    this->elapsed = timestamp - self->start;
    self->start = 0;
    this->cpu = vtimestamp - self->vstart;
    self->vstart = 0;
    self->code = errno == 0 ? "" : "Err#";

    /* print optional fields */
    OPT_printid  ? printf("%6d/%d:  ", pid, tid) : 1;
    OPT_relative ? printf("%8d ", vtimestamp/1000) : 1;
    OPT_elapsed  ? printf("%7d ", this->elapsed/1000) : 1;
    OPT_cpu      ? printf("%6d ", this->cpu/1000) : 1;

    /* print main data */
    printf("%s(\"%S\", 0x%X, 0x%X)\t\t = %d %s%d\n", probefunc,
        copyinstr(self->arg0), self->arg1, self->arg2, (int)arg0,
        self->code, (int)errno);
    OPT_stack ? ustack()    : 1;
    OPT_stack ? trace("\n") : 1;
    self->arg0 = 0;
    self->arg1 = 0;
    self->arg2 = 0;
}

I only had to add syscall::getxattr:return to the list of matched probes, and now I could finally inspect the path argument to getxattr:

munmap(0xEA000, 0x5000)      = 0 0
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002680.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002681.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002682.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002683.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002684.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002685.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002686.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002687.emlx\0",
0x967288D4, 0x0)         = -1 Err#93

D’oh! GMail! That directory has so many files, it took over two minutes just for ls to list all 487356 of them. Hundreds of thousands of email messages, all being re-inspected by Time Machine every time some new messages are added to the directory. I’ll leave it to someone else to point fingers at what the actual problem is here (a side-effect of TM’s usage of hard links? Mail.app inefficiently storing too many messages in a single directory?), but after all this I just decided that I didn’t want a backup of my GMail messages since they’re stored on the server. So I added the directory to the list of excluded directories in TM, wiped my backup, and started over. (TM had similar problems trying to complete an incremental backup with the existing backed-up copy of my mail on the backup disk, so I decided to wipe it and start fresh.) I won’t declare the problem completely solved yet, but if it happens again, I’ll just repeat this process to find the new culprit. Hopefully I don’t end up excluding my entire home directory!
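If you’d rather go looking for directories like this before Time Machine trips over them, here is a minimal Ruby sketch that counts the direct entries in each directory under a root and reports the crowded ones. The method name crowded_directories and the threshold are my own inventions for illustration, and this says nothing about how backupd itself walks the file system.

```ruby
require 'find'

# Hypothetical helper: returns [[directory, entry_count], ...] for every
# directory under +root+ that directly contains at least +threshold+ entries,
# most crowded first.
def crowded_directories(root, threshold)
  counts = Hash.new(0)
  Find.find(root) do |path|
    next if path == root                 # the root itself is not an entry
    counts[File.dirname(path)] += 1      # credit the entry to its parent
  end
  counts.select { |_dir, n| n >= threshold }.sort_by { |_dir, n| -n }
end

# Example usage (path and threshold are placeholders):
#   crowded_directories(File.expand_path("~/Library/Mail"), 10_000).each do |dir, n|
#     puts "#{n}\t#{dir}"
#   end
```

Anything that shows up near the top of that list is a candidate for the Time Machine exclusion list, assuming you can live without a backup of it.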

This whole process was a revelation to me – the fact that I could pinpoint the exact problem in a piece of system software despite having few notions of the internals of that software. The next time I have a nagging issue in Leopard, DTrace will be my tool of choice in tracking it down. Let’s just hope it’s not a problem with iTunes!


Next performance fix: Builder::XChar

Posted by Nick Sieger Thu, 17 Jan 2008 23:48:00 GMT

Next up in our performance series: Builder::XChar. (Another fine Sam Ruby production!) While this piece of code in the Builder library strikes me as perfectly fine, it also tends to slow down quite a bit with larger documents or chunks of text.

Our path to the bottleneck is as follows: ActiveRecord::Base#to_xml => Builder::XMLMarkup#text! => String#to_xs => Fixnum#xchr. Consider:

require 'rubygems'
gem 'activesupport'
require 'active_support'
require 'benchmark'

module Benchmark
  class << self
    def report(&block)
      n = 10
      times = (1..n).map do
        bm = measure(&block)
        puts bm
        bm
      end
      sum = times.inject(0) {|s,t| s + t.real}
      mean = sum / n
      sumsq = times.inject(0) {|s,t| s + t.real * t.real}
      sd = Math.sqrt((sumsq - (sum * sum / n)) / (n - 1))
      puts("Mean: %0.6f SDev: %0.6f" % [mean, sd])
    end
  end
end

# http://blog.nicksieger.com/files/page.xml
page = File.open("page.xml") {|f| f.read }

Benchmark.report do
  20.times { page.to_xs }
end

On Ruby and JRuby, this produces:

$ ruby to_xs.rb 
 21.430000   0.400000  21.830000 ( 22.022769)
 21.530000   0.360000  21.890000 ( 22.005737)
 21.540000   0.370000  21.910000 ( 22.065165)
 21.530000   0.370000  21.900000 ( 22.028591)
 21.500000   0.350000  21.850000 ( 21.990395)
 21.550000   0.370000  21.920000 ( 22.033164)
 21.520000   0.360000  21.880000 ( 21.984129)
 21.550000   0.370000  21.920000 ( 22.116802)
 21.550000   0.370000  21.920000 ( 22.051421)
 21.520000   0.380000  21.900000 ( 22.084736)
Mean: 22.038291 SDev: 0.041985

$ jruby -J-server to_xs.rb
 79.112000   0.000000  79.112000 ( 79.112000)
 81.480000   0.000000  81.480000 ( 81.481000)
 84.745000   0.000000  84.745000 ( 84.745000)
 84.384000   0.000000  84.384000 ( 84.384000)
121.933000   0.000000 121.933000 (121.933000)
 85.533000   0.000000  85.533000 ( 85.532000)
 82.762000   0.000000  82.762000 ( 82.763000)
 82.090000   0.000000  82.090000 ( 82.090000)
 81.298000   0.000000  81.298000 ( 81.299000)
 80.774000   0.000000  80.774000 ( 80.773000)
Mean: 86.411200 SDev: 12.635700

(Hmm, I must have accidentally swapped in some large program in the middle of that JRuby run. The perils of benchmarking on a desktop machine. I don’t claim that the numbers are scientific, just illustrative!)

Fortunately, the fix again is very simple, and has previously been acknowledged. The latest (unreleased?) Hpricot has a new native extension, fast_xs, which is an almost drop-in replacement for the pure-Ruby String#to_xs. (Almost, because it creates the method String#fast_xs instead of String#to_xs; ActiveSupport 2.0.2 and later take care of aliasing it for you.) As it happens, I recently ported fast_xs to JRuby as part of upgrading JRuby extensions that have Java code in them, without knowing it would come in handy so soon. The patch for that is here.
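If you’re not on a recent ActiveSupport, the aliasing is easy to do by hand. What follows is a minimal sketch, not ActiveSupport’s actual code: it assumes the extension can be loaded with require 'fast_xs', and the pure-Ruby fallback is a deliberately simplified stand-in (it only escapes &, < and >, whereas the real to_xs also deals with non-ASCII and control characters).

```ruby
begin
  require 'fast_xs'   # assumption: the native extension loads under this name
rescue LoadError
  # No native extension available; the fallback below is used instead.
end

class String
  if method_defined?(:fast_xs)
    # Route to_xs through the native implementation.
    alias_method :to_xs, :fast_xs
  else
    # Simplified stand-in for illustration only.
    SIMPLE_XML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' }.freeze

    def to_xs
      gsub(/[&<>]/, SIMPLE_XML_ESCAPES)
    end
  end
end

puts "<a & b>".to_xs
```

Either way, callers such as Builder keep calling String#to_xs and never need to know which implementation they got.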

I have the latest Hpricot gems on my server, so you can install it yourself (for either Ruby or JRuby):

gem install hpricot --source http://caldersphere.net

or

jruby -S gem install hpricot --source http://caldersphere.net

With that installed, the script now produces these results:

$ ruby to_xs.rb
  0.460000   0.080000   0.540000 (  0.537793)
  0.420000   0.070000   0.490000 (  0.501965)
  0.430000   0.070000   0.500000 (  0.501359)
  0.400000   0.070000   0.470000 (  0.484495)
  0.400000   0.070000   0.470000 (  0.479995)
  0.400000   0.070000   0.470000 (  0.469118)
  0.390000   0.070000   0.460000 (  0.468864)
  0.390000   0.070000   0.460000 (  0.465009)
  0.390000   0.060000   0.450000 (  0.452902)
  0.390000   0.070000   0.460000 (  0.466881)
Mean: 0.482838 SDev: 0.024926

$ jruby -J-server to_xs.rb 
  0.882000   0.000000   0.882000 (  0.883000)
  0.832000   0.000000   0.832000 (  0.832000)
  0.851000   0.000000   0.851000 (  0.850000)
  0.837000   0.000000   0.837000 (  0.837000)
  0.846000   0.000000   0.846000 (  0.846000)
  0.843000   0.000000   0.843000 (  0.843000)
  0.835000   0.000000   0.835000 (  0.835000)
  0.825000   0.000000   0.825000 (  0.826000)
  0.830000   0.000000   0.830000 (  0.830000)
  0.834000   0.000000   0.834000 (  0.833000)
Mean: 0.841500 SDev: 0.016379


REXML a Drag...Again

Posted by Nick Sieger Thu, 17 Jan 2008 04:07:00 GMT

We’ve been here before. So here’s the scenario: You’re feeding medium-to-large chunks of XML out of one Rails app, to be consumed by another via ActiveResource. Maybe those chunks have embedded HTML, or maybe they’re an Atom feed containing several pieces of HTML with all the entities escaped. Maybe they contain entire Wikipedia pages in them. Lots of entities that need expansion when the file is parsed.

So what does ActiveResource do with this? Hash.from_xml. Which uses xml-simple. Which constructs a REXML::Document, and proceeds to navigate the entire DOM, scraping the text nodes out of it so they can be stuffed in a hash to be handed back to ActiveResource. And how does REXML expand all the entities it runs across? With this little lovely:

# Unescapes all possible entities
def Text::unnormalize( string, doctype=nil, filter=nil, illegal=nil )
  rv = string.clone
  rv.gsub!( /\r\n?/, "\n" )
  matches = rv.scan( REFERENCE )
  return rv if matches.size == 0
  rv.gsub!( NUMERICENTITY ) {|m|
    m=$1
    m = "0#{m}" if m[0] == ?x
    [Integer(m)].pack('U*')
  }
  matches.collect!{|x|x[0]}.compact!
  if matches.size > 0
    if doctype
      matches.each do |entity_reference|
        unless filter and filter.include?(entity_reference)
          entity_value = doctype.entity( entity_reference )
          re = /&#{entity_reference};/
          rv.gsub!( re, entity_value ) if entity_value
        end
      end
    else
      matches.each do |entity_reference|
        unless filter and filter.include?(entity_reference)
          entity_value = DocType::DEFAULT_ENTITIES[ entity_reference ]
          re = /&#{entity_reference};/
          rv.gsub!( re, entity_value.value ) if entity_value
        end
      end
    end
    rv.gsub!( /&amp;/, '&' )
  end
  rv
end

Now, when you look at this, your first impression is that it just screams fast, right? Let’s run Hash.from_xml on the file I mentioned above.

# unnormalize.rb
require 'rubygems'
gem 'activesupport'
require 'active_support'

File.open("page.xml") do |f|
  Hash.from_xml(f.read)
end

$ time ruby unnormalize.rb

real    0m16.221s
user    0m14.447s
sys     0m0.346s

Whoa! Knock me over with a feather! It blows chunks! You mean calling #gsub! repeatedly in a loop with dregexps (regexp literals with interpolated strings) doesn’t go fast? It’s twice as slow on JRuby, too:

$ time jruby unnormalize.rb

real    0m33.637s
user    0m32.897s
sys     0m0.573s

All this on a paltry 393K xml file. Makes me wonder how anyone ever does any serious XML processing in Ruby.

I know, this is open source, I should be whipping up a patch for this and submitting it. Well, I did cook up a solution, but it unfortunately is only available for JRuby at the moment. (I also have much more faith in Sam Ruby than myself to get the semantics of a rewritten REXML::Text::unnormalize correct.)

A while back I cooked up JREXML because Regexp processing in JRuby was slow at the time, and the guts of REXML are driven by a Regexp-based parser. JREXML swaps out that regexp parser with a Java pull parser library, and at the time it provided a modest speedup.

So, in the context of JREXML, the solution now becomes simple, by taking advantage of the fact that Java XML parsers typically expand entities for you. The just-released JREXML 0.5.3 circumvents REXML::Text::unnormalize when constructing a document from the Java-based parser. And the results again don’t disappoint:

$ time jruby unnormalize_jrexml.rb

real    0m5.802s
user    0m5.315s
sys     0m0.345s

Update: At Sam’s request, I ran the numbers again with REXML trunk, which condenses entity expansion into a single gsub. Speed is more in line for MRI, but didn’t move much for JRuby (probably more a datapoint for JRuby developers than REXML developers).

$ time ruby -I~/Projects/ruby/rexml/src unnormalize.rb 

real    0m6.592s
user    0m0.845s
sys     0m0.345s

$ time jruby -I~/Projects/ruby/rexml/src unnormalize.rb

real    0m34.353s
user    0m33.023s
sys     0m0.714s
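
For the curious, the single-gsub idea can be sketched roughly as follows. This is my own minimal illustration, not the code in REXML trunk: the constant and method names are made up, only the five predefined XML entities are handled, and I make no claim that it matches REXML’s semantics for doctype-declared entities or filters.

```ruby
# Hypothetical names, for illustration only.
NAMED_ENTITIES = {
  'lt' => '<', 'gt' => '>', 'amp' => '&', 'quot' => '"', 'apos' => "'"
}.freeze

# One pattern covering decimal, hexadecimal and named references.
ENTITY_REFERENCE = /&(?:#(\d+)|#x([0-9a-fA-F]+)|(\w+));/

# Expand every reference in a single pass over the string, instead of
# building a new regexp and rescanning the text once per entity.
def expand_entities(text)
  text.gsub(ENTITY_REFERENCE) do |match|
    decimal, hex, name = $1, $2, $3
    if decimal
      [decimal.to_i].pack('U')
    elsif hex
      [hex.hex].pack('U')
    else
      NAMED_ENTITIES.fetch(name, match)   # leave unknown entities untouched
    end
  end
end

puts expand_entities("a &lt; b &amp;amp; &#65;&#x42; &unknown;")
```

A side benefit of the single pass is that text produced by one expansion is never rescanned, so an escaped &amp;amp; comes out as &amp; rather than being expanded twice.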


JRuby 1.0.3: No Java-based extension library backward compatibility

Posted by Nick Sieger Wed, 19 Dec 2007 04:14:00 GMT

JRuby 1.0.3 just came out a couple of days ago. It was a decent point release; a handful of good bugs fixed. Normally a 1.0.3 release would not be all that exciting, but during this cycle, trunk’s internal API (upon which several JRuby extensions depend) started to diverge. Unfortunately, this forced us to face a decision: either fork and maintain two versions of every extension (one for 1.0.x and one for 1.1 and beyond), or break backwards compatibility.

We ended up choosing the latter, preferring a single schism to parallel version hell. It’s probably going to cause some pain for us (in number of support inquiries), and especially for those who might be looking casually at JRuby and trying it for the first time, for example via NetBeans. NetBeans 6.0 recently shipped with JRuby 1.0.2, which is now incompatible with the latest versions of several high-demand gems. Look for the 6.1 nightly builds to be fixed soon, and hopefully the 6.0.1 update can include the new release as well. (If you’re using NetBeans 6 and have run into this problem, you can download and unpack JRuby 1.0.3 and show NetBeans where it is.)

So when in doubt, grab the most recent JRuby release possible to minimize compatibility issues. To attempt to be as clear as possible about which versions work with what, I’ve included a table below. I’ll fill in with updates as I receive them, and let me know if a piece of software you use isn’t mentioned, but should be.

Library            JRuby 1.0 - 1.0.2, 1.1b1    JRuby 1.0.3, 1.1b2
rubygems           <= 0.9.4                    <= 0.9.4, = 1.0 *
rails              <= 1.2.6, >= 2.0.x †        any
activerecord-jdbc  <= 0.6                      >= 0.7
jruby-openssl      <= 0.0.5                    >= 0.1
goldspike          1.3                         1.4
mongrel            any ‡                       1.1.2

* Rubygems 0.9.5 may not be compatible with any JRuby version; we won’t ship it with a release
† requires jruby-openssl (0.0.5 or earlier) to be installed
‡ combination needs testing with JRuby 1.0.2 and Mongrel 1.1.2

Other libraries not mentioned here, such as javasand (JRuby version of freaky freaky sandbox) or jparsetree (JRuby version of ParseTree) will also likely need updating for 1.0.3 and 1.1. For library authors needing a hint for which way to go, here are some pointers to our temporary bridge API.

Lessons learned? An extension API and migration strategy would normally be a good thing to nail down before a 1.0 release. Hopefully, you’ll forgive us that blunder this one time. We’ll make sure to get this mess cleaned up in a future 1.x release, and we hope any pains you had to go through with version incompatibilities will be soothed by the continual high-quality releases we’ve been able to craft.


ActiveRecord-JDBC 0.6 Released!

Posted by Nick Sieger Tue, 06 Nov 2007 15:00:00 GMT

Just out is ActiveRecord-JDBC 0.6, the post-RubyConf release.

The sparkly new feature is Rails 2.0 support. In the soon-to-be-released Rails 2.0 (edge), Rails will automatically look for and load an adapter gem based on the name of the adapter you specify in database.yml. Example:

development:
  adapter: funkdb
  ...

With this database configuration, Rails will attempt to load the activerecord-funkdb-adapter gem, require the active_record/connection_adapters/funkdb_adapter library, and call the method ActiveRecord::Base.funkdb_connection in order to obtain a connection to the database. (This is the mechanism used to off-load non-core adapters out of the Rails codebase.)
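
To spell the naming convention out, here is a tiny sketch that derives the three names from the adapter setting. The helper adapter_conventions is purely illustrative and is not part of Rails; it just restates the rules described above.

```ruby
# Hypothetical helper, not a Rails API: derives the names Rails is described
# as looking for, given the adapter name from database.yml.
def adapter_conventions(adapter)
  {
    gem_name:          "activerecord-#{adapter}-adapter",
    library:           "active_record/connection_adapters/#{adapter}_adapter",
    connection_method: "#{adapter}_connection"
  }
end

adapter_conventions("funkdb").each do |key, value|
  puts "#{key}: #{value}"
end
```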

We can leverage this convention to make it easier than ever to get started using JRuby with your Rails application. So, the first thing new in the 0.6 release is the name. You now install activerecord-jdbc-adapter:

jruby -S gem install activerecord-jdbc-adapter

But wait, there’s more! We also have adapters for four open-source databases, including MySQL, PostgreSQL, and two embedded Java databases, Derby and HSQLDB. And, for your convenience, we’ve bundled the JDBC drivers in dependent gems, so you don’t have to go hunting them down if you don’t have them handy.

Check this out. Get a fresh copy of JRuby 1.0.2, unpack it, and add the bin directory to your path. Install the adapter:

$ jruby -S gem install activerecord-jdbcderby-adapter --include-dependencies
Successfully installed activerecord-jdbcderby-adapter-0.6
Successfully installed activerecord-jdbc-adapter-0.6
Successfully installed jdbc-derby-10.2.2.0

In your Rails application, freeze to edge Rails (soon to be Rails 2.0).

rake rails:freeze:edge

Re-run the Rails command, regenerating configuration files.

jruby ./vendor/rails/railties/bin/rails .

Currently, Rails 2.0 uses openssl for the HMAC digest used in the new cookie session store, so we have to install the jruby-openssl gem:

jruby -S gem install jruby-openssl

Now, update your config/database.yml as follows:

development:
  adapter: jdbcderby
  database: db/development

Re-run your migrations, and you should now see a Derby database footprint in the db/development directory.

$ ls -l db/development
total 24
-rw-r--r--    1 nicksieg  nicksieg    38 Nov  6 08:24 db.lck
-rw-r--r--    1 nicksieg  nicksieg     4 Nov  6 08:24 dbex.lck
drwxr-xr-x    5 nicksieg  nicksieg   170 Nov  6 08:24 log/
drwxr-xr-x   65 nicksieg  nicksieg  2210 Nov  6 08:24 seg0/
-rw-r--r--    1 nicksieg  nicksieg   882 Nov  6 08:24 service.properties
drwxr-xr-x    2 nicksieg  nicksieg    68 Nov  6 08:24 tmp/

That’s it! To re-emphasize, to make your application run under JRuby, no longer will you need to a) find and download appropriate JDBC drivers, b) wonder where they should be placed so that JRuby will find them, or c) make custom changes to config/environment.rb. All that’s taken care of for you if you use one of the following adapters:

  • activerecord-jdbcmysql-adapter (MySQL)
  • activerecord-jdbcpostgresql-adapter (PostgreSQL)
  • activerecord-jdbcderby-adapter (Derby)
  • activerecord-jdbchsqldb-adapter (HSQLDB)

If you need to connect to a different database, you’ll still need to place your database’s JDBC driver jar file in the appropriate place and use the straight activerecord-jdbc-adapter. Also note that in this case, and for Rails 1.2.x in general, you’ll still need to add that pesky require statement to config/environment.rb.

As always, there are bug fixes too (though we haven’t been tracking exactly which ones are fixed). We’re starting to file ActiveRecord-JDBC bugs in the JRuby JIRA now, and will be putting in future AR-JDBC versions to target soon too. So, please file new bugs in JIRA (and select component “ActiveRecord-JDBC”) rather than in the antiquated Rubyforge tracker.


RubyConf: Parting Thoughts

Posted by Nick Sieger Mon, 05 Nov 2007 17:57:34 GMT

RubyConf once again was thoroughly enjoyable. I highly recommend that any Rubyist who is on the fence about attending make it a priority to go next year. Here are some quick, random notes that didn’t quite fit into a full post.

  • For those of you who stopped by expecting to see the blow-by-blow of every minute of the conference like last year, my apologies. I think I set the bar a little too high for myself. It takes a lot of energy to stay focused on the sessions for the whole day. Perhaps it’s appropriate to pass the baton on to James Avery or Eric Mill for their 2007 coverage.
  • Venue (Omni Hotel Charlotte): Generally speaking, thumbs up. There were a couple of annoyances, though. 1. No non-emergency staircase to get to your room, causing huge lines for the elevators at the end of the afternoon. 2. Coffee was removed from the scene before 10 am, raising speculation that it was a conspiracy to drive business to the Starbucks in the mall below. 3. Toasters blew out the sound system on Sunday morning, forcing a PA system to be brought out and throwing a wrench in the rhythm of the morning talks.
  • I have to give props to Dr. Nic for avoiding getting burnt by the toaster incident and handling it really well. To boot, he gave one of the most entertaining talks at the conference, as the RubiGen video is sure to become an instant conference classic much like Adam Keys’ one-man-one-act event from last year.
  • Werewolf: I played one game, miserably. I was a werewolf, and when cornered by another in the game, mustered up the quote “I’m not an aggressive player, I prefer to feed off of other people.” Wow, what a freudian slip. While I can sympathize with Charlie’s comments about the game (and I do really enjoy late-night hackfests), I also have to agree with Chad and the other commenters that the two are not mutually exclusive, and the Werewolf games are wonderfully inclusive of RubyConf newbies and veterans alike.
  • The two-track approach in the afternoon this year seemed to go well, despite making it impossible to see all the talks. I would have liked to have seen Erik Hatcher’s Solr talk, but instead decided to give moral support to Kyle Maxwell’s JRuby in the Wild talk. I also missed the Saturday afternoon tracks to hang out in Stu’s Refactotum session.
  • Lots of good quotables: check out Nihilist and Twitter for some of the back-channel chatter.

See you next year!


RubyConf Day 3: Behaviour-Driven Development with RSpec

Posted by Nick Sieger Sun, 04 Nov 2007 16:26:00 GMT

David Chelimsky and Dave Astels: RSpec

describe TestDrivenDevelopment do
  it "is an incremental process"
  it "drives the implementation" 
  it "results in an exhaustive test suite"
  # but also...
  it "should focus on design"
  it "should focus on documentation"
  it "should focus on behaviour"
end

class BehaviourDrivenDevelopment < TestDrivenDevelopment
  include FocusOnDesign
  include FocusOnDocumentation
  include FocusOnBehavior
end

When doing test-driven development:

  • Write your intent first. The smallest test you can write that fails.
  • Next, write the implementation. The simplest thing that could possibly work.
  • Even though you may be tempted to think about additional edge cases, multiple requirements, etc., you should try to be disciplined and focus only on the immediate tests. Only after you’ve made one test fail, then pass, can you continue on to other tests.

RSpec history

Initially BDD was just a discussion among Aslak Hellesoy and Dan North in the ThoughtWorks London office. Dave Astels joined the conversation with a blog post stating that he thought these ideas could be easily implemented in Smalltalk or Ruby. Steven Baker jumped in with an initial implementation, and released RSpec 0.1. Later in 2006, maintenance was handed over to David Chelimsky. RSpec has evolved through a dog-fooding phase up to the present 1.0 product.

BDD is no longer just about “should instead of assert”; it’s evolving into a process. Emphasizing central concepts from extreme programming and domain-driven design, it’s moving toward focusing on customer stories and acceptance testing. It’s outside-in, starting from a high-level view rather than from low-level specs or tests like those written with RSpec or Test::Unit.

Story Runner

Story Runner is a new feature intended for RSpec 1.1. Each story is supposed to capture a customer requirement in the following general template:

As a (role) ... I want to (some function) ... so that (some business value).

It uses a “Scenario … Given … When … Then …” format to express the high level stories. Scenarios are a series of given items, steps, and behaviour validations. Once the basic steps are established, they can be re-used. David even demonstrated a preview of an in-browser story runner that would allow the customer to play with the implementation and create new scenarios.
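
To give a feel for how reusable steps fit together, here is a toy sketch in plain Ruby. It is emphatically not RSpec’s Story Runner API (I didn’t capture the real syntax in my notes); the step and run_scenario helpers and the bank-account steps are invented for illustration.

```ruby
# Toy illustration only: this is NOT RSpec's Story Runner API.
STEPS = {}

# Register a reusable step under its plain-English text.
def step(text, &block)
  STEPS[text] = block
end

step("an account with a balance of") { |world, amount| world[:balance] = amount }
step("I withdraw")                   { |world, amount| world[:balance] -= amount }
step("my balance should be") do |world, amount|
  raise "expected #{amount}, got #{world[:balance]}" unless world[:balance] == amount
end

# Run each [keyword, step text, argument] line against a fresh "world" hash
# and return that hash so the final state can be inspected.
def run_scenario(name, *lines)
  world = {}
  lines.each do |_keyword, text, argument|
    STEPS.fetch(text).call(world, argument)
  end
  world
end

# Two scenarios built from the same three steps.
run_scenario("withdrawal within balance",
  [:Given, "an account with a balance of", 100],
  [:When,  "I withdraw", 30],
  [:Then,  "my balance should be", 70])

run_scenario("two withdrawals",
  [:Given, "an account with a balance of", 100],
  [:When,  "I withdraw", 30],
  [:When,  "I withdraw", 20],
  [:Then,  "my balance should be", 50])
```

The point is only that once a step is defined, any number of scenarios can be assembled from it without writing new code.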

Pending

Pending is a nice way to mark specs as “in-progress”. You can either omit a block for your spec, or use pending inside the block to leave a placeholder to come back to.

describe Pending do
  it "doesn't need a block to be pending"
  it "could also be specified inside the block" do
    pending("TODO")
    this.should_not be_a_failure
  end
  it "could also use a block with pending, and you will be notified when it starts to succeed" do
    pending("TODO") do
      this.should_not be_a_failure
    end
  end
end

Behaviour-Driven Development in Ruby with RSpec is a new book David and Aslak are working on, due out early next year.

Update: David has posted his slides.


RubyConf Day 2: Morning Sessions

Posted by Nick Sieger Sun, 04 Nov 2007 02:12:00 GMT

John Lam: IronRuby

Why IronRuby? John started with RubyCLR, which was a bridge between two languages/environments (.NET CLR and Ruby). Last year he didn’t know he’d be uprooting his family from Toronto and moving to Seattle. Now he finds himself in Microsoft trying to make sense of his new position. He describes a number of higher level goals for himself and IronRuby at Microsoft.

Change or die. Involvement in open source can only go up, right? The challenge is that the company is already doing well, so it’s hard to convince middle management that anything should change.

Open source. To their credit, the IronRuby team appears to be on the leading edge of open source at Microsoft (cf. the Microsoft Public License). They also had planned all along to take external contributions, and have in fact started to receive them.

Rails. One of the key goals is to be true to the language, and that includes being able to run Rails.

Performance. Use IronRuby as a testbed for DLR performance testing.

John is showing the REPL now (running under Mono actually), pointing out that “integer math is now supported” (apparently early on someone pointed out that subtraction didn’t work) and that CLR list types automatically appear like Ruby arrays.

Heavy DLR pitch ahead. Performance history, how the CLR used to be slow for dynamic languages, and how it’s better now.

John is running the Rubinius specs now, and showing only 373 out of 1030 failing. (It looked like he was running the core specs only.) Praise for the Rubinius team!

It’s possible to bind C# types to Ruby using annotations. Lots of C# code being shown, including a mess of generated code.

John also showed a XAML/Silverlight demo that was scripted by Ruby.

Charles Nutter and Thomas Enebo: JRuby

JRuby: “Not Just” JRuby for the JVM. I found it hard to take notes for this talk since I’m so close to it. Fortunately, their slides were pretty verbose and comprehensive, and hopefully will be posted shortly.

Evan Phoenix: Rubinius

Rubinius talk in roller derby mode. Ask questions early and often.

What is the end game of Rubinius (or JRuby, or IronRuby)? Total. World. Domination. For Ruby!

Rubinius is 3 things: form, function, and elbow grease. Ruby::Syntax, Ruby::Behavior, and Google.search("crazy cs papers").

Rapid fire CS Nerd attack mode coming. Generational collection, bytecode execution, stackless, bytecode representation, .rba archives.

Who would rather program C than Ruby? Java? C#? (Only one guy raised his hand that he’d rather code C.)

Hard-hitting portion of the talk. The kernel, broken down.

  • 1.8

    • 84,516 lines of C
    • 0 lines of Ruby
  • 1.9

    • 128,786 lines of C
    • 0 lines of Ruby
  • IronRuby

    • 48,282 lines of C#
    • 0 lines of Ruby
  • JRuby

    • 114,507 lines of Java
    • 0 lines of Ruby*

(*Even though I got heckled for saying it, JRuby does actually have some code written in Ruby that’s not the standard library.)

  • Rubinius
    • 25,398 lines of C
    • 13,946 lines of Ruby

1.8 and 1.9 are really Ruby for C programmers. JRuby is Ruby for Java programmers. IronRuby is Ruby for C# programmers. But Rubinius is Ruby for Ruby programmers.

Dogfooding. Gives feedback, which enables tighter loops, improves the kernel, makes life better for everyone on the platform.

Road, rubber, all that jazz. Evan mentions that Rubinius runs 24 of 31 benchmarks faster than Ruby 1.8, but the numbers are shifting rapidly. Evan wanted a 1.0 for RubyConf, but he has come to realize that several things are more important than a milestone. Design, and the technical challenges, certainly. But more importantly, the community.

Taking a cue from the Perl 6 community, -Ofun. The free-flowing commit bit, where patch submitters whose patches are accepted are immediately granted commit rights, has given rise to 57 committers. 17 of these have changed more than 400 lines of code.


RubyConf Day 1: Morning Sessions

Posted by Nick Sieger Fri, 02 Nov 2007 15:35:00 GMT

Marcel Molina: What Makes Code Beautiful?

What is beauty? Marcel explores this topic, starting with posing the question to the audience. “My wife!” Marcel: Why is she beautiful? “Longer answer than you want!”

Marcel comes from a literature/linguistics background, and is interested in how meaning is conveyed: not just by the basic words themselves, but by context and expressivity as well.

Note: Marcel has given this talk before.

History of beauty

Pythagoras: was out in the street, heard the blacksmith’s clanging hammer, and was drawn to the noise. He recognized, through closer inspection, that the different sounds that came from the different hammers had relationships, and eventually saw similar relationships in other parts of nature, architecture, and so on.

Thomas Aquinas: Three things that define beauty: 1. Proportion. The economy of size and ratio of parts. The smallest thing that works. 2. Integrity. Well-suited for the purpose. 3. Clarity. Clear and simple.

Each of the qualities is necessary, but none is sufficient. For example, proportion (economy) will often clash with clarity. This is especially true in code.

Applied to software

Case study: coercion. Converting XML strings into rich Ruby equivalents. Marcel’s initial solution was a CoercibleString < String, which used a generator to iteratively try to coerce XML attributes to a number of types, and return the results. ~20 lines of code to convert to 4 types. His second version was a simple class method on String with a case statement.
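I didn’t capture Marcel’s actual code, but the second version might look roughly like the sketch below. The method name coerce, the four target types, and the patterns are all my own assumptions for illustration.

```ruby
class String
  # Hypothetical helper: the name, the patterns, and the choice of
  # types are illustrative guesses, not the code shown in the talk.
  def self.coerce(value)
    case value
    when /\A-?\d+\z/      then Integer(value, 10) # "42"   becomes 42
    when /\A-?\d+\.\d+\z/ then Float(value)       # "3.5"  becomes 3.5
    when "true"           then true
    when "false"          then false
    else value                                    # anything else stays a String
    end
  end
end

String.coerce("42")     # => 42
String.coerce("true")   # => true
String.coerce("hello")  # => "hello"
```

The whole thing is one case statement, which is the trade the talk was pointing at: less machinery than a String subclass with a generator, and arguably clearer.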

Kent Beck’s Smalltalk Best Practice Patterns is a book about writing good software but, in Marcel’s opinion, it arrives at a definition of beauty by describing aspects of code that reflect proportion, integrity, and clarity.

Niels Bohr: “An expert is a person who has made all the mistakes that can be made in a very narrow field.” Marcel calls his CoercibleString a mistake, but one that helped him learn more about coding.

Luckily for us, Ruby is optimized for beauty.

Jim Weirich: Advanced Ruby Class Design

Emphasizing “Ruby” more so than “Advanced”, through three examples that illustrate techniques not commonly found in statically-typed OO languages (Java/C++/Eiffel).

Rake::FileList

FileList['lib/**/*.rb']

FileList sports globbing, a specialized to_s, and lazy evaluation. First version: class FileList < Array; end. Good idea, right? Well, with lazy evaluation, resolution of filenames happens only when the list is accessed, not created, so a lot of methods need to be overridden:

def [](index)
  resolve unless @resolved
  super
end

The problem becomes that FileList too closely mimics Array, and cannot distinguish itself in the case that matters. So it was changed to delegate to array rather than inherit.

Moral: when you want to mimic built-in classes, it might be better to implement #to_ary or #to_str rather than inherit.
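To make the delegation idea concrete, here is a minimal sketch. LazyFileList and its method list are hypothetical names of mine, not Rake’s actual implementation; the point is only that the class wraps an Array, resolves lazily, and offers to_ary instead of inheriting.

```ruby
# Hypothetical LazyFileList: a sketch of delegation plus #to_ary,
# not the real Rake::FileList.
class LazyFileList
  def initialize(*patterns)
    @patterns = patterns
    @items = []
    @resolved = false
  end

  # Forward a handful of Array methods, resolving the globs first.
  [:[], :size, :each, :first].each do |name|
    define_method(name) do |*args, &block|
      resolve unless @resolved
      @items.send(name, *args, &block)
    end
  end

  # Implicit conversion hook: lets Ruby treat the list as an Array
  # wherever one is expected (for example, Array#+).
  def to_ary
    resolve unless @resolved
    @items
  end

  def resolved?
    @resolved
  end

  private

  # Expand every pattern against the file system, once.
  def resolve
    @items = @patterns.flat_map { |pattern| Dir.glob(pattern) }.sort
    @resolved = true
  end
end

list = LazyFileList.new("lib/**/*.rb")
list.resolved?   # => false, nothing has touched the disk yet
list.size        # first access triggers the glob
list.resolved?   # => true
```

Because the wrapper only forwards the methods it chooses to, it is free to behave differently from Array where that matters.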

Builder::XmlMarkup

What’s the problem here?

  b = Builder::XmlMarkup.new
  b.student do
    b.name "Jim"
    b.phone_number "555-1234"
    b.class "Intro to Ruby"
  end

class is already a method on Object. This begat BlankSlate, which removes unnecessary methods from Object. Several techniques were applied to eventually arrive at the latest version:

  • Use undef_method to hide methods that we don’t want. Except, leave methods beginning with double-underscore alone (__id__ and __send__).
  • Catch new methods added via a method_added hook on Kernel, and an append_features hook on Object, to deal with methods defined and modules included after BlankSlate was created
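The first technique can be sketched in a few lines. TinyBlankSlate and TagRecorder are hypothetical names of mine, and this is a simplification rather than Builder’s real BlankSlate: it leaves out the method_added and append_features hooks entirely, and it also keeps instance_eval and object_id, which is my own choice.

```ruby
# A hypothetical, simplified blank slate; not Builder's implementation.
class TinyBlankSlate
  # Keep anything starting with "__", plus instance_eval and object_id.
  KEEP = /\A(__|instance_eval\z|object_id\z)/

  instance_methods.each do |name|
    undef_method(name) unless name.to_s =~ KEEP
  end
end

# A toy recorder (also hypothetical) that treats every unknown call as a tag.
class TagRecorder < TinyBlankSlate
  def initialize
    @tags = []
  end

  def __tags
    @tags
  end

  private

  def method_missing(name, *args)
    @tags << [name, args.first]
    self
  end
end

r = TagRecorder.new
r.name "Jim"
r.class "Intro to Ruby"  # reaches method_missing, since Object#class is hidden
r.__tags                 # => [[:name, "Jim"], [:class, "Intro to Ruby"]]
```

Ruby 1.9 and later ship BasicObject, which covers much of the same ground without the undef_method loop.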

TableNode

Problem: magic conversion of Rails conditions to SQL. An example: User.find(:all).select{|u| u.name == "Jim"}. We don’t really want to load the entire database to do this, but we don’t like writing SQL either.

Solution: Record the actions in the select block by yielding a special TableNode object that captures the method calls and translates to SQL on the fly. Now we can write User.select {|u| u.name == "Jim"} and have it still execute SQL.

  • Capture methods called and wrap in a MethodNode to convert to SQL column references
  • Capture operators and wrap in a BinaryOpNode to handle ==, <, etc.

Clever! Will this work? Here are some issues:

  • Small issue – ordering: User.select {|u| "Jim" == u.name} will not work without messing with String#==.
  • Bigger issues: && and || are not override-able in Ruby. What’s worse, ! has pre-defined semantics (in the parser) and cannot be captured.
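Here is a rough sketch of the recording idea along with the ordering issue. The class names follow the talk, but the bodies, the conditions_for helper, and the naive quoting are my own assumptions; the code builds a SQL fragment as a string and does not touch a database.

```ruby
# Hypothetical bodies throughout; a sketch of the idea, not the talk's code.
class BinaryOpNode
  def initialize(left, op, right)
    @left, @op, @right = left, op, right
  end

  def to_sql
    "#{@left.to_sql} #{@op} #{quote(@right)}"
  end

  private

  # Naive quoting, for illustration only.
  def quote(value)
    value.is_a?(String) ? "'#{value.gsub("'", "''")}'" : value.to_s
  end
end

class MethodNode
  def initialize(table, column)
    @table, @column = table, column
  end

  def to_sql
    "#{@table}.#{@column}"
  end

  # Operators are ordinary methods, so they can be recorded too.
  def ==(other)
    BinaryOpNode.new(self, "=", other)
  end

  def <(other)
    BinaryOpNode.new(self, "<", other)
  end
end

class TableNode
  def initialize(table)
    @table = table
  end

  # Any unknown method is taken to be a column reference.
  def method_missing(name, *args)
    MethodNode.new(@table, name)
  end
end

# Hypothetical stand-in for User.select: returns the SQL condition.
def conditions_for(table)
  yield(TableNode.new(table)).to_sql
end

conditions_for("users") { |u| u.name == "Jim" }  # => "users.name = 'Jim'"

# The ordering issue: String#== runs instead of MethodNode#==.
"Jim" == MethodNode.new("users", :name)          # => false
```

Since TableNode here inherits from Object, a column named class or object_id would never reach method_missing, which is the same problem BlankSlate addresses.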

Lessons learned

  • Don’t be afraid to think beyond prior experiences to come up with new ways of solving problems in code.
