Nick Sieger

JRuby Performance Tweaks

Posted by Nick Sieger Thu, 25 Oct 2007 16:04:17 GMT

The back-story for my post on JRuby performance is that I was actually doing some benchmarking while applying successive tweaks to the JRuby environment to see how they affected our application’s performance. Only after running the final set of tweaks for larger numbers of requests did I notice that JRuby was catching up to MRI!

I thought it would be interesting to point out the tweaks themselves, to reiterate the point that JRuby and the JVM give you lots of knobs to tune your application. The notepad of tweaks, numbers and the script used to run them is here.

I started by running the application with the “out of the box” setup: JRuby 1.0.1. I then started to apply tweaks to see how the numbers changed.
The first tweak was to turn off ObjectSpace, which is pure overhead for JRuby.
Next, I enabled JREXML since our application uses ActiveResource.
Next, I tried enabling Charlie’s discovery in the rexml/source. It appears that the benefit was negated by JREXML, so I went back and did another run with the rexml/source patch and without JREXML as 2a. 2a gave almost as much benefit by itself as JREXML, so that’s an option if you don’t want the additional dependency, but you should measure for yourself since the performance profiles of those two fixes for XML may differ depending on the amount of XML consumed. In our case it’s relatively small, a few kilobytes at most.
Next, I switched to JRuby trunk instead of 1.0.1. Trunk has, among other improvements, a complete compiler, which should allow more of the application to be translated to Java bytecode.
The last tweak for this study was to change to the “server” VM. The server VM is known to take longer to warm up, but is more aggressive in the optimizations it performs.

The beauty of this exercise is that all the changes made provided small performance boosts for the application. Going forward, we hope to make more of this baked-in behavior (ObjectSpace off by default, possibly the rexml/source fix), but it will still help to have knowledge of and play around with the hotspot settings for the JVM.

There are a few more things I’d like to try in the future: JDK 6 is reportedly a lot faster all by itself, and the standalone Glassfish gem may give Mongrel a run for his money. There is still plenty of work left for the impending JRuby 1.1 release, so we should see the performance improvements for Rails applications running on JRuby continue to roll in.

Tags jruby | no comments

JRuby on Rails: Fast Enough

Posted by Nick Sieger Thu, 25 Oct 2007 03:36:00 GMT

People have been asking for a while how fast JRuby runs Rails. (Of course, “fast” has always been a relative term.) We haven’t been quick to answer the question, because frankly we didn’t know. We hadn’t been building real Rails applications on JRuby ourselves yet, and there was no definitive word from the crowd either.

Recently, several guys from ThoughtWorks have been working on a Rails petstore application and benchmark to get to the heart of the matter. Discussion has been heated on the JRuby mailing list, but results have not been conclusive yet.

In the project I’m working on, we’ve committed to using and deploying on JRuby. Eventually we were going to reach the point where we’d need to find out how well our application runs. So today I began running a simple single request benchmark on a relatively busy page. The numbers turned out to be rather surprising:

[Table: Requests vs. Average; values not preserved]

(The raw data is available here.)

Now, MRI (C Ruby) will always run about the same speed no matter how many runs you give it, but it’s well known that the JVM needs time to warm up. And indeed it does; after 250 iterations, Mongrel running on JRuby finally surpasses MRI. The JRuby/Goldspike/Glassfish combo comes close as well.
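
The warm-up effect is easiest to see when serial requests are timed in batches rather than as one lump. What follows is a minimal sketch, not the script used for the numbers above: `batch_averages` is a hypothetical helper, and the block passed to it is a stand-in workload. For a real test the block would issue an HTTP request against your own application.

```ruby
require "benchmark"

# Run the given block `total` times, one call after another, and return the
# average time per call for each batch of `batch_size` calls.
def batch_averages(total, batch_size)
  (1..total).each_slice(batch_size).map do |batch|
    elapsed = Benchmark.realtime { batch.each { yield } }
    elapsed / batch.size
  end
end

# Stand-in workload (an assumption); replace it with a real request.
averages = batch_averages(250, 50) { (1..1_000).reduce(:+) }

averages.each_with_index do |avg, i|
  puts format("batch %d: %.6f s/request", i + 1, avg)
end
```

On MRI the batch averages should stay roughly flat; on a JVM, the later batches are where any gains would show up.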

Some details about the setup:

I ran the tests on my MacBook Pro Core 2 Duo 2.4 GHz. I didn’t disable one of the cores for the tests, which means that JRuby has an advantage over MRI because it can use both (native threads at work). However, the test script ran the requests serially, which means that the advantage was minimal.
The application is indeed of the “hydra” variety; the setup is nearly identical to the second diagram on that page. So a single request is passing through not one, but two Rails applications in addition to touching the database. It rendered an HTML ERb view with data from an ActiveResource-accessed RESTful service. The applications are based on Rails 1.2.3.
MRI version is using Ruby 1.8.6 and Mongrel 1.0.1.
JRuby Mongrel is also version 1.0.1 (details on installing it here)
JRuby on Glassfish used Glassfish 2 and Goldspike 1.4, deployed in war files via Warbler.
The two JRuby setups used JDK 1.5 and were tweaked to disable ObjectSpace and use the “server” VM (-server argument to the JVM).

The main point I wish to make with these numbers is that JRuby performance is there today, and still has room to grow. There’s no longer any doubt in my mind. Yes, this is a simplistic application benchmark run on a developer’s machine, but it’s a real application. The test may not be exacting in precision, but I see enough in the numbers to believe that this will be replicable to production environments. The plot thickens!

Tags jruby, rails, ruby | 1 comment

Product or Platform?

Posted by Nick Sieger Sat, 20 Oct 2007 03:34:45 GMT

This week, I attended the Sun 2007 Open Source Summit on the Sun Microsystems Santa Clara campus. This was a conference put on by Sun for Sun employees. I happened to be in the bay area, so I took it in.

Open Source at Sun

First off, let me say that it was comforting being in the company of 200+ colleagues, most of them smarter and more experienced in open source than me. These are the vanguard of the company, the people spreading the message inside and outside Sun about our decision to open source all of our software, and the ramifications of that. The message is in good hands, but the only downside is what a tiny fraction of the overall company this group represents. So for those of you observing Sun representatives seemingly going in many different directions, behaving subtly differently from project to project and community to community, we want to hear your feedback. The fact that we’re meeting this week can certainly be an indicator that we’re still figuring out how to do open source effectively, to retrofit the message into the company culture, to build and contribute to communities and to raise all boats.

Some of the topics discussed include:

Community building (around Sun-originated projects)
Community participation (in projects external to Sun)
Open (source) testing, documentation, and standardization
Marketing and legal: podcasting, public relations, trademarking issues
Going open source on a previously closed-source product, dealing with code encumbrances, etc.

Product or Platform?

Among the other topics discussed was the notion of a product versus a platform, and what it means for an open source project and its surrounding community. Dalibor Topic brought up this point in a panel session, which included several participants external to Sun, on how Sun is perceived from the outside.

Without providing definitions of those two, take a moment and think of a few thriving open source communities. Would you think of these projects as products or platforms? In general, does one kind of project foster community better than the other?

Dalibor pointed out that projects that form platforms tend to be more fertile ground for community-building. Why is this? After discussing this with him, a few attributes came to mind.

Variety. Platforms are likely to be more extensible and offer more modularity and variety. As a consequence, developers have more possibilities to “scratch their itch”.

Loosely connected communities. Platforms are more likely to be broken into component projects, and each component offers a more intimate community setting. The smaller size is an incentive to developers to invest time, as the reward of gaining a reputation as a community leader becomes easier to visualize and obtain. Thriving communities have a face, and that face should be a real person. The chance to be the face of a project and gain the prestige that comes with it is a big incentive for open source developers.

Low barrier to entry. The sub-projects’ codebases are not gargantuan or monolithic. Developers can bootstrap more quickly and get to the stage where they’re contributing much faster. As a counter-example, consider three of Sun’s largest open source projects: OpenJDK, OpenSolaris, and OpenOffice. Checking out source from source control, building and running these codebases is not a 10 minute proposition, and that presents a big barrier to entry.

Balance of contributions. If the barrier to entry is lower, the likelihood of attracting contributors is higher. Projects are more likely to be seen as worthwhile if contributions are not lopsided and they are not controlled by a single entity.

OpenJDK: Pushing Toward “Platform”

Dalibor stated in the panel session that he thought Sun had done an admirable thing by releasing OpenJDK under the GPL, and that everything had been executed well up to this point. OpenJDK is on the cusp of being a platform with a lot of energy and mojo, but there are still some barriers that need to be knocked down to allow the community to grow and prosper. Encumbrances and proprietary TCK licenses are one item, but the “product”-ness of the JDK is another. If OpenJDK can become a “platform” in the ways that Dalibor and I talked about, it will go a long way towards ensuring the long-term viability of the project, the community, and even the ubiquity of Java itself as a technology.

Footnote: Thanks to Simon Phipps and his team for leading an engaging, thought-provoking discussion. Looking forward to next time!

Tags openjdk, opensource, sun | no comments

Obscure and Ugly Perlisms in Ruby

Posted by Nick Sieger Sat, 06 Oct 2007 12:39:00 GMT

So, it’s well known that Ruby owes a debt to its predecessor Perl, although some (maybe many) question whether we should repay that debt or even go so far as to put Perl on trial and excise those elements which somehow haphazardly survived the generation gap. It turns out the evidence is mixed.

Update: I use the word “obscure” in the title because, in my experience, they are obscure. “Ugly” is pure opinion, but this is my blog, after all.

Exhibit A: BEGIN/END

Update: Yes, yes, this is an awk-ism, not a perlism, strictly speaking. And I don’t deny its usefulness for pure scripting tasks. I just don’t see its utility in a larger application.

END {
  puts "Bye!"
}

puts "Processing..."

BEGIN {
  puts "One moment while I start your program"
}

Output:

One moment while I start your program
Processing...
Bye!

Why would any sane Ruby programmer do this? Have you ever seen a use for BEGIN that isn’t met by simply executing code at the top level of the main program? Geez, BEGIN even has its own node in the AST!

And how about END? If you really need to hook into interpreter shutdown, just use Kernel#at_exit. (In fact, Rubinius currently uses END simply as an alias for at_exit.)
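
For comparison, here is a minimal sketch of the at_exit approach. The `finished` flag is just an illustration of something a shutdown hook might want to inspect; it is not from any real application.

```ruby
finished = false

# The block runs when the interpreter shuts down, whether the script
# reached its last line or bailed out early.
at_exit do
  puts(finished ? "Bye!" : "Exiting early")
end

puts "Processing..."
finished = true
```

If you register several handlers, they run in reverse order of registration.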

Exhibit B: <> (ARGF)

Thank goodness we didn’t get the diamond operator in Ruby, but we did get ARGF as a replacement. Though obscure, it actually turns out to be useful. Consider this program, which prepends copyright headers in-place (thanks to another perlism, -i) to every file mentioned on the command-line. Any other creative uses of ARGF out there?

#!/usr/bin/env ruby -i

Header = DATA.read

ARGF.each_line do |e|
  puts Header if ARGF.pos - e.length == 0
  puts e
end

__END__
#--
# Copyright (C) 2007 Fancypants, Inc.
#++
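
As one more small illustration, ARGF can also tell you which file it is currently reading. This is a minimal sketch: `line_counts` is a hypothetical helper, and the two temporary files only stand in for paths you would normally pass on the command line.

```ruby
require "tempfile"

# Count lines per input file. ARGF walks the paths in ARGV one after another
# (or reads STDIN when ARGV is empty), and ARGF.filename names the current one.
def line_counts(argf = ARGF)
  counts = Hash.new(0)
  argf.each_line { counts[argf.filename] += 1 }
  counts
end

# Throwaway files standing in for command-line arguments (an assumption
# made so the example can run on its own).
first = Tempfile.new("first")
first.write("a\nb\n")
first.close

second = Tempfile.new("second")
second.write("c\n")
second.close

ARGV.replace([first.path, second.path])
counts = line_counts
counts.each { |path, n| puts "#{n} #{path}" }
```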

Exhibit C: The Flip-flop

This is a weird beast. I didn’t even know of its existence until Charlie was complaining about having to compile it properly. Apparently we have Perl to thank for this nonsense as well (and, indirectly, sed). With the exception of the sed-ism, I’m not convinced it adds any value -- in fact the code usually ends up looking more verbose.

This program, when run with itself as an argument, prints out everything between BEGIN and END.

#
# BEGIN

ARGF.each_line do |line| 
  if (line =~ /^# BEGIN/)..(line =~ /^# END/)
    puts line
  end
end

# END
#

This snippet is a long-hand way to do 5.upto(10) {|i| puts i}.

i = 5
while (i == 5)...(i == 10) do
  puts i
  i += 1
end
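
To see what the two-dot flip-flop in the first program is doing, here is a rough long-hand equivalent with an explicit boolean. It is only a sketch: `lines_between` is a hypothetical helper name, and it takes an array of lines rather than ARGF so that it can be run on its own.

```ruby
# Keep every line from the first one matching start_pattern through the
# next one matching end_pattern, inclusive. The same thing can repeat
# later in the input, just as with the flip-flop.
def lines_between(lines, start_pattern, end_pattern)
  inside = false
  lines.select do |line|
    inside = true if !inside && line =~ start_pattern  # switch on
    keep = inside
    inside = false if inside && line =~ end_pattern    # switch off after this line
    keep
  end
end

sample = ["#\n", "# BEGIN\n", "puts 1\n", "# END\n", "#\n"]
puts lines_between(sample, /^# BEGIN/, /^# END/)
```

Whether that is clearer than the flip-flop is a matter of taste, but at least the state is visible.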

Exhibit D: Output from defined?

Not sure if this came from Perl.

The basic need for defined? in a dynamic language is unquestionable. Instead, I meant to highlight the fact that defined? returns a string value here, which is strange.

Constant = "Constant"
@ivar = [1, 2, 3]
integer = 10

puts "const : #{defined?(Constant)}"
puts "ivar  : #{defined?(@ivar)}"
puts "global: #{defined?($0)}"
puts "local : #{defined?(integer)}"
puts "expr  : #{defined?(Constant + integer)}"

Running this code produces:

const : constant
ivar  : instance-variable
global: global-variable
local : local-variable
expr  : method

Perl at least is sane enough to return true or false for its own defined operator. But method? Looking at the source, I see also expression, local-variable(in-block), assignment, class variable, true, false, and self. But why would this output be useful? As if it isn’t already plainly obvious what is defined?.
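
In practice the string is rarely inspected. Since defined? returns nil for anything undefined, and any string is truthy, the result usually just feeds a condition. A minimal sketch follows; the `Settings` class and its `lookup` method are made up for illustration.

```ruby
checks = {
  "String"         => defined?(String),          # "constant"
  "NoSuchConstant" => defined?(NoSuchConstant),  # nil
  "@unset"         => defined?(@unset),          # nil
  "puts"           => defined?(puts)             # "method"
}
checks.each { |label, result| puts "#{label}: #{result.inspect}" }

# Memoizing a value that may legitimately be nil: `@value ||= lookup` would
# repeat the lookup every time, while defined? notices the variable is set.
class Settings
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  def value
    @value = lookup unless defined?(@value)
    @value
  end

  private

  # Hypothetical expensive call that happens to return nil.
  def lookup
    @lookups += 1
    nil
  end
end

settings = Settings.new
2.times { settings.value }
puts "lookups: #{settings.lookups}"
```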

Any other obscure features in Ruby that you love to hate?

Tags ruby | 24 comments

RailsConf Europe: Hydra

Posted by Nick Sieger Sat, 06 Oct 2007 12:04:00 GMT

Hydra

On September 19, Craig and I presented our talk at RailsConf, which appears to have been well received. It went off mostly without a hitch, apart from a couple of hiccups in the demos. I apparently didn’t practice them enough, because I either missed a couple of critical steps or did them out of order and confused myself. But that’s ok, because I’m releasing the demo steps, source and slides here so you can try them out for yourself.

So, download the zip and follow along. The contents look like this:

1-active-resource-basics.txt
2-make-resourceful.txt
3-atom-roller.txt
4-service-chatter.txt
RailsConfEurope-Hydra.pdf
demo/
demo-baked/

The demo steps are in the text files; I’d recommend going in the order specified. It turns out the third isn’t really a demo but more of a code review, because it also requires you to have Roller set up and I’m not going into those details for now. If you want to just run the demos, skip to demo-baked, the finished product.

For those of you who didn’t see the talk, here’s the basic message.

Look at your basic MVC Rails app.

Single app

Why not consider splitting it into two? ActiveResource allows you to access a RESTful resource in your Rails application like it was just another model.

Double app

This might seem like overkill for a simple application, but what if you had an e-commerce application domain like this?

E-Commerce app

Splitting up your code into separate Rails applications encourages encapsulation, reduces potential coupling, and gives you more flexible deployment options. Basing interactions upon REST and HTTP means that you can more easily mash up data or create caching strategies, given proper usage of ETags/Last-Modified and/or cache-control headers. The great thing is that existing HTTP reverse proxies can be used without having to mix the caching code in with your application code.
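
To make the caching point a little more concrete, here is a minimal sketch of the conditional GET logic that ETags enable. It is plain Ruby, not Rails code from our application: `conditional_get` is a hypothetical helper, using an MD5 of the body as the ETag is an assumption, and the status/headers/body triple is just a convenient way to show the result.

```ruby
require "digest"

# Answer 304 Not Modified with an empty body when the client already holds
# the current representation, otherwise send the full body with its ETag.
def conditional_get(body, if_none_match = nil)
  etag = %("#{Digest::MD5.hexdigest(body)}")
  if if_none_match == etag
    [304, { "ETag" => etag }, ""]
  else
    [200, { "ETag" => etag }, body]
  end
end

status, headers, _body = conditional_get("<post>Hello</post>")
puts "first request:  #{status}"

# The client (or a reverse proxy) sends the ETag back on the next request.
status_again, = conditional_get("<post>Hello</post>", headers["ETag"])
puts "second request: #{status_again}"
```

A reverse proxy sitting between two applications can do this revalidation on the client's behalf, which is why the caching code does not have to live in the application.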

In the application we’re building, we have a number of components that are not Rails-based. To expose them to our environment, we’ve taken the strategy of exposing a simple REST web service for the component, and then it can be consumed by the other applications using ActiveResource. The REST web service can either be implemented in the component’s native language/technology, or in some cases we’ve written a wrapper service in Rails since Rails makes it so easy to build REST interfaces. In that case, Rails is pure integration -- RESTful glue.

The idiom that makes this all possible is the uniform interface. In HTTP, this means addressability (each resource gets a unique URI) coupled with the HTTP method verbs HEAD, GET, POST, PUT, and DELETE. Inside your Rails applications, it’s the ActiveRecord interface. If you keep your controllers skinny, you can boil the interface down to the following set of methods (in this case, for the prototypical blog post model):

Post.new/Post.create
Post.find
@post.save
@post.update_attributes
@post.errors
@post.destroy

And in fact, this is precisely what ActiveResource provides, and it’s enabled by duck-typing. Walking through the demos illustrates this pretty well, as you’ll basically swap ActiveRecord for ActiveResource with no noticeable difference, all the way down to validation errors in the scaffolded forms (which I was unable to demo in the talk due to the hiccups).
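
The duck-typing idea can be sketched without Rails at all. Everything in this example is a stand-in invented for illustration: `LocalPost` plays the role of an ActiveRecord model, `RemotePost` the role of an ActiveResource model (with a hash where the HTTP call would be), and `show_title` the role of a skinny controller action.

```ruby
# Stand-in for an ActiveRecord model: the data lives in a local table.
class LocalPost
  ROWS = { 1 => "Hello from the database" }.freeze
  attr_reader :title

  def self.find(id)
    new(ROWS.fetch(id))
  end

  def initialize(title)
    @title = title
  end
end

# Stand-in for an ActiveResource model: same interface, but the data would
# come from a REST service (faked here with a hash instead of an HTTP call).
class RemotePost
  SERVICE = { 1 => "Hello from the REST service" }.freeze
  attr_reader :title

  def self.find(id)
    new(SERVICE.fetch(id))
  end

  def initialize(title)
    @title = title
  end
end

# The "controller" relies only on the shared interface, not on the class.
def show_title(model, id)
  model.find(id).title
end

puts show_title(LocalPost, 1)
puts show_title(RemotePost, 1)
```

Swapping one class for the other changes where the data comes from, but not the calling code.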

We’ve found that when you’re making RESTful web services, the controllers largely become boilerplate because of the uniform interface. make_resourceful has been a boon in that regard, as the demos also show. There are several plugins that help you DRY up your controllers (other approaches include resources_controller), so you have some choices there.

We mentioned some drawbacks in our experience with ActiveResource, which have largely been addressed for the upcoming Rails 2.0 release.

Finally, we noted that deployment could be a pain with so many Rails applications to keep running. To that end, we are leveraging JRuby and Glassfish to make this a non-issue, as we simply WAR up our Rails applications with warbler and let Glassfish take care of the rest. Performance is still an open question, but we plan to roll up our sleeves and make sure this combination really hums.

Enjoy the demos! Feel free to drop me an email if you have any questions or troubles with them.

Tags railsconf, railsconfeurope2007 | 5 comments
