JRuby and the Permanent Generation

Posted by Nick Sieger Thu, 21 Feb 2008 05:52:44 GMT

One of the aspects we have to work around building and improving a dynamic language implementation on the Java Virtual Machine is the way the JVM loads and executes bytecode. In order for JRuby to take advantage of the Hotspot just-in-time (JIT) compiler, JRuby needs to generate Java bytecode at runtime, during execution of Ruby code. If that bytecode gets executed often enough and meets certain other rather mysterious conditions, Hotspot will turn it into machine code.

Unfortunately the VM was originally designed to run one language well, and that’s Java. The only way to get bytecode loaded into the JVM is through the Java class loader. With Java, most if not all the code is compiled to bytecode ahead of time and the VM assumes (and optimizes) for the fact that classes will not be shuffled around in memory. Partly due to these assumptions, the JVM stores bytecode along with other class metadata in a separate heap called the “permanent generation”, or just “permgen”. (I’m guessing the name “permanent” was used because originally objects in this heap were probably not garbage collected).

However, because JRuby needs to provide the amount of dynamism that a Ruby programmer would expect (open classes; modules included; methods added/removed at any time), a Ruby class does not cleanly map one-to-one with a Java class. Instead, it’s easier to think of Ruby classes as method-bags. As a result, JRuby creates a new Java class for every Ruby method that it decides to compile down to bytecode. Additionally, since Ruby methods can come and go, in order for the method to be collectible by the garbage collector, it needs to live in its own class loader.

Of course, JRuby is not the first Java program to generate classes and load them at runtime (JSPs have been doing this for ten years). But it may create more class and class loader garbage than just about any program ever run on the JVM. For small programs, generating a class per method would be no big deal, but consider a Rails application: The Rails codebase itself has thousands of methods, but it also generates plenty of new methods at runtime.

Consider a non-trivial Rails application that makes liberal usage of the Ruby standard library, and also uses a handful of plugins, and the number of methods available for JRuby to compile can easily exceed 10,000. If the average overhead of a single JRuby method class is around 8K (varying due to method size, of course), this would occupy up to 80 megabytes of permgen space. (By contrast, the JVM’s default size of the permgen space is 64 megabytes, so we’re already over the limit). Now consider that, with Goldspike we need to use multiple JRuby runtimes in order to achieve concurrent requests due to Rails’ lack of thread-safety, and the number is multiplied further. If you were to deploy 4 Rails applications each with 4 active runtimes into a single application server, you’re looking at almost 1.2 gigabytes of permgen space necessary to run your applications! (Usually, it’s common to run multiple applications in a Java application server, but with Rails applications that may need to be reconsidered.)

Because of this multiplicative cost, shortly after JRuby 1.1RC1 was released we took the somewhat drastic measure of capping the number of methods that each runtime would JIT-compile to 2048. But after a while it became obvious even with a threshold-based approach, JRuby was still wasting a ton of permgen space with duplicate copies of compiled methods. So for 1.1RC2 we introduced a JIT cache that could be set up to be shared among multiple runtimes.

The figure below shows the effect. Under consideration is a single JRuby/Rails/Goldspike application deployed in Glassfish, with varying numbers of runtimes, right after deployment (cold) and after some warmup to load more Rails code and allow JIT to reach the method threshold. JRuby trunk revision 5545 had the 2K JIT threshold, but not the sharing. Revision 5931 is right before RC2 was released, with the method cache wired up. (We also took some measures to reduce permgen consumption in those 500 revisions, so some of that is visible as well.)

permgen

The just-released Goldspike 1.5 will use this shared method cache by default, so all you’ll have to do is upgrade to receive the benefits. An easy way to do that is to install Warbler 0.9.3, an update which bundles Goldspike 1.5 and JRuby 1.1RC2.

These kind of techniques to reduce JRuby’s permgen overhead are only going to go so far when the underlying VM still isn’t expecting to be abused in this way. That’s why we’re looking forward to a JRuby that will be able to take advantage of proposed future enhancements like anonymous classes/method handles as part of John Rose’s Da Vinci Machine project. For more information along those lines, head over to Charlie’s discussion of what comes next.

Tags ,  | 2 comments

Screencast: RSpec and NetBeans

Posted by Nick Sieger Fri, 08 Feb 2008 15:13:16 GMT

A new screen cast is up with yours truly showing off NetBeans’ RSpec support. Additionally, I tried to make it interesting to a wider audience by really showcasing RSpec’s strengths, and trying to capture some of the red-green-refactor rhythm. NetBeans does work really well for this, but in my mind, the star of the show is RSpec.

I’m pleased with how it turned out considering I hadn’t done this sort of thing before. Special thanks to Cindy Church for putting it all together, including all the production: setup, recording, editing, even the music!

A QuickTime movie version is available as well. Check it out and let me know what you think.

Tags , , ,  | 1 comment

Why DTrace Makes Leopard a Must-Have Upgrade

Posted by Nick Sieger Tue, 05 Feb 2008 21:12:00 GMT

I feel like I’m actually a relative late-comer to Leopard, at least in my social circle. A lot of the folks in the Ruby community already had it installed the week after it was out, and were showing it off at RubyConf back in November. I just didn’t have a compelling reason to upgrade and disrupt my workflow at the time. Plus, mixed reports were coming out about data loss, UI nits, and other instabilities.

By the time I went out to purchase, 10.5.1 was already the version boxed in the stores, and in retrospect, it seemed worth the wait. I haven’t had a single complaint or major issue with the upgrade so far, and have been enjoying the noticeable zippiness of a freshly-installed system.

Time Machine has been a widely-publicized feature, and has been touted as one of the top reasons to upgrade. So I bought a small portable drive with some leftover holiday gift cards and set out to try it. Initially it seemed promising, except after a day or two of backups the process would stall out during the “preparing” stage. Eventually I noticed that the TM background process, backupd, was eating up 0.5GB of memory and up to 100% of one of the CPUs.

If I wasn’t a nerd making my living having my way with computers, I probably would have given up on Time Machine at this point, after a couple of hours scouring Google and the Apple discussion boards searching for similar problems. But I knew that backupd had to be doing something pathological, and I was compelled to find out what.

On Solaris systems, truss is usually the order of the day for problems like this. It literally vomits an endless listing of system calls invoked by a process into your terminal window. Except there’s no truss on OS X. Is there a replacement? Google mentioned ktrace, present on Tiger systems and earlier, but it’s gone in Leopard. Replaced by? DTrace.

Ahhh, DTrace! Another geeky Leopard-only feature. Certainly DTrace will be able to trace system calls in the same manner as truss. But being a complete DTrace newb, I had no idea where to start. So, like any lazy programmer does, I started shopping around for examples to get me started. Looking around, this article on MacTech looked promising, but didn’t have what I needed. Eventually, I ended up finding the DTrace Toolkit on the OpenSolaris site.

The DTrace Toolkit appears to be your one-stop shop for all things DTrace. If you need a kick-start reason to take a look at DTrace and get you going, this is it. In my case, lo and behold, one of the scripts included in the toolkit is called dtruss!

Many of the scripts in the toolkit are tailored towards a Solaris system, and dtruss is a prime example. It won’t quite work out of the box on Leopard, because a few of the system calls mentioned in the script are non-existent there. Changing the shebang line at the top of the script to #!/bin/bash, and repeatedly running it a few times with sudo ./dtruss -p <pid> will give you an idea of which ones; I simply commented these out until I was successfully able to trace a process.

Now, finally we can pop the stack back to my original problem with Time Machine and backupd. I launched a backup run and waited for the process to start consuming large amounts of CPU and memory. I located the PID of the process in Activity Monitor, and started tracing it with my modified dtruss script. And, sure enough, I saw the following output scrolling by endlessly in my terminal:

mmap(0x0, 0x5000, 0x3)     = 958464 0
getdirentriesattr(..., ..., ...)    = ....
munmap(0xEA000, 0x5000)    = 0 0
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93
getxattr(..., ..., 0x0)     = -1 Err#93

(The ellipsis were actual memory addresses, I didn’t save the output.) What was interesting is that the same chunk of memory (the first argument to getxattr) was floating by repeatedly. Looking at the man page for getxattr, the signature is:

ssize_t
getxattr(const char *path, const char *name, void *value,
         size_t size, u_int32_t position, int options);

So, the first argument contains the path. Now, how can I get the contents of that memory address? The answer is inside dtruss. Closer to the top of the script is this DTrace code:

/* print 3 args, arg0 as a string */
syscall::stat*:return, 
syscall::lstat*:return, 
syscall::open*:return
/* not on leopard -- syscall::resolvepath:return */
/self->start/
{
    /* calculate elapsed time */
    this->elapsed = timestamp - self->start;
    self->start = 0;
    this->cpu = vtimestamp - self->vstart;
    self->vstart = 0;
    self->code = errno == 0 ? "" : "Err#";

    /* print optional fields */
    OPT_printid  ? printf("%6d/%d:  ", pid, tid) : 1;
    OPT_relative ? printf("%8d ", vtimestamp/1000) : 1;
    OPT_elapsed  ? printf("%7d ", this->elapsed/1000) : 1;
    OPT_cpu      ? printf("%6d ", this->cpu/1000) : 1;

    /* print main data */
    printf("%s(\"%S\", 0x%X, 0x%X)\t\t = %d %s%d\n", probefunc,
        copyinstr(self->arg0), self->arg1, self->arg2, (int)arg0,
        self->code, (int)errno);
    OPT_stack ? ustack()    : 1;
    OPT_stack ? trace("\n") : 1;
    self->arg0 = 0;
    self->arg1 = 0;
    self->arg2 = 0;
}

I only had to add syscall::getxattr:return to the list of matched probes, and now I could finally inspect the path argument to getxattr:

munmap(0xEA000, 0x5000)      = 0 0
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002680.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002681.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002682.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002683.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002684.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002685.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002686.emlx\0",
0x967288D4, 0x0)         = -1 Err#93
getxattr("/Users/nicksieger/Library/Mail/IMAP-nicksieger@gmail.com
@imap.gmail.com/[Gmail]/All Mail.imapmbox/Messages/1002687.emlx\0",
0x967288D4, 0x0)         = -1 Err#93

D’oh! GMail! That directory has so many files, it took over two minutes just for ls to list all 487356 of them. Hundreds of thousands of email messages, all being re-inspected by Time Machine every time some new messages are added to the directory. I’ll leave it to someone else to point fingers at what the actual problem is here (a side-effect of TM’s usage of hard links? Mail.app inefficiently storing too many messages in a single directory?), but after all this I just decided that I didn’t want a backup of my GMail messages since they’re stored on the server. So I added the directory to the list of excluded directories in TM, wiped my backup, and started over. (TM had similar problems trying to complete an incremental backup with the existing backed-up copy of my mail on the backup disk, so I decided to wipe it and start fresh.) I won’t declare the problem completely solved yet, but if it happens again, I’ll just repeat this process to find the new culprit. Hopefully I don’t end up excluding my entire home directory!

This whole process was a revelation to me – the fact that I could pinpoint the exact problem in a piece of system software despite having few notions of the internals of that software. The next time I have a nagging issue in Leopard, DTrace will be my tool of choice in tracking it down. Let’s just hope it’s not a problem with iTunes!

Tags , ,  | 9 comments