JRuby and the Permanent Generation
Posted by Nick Sieger Thu, 21 Feb 2008 05:52:44 GMT
One of the things we have to work around when building and improving a dynamic language implementation on the Java Virtual Machine is the way the JVM loads and executes bytecode. In order for JRuby to take advantage of the HotSpot just-in-time (JIT) compiler, JRuby needs to generate Java bytecode at runtime, during execution of Ruby code. If that bytecode gets executed often enough and meets certain other rather mysterious conditions, HotSpot will turn it into machine code.
Unfortunately the VM was originally designed to run one language well, and that’s Java. The only way to get bytecode loaded into the JVM is through a Java class loader. With Java, most if not all of the code is compiled to bytecode ahead of time, and the VM assumes (and optimizes for the assumption) that classes will not be shuffled around in memory. Partly due to these assumptions, the JVM stores bytecode along with other class metadata in a separate heap called the “permanent generation”, or just “permgen”. (I’m guessing the name “permanent” was used because originally objects in this heap were probably not garbage collected.)
However, because JRuby needs to provide the amount of dynamism that a Ruby programmer would expect (open classes; modules included; methods added/removed at any time), a Ruby class does not map cleanly one-to-one onto a Java class. Instead, it’s easier to think of Ruby classes as method-bags. As a result, JRuby creates a new Java class for every Ruby method that it decides to compile down to bytecode. Additionally, since Ruby methods can come and go, and the JVM can only unload a class once its class loader is unreachable, each compiled method needs to live in its own class loader in order to be collectible by the garbage collector.
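To make the "one class loader per method" idea concrete, here is a minimal sketch. It is not JRuby's code: every name in it (OneShotLoaderDemo, OneShotClassLoader, CompiledMethod) is made up for illustration, and instead of generating bytecode at runtime it simply reuses the bytes javac already produced for a small stand-in class. The point is only that the same bytes defined through two loaders become two separate classes, each of which lives and dies with its own loader.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.lang.reflect.Constructor;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names come from JRuby's source.
class OneShotLoaderDemo {

    // Stand-in for the class that would be generated for one compiled Ruby method.
    public static class CompiledMethod implements Supplier<String> {
        @Override
        public String get() {
            return "hello from a compiled method";
        }
    }

    // A class loader whose only job is to hold a single generated class.
    static final class OneShotClassLoader extends ClassLoader {
        OneShotClassLoader(ClassLoader parent) {
            super(parent);
        }

        Class<?> defineFromBytes(String name, byte[] bytecode) {
            return defineClass(name, bytecode, 0, bytecode.length);
        }
    }

    // Assumption: the class file is reachable as a resource (true for a normal classpath).
    // This replaces real runtime bytecode generation to keep the sketch short.
    static byte[] bytecodeOf(Class<?> type) {
        String resource = type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Defines a fresh copy of CompiledMethod in a brand-new loader.
    static Class<?> loadIsolated() {
        ClassLoader parent = OneShotLoaderDemo.class.getClassLoader();
        OneShotClassLoader loader = new OneShotClassLoader(parent);
        return loader.defineFromBytes(
                CompiledMethod.class.getName(), bytecodeOf(CompiledMethod.class));
    }

    // Calls the method through the shared Supplier interface; a cast to
    // CompiledMethod itself would fail, because the isolated copy is a different class.
    static String invoke(Class<?> methodClass) {
        try {
            Constructor<?> ctor = methodClass.getDeclaredConstructor();
            ctor.setAccessible(true);
            return (String) ((Supplier<?>) ctor.newInstance()).get();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> first = loadIsolated();
        Class<?> second = loadIsolated();
        System.out.println(invoke(first));
        System.out.println(first == second); // false: same bytes, two loaders, two classes
    }
}
```

Once nothing references an isolated class, its instances, or its loader, the whole bundle becomes eligible for collection. That is also why each compiled method carries the extra overhead of a loader of its own.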
Of course, JRuby is not the first Java program to generate classes and load them at runtime (JSPs have been doing this for ten years). But it may create more class and class loader garbage than just about any program ever run on the JVM. For small programs, generating a class per method would be no big deal, but consider a Rails application: the Rails codebase itself has thousands of methods, and it also generates plenty of new methods at runtime.
Consider a non-trivial Rails application that makes liberal use of the Ruby standard library and also uses a handful of plugins: the number of methods available for JRuby to compile can easily exceed 10,000. If the average overhead of a single JRuby method class is around 8K (varying due to method size, of course), this would occupy roughly 80 megabytes of permgen space. (By contrast, the JVM’s default size of the permgen space is 64 megabytes, so we’re already over the limit.) Now consider that with Goldspike we need to use multiple JRuby runtimes in order to serve concurrent requests, due to Rails’ lack of thread-safety, and the number is multiplied further. If you were to deploy 4 Rails applications, each with 4 active runtimes, into a single application server, you’re looking at roughly 1.2 gigabytes of permgen space necessary to run your applications! (It’s common to run multiple applications in a Java application server, but with Rails applications that may need to be reconsidered.)
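Here is the arithmetic behind those numbers as a small sketch. The inputs are the estimates from the paragraph above (10,000 methods, about 8K each, 4 applications with 4 runtimes each); the class and method names are made up for illustration, and "8K" is taken to mean 8 × 1024 bytes.

```java
// Back-of-the-envelope permgen estimate; names are illustrative only.
class PermgenEstimate {

    static final long MIB = 1024L * 1024L;

    // Total bytes if every runtime compiles every method into its own class.
    static long estimateBytes(int methods, int bytesPerMethod, int apps, int runtimesPerApp) {
        return (long) methods * bytesPerMethod * apps * runtimesPerApp;
    }

    public static void main(String[] args) {
        long oneRuntime = estimateBytes(10_000, 8 * 1024, 1, 1);
        long fourByFour = estimateBytes(10_000, 8 * 1024, 4, 4);

        System.out.println("one runtime:    " + oneRuntime / MIB + " MiB"); // 78 MiB
        System.out.println("4 apps x 4 rts: " + fourByFour / MIB + " MiB"); // 1250 MiB
    }
}
```

A single runtime is already past a 64 megabyte permgen, and the total grows linearly with every extra runtime, which is what makes duplicate copies of the same compiled method so expensive.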
Because of this multiplicative cost, shortly after JRuby 1.1RC1 was released we took the somewhat drastic measure of capping the number of methods that each runtime would JIT-compile at 2048. But after a while it became obvious that even with a threshold-based approach, JRuby was still wasting a ton of permgen space on duplicate copies of compiled methods. So for 1.1RC2 we introduced a JIT cache that can be set up to be shared among multiple runtimes.
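The two ideas (a per-runtime cap plus a cache shared across runtimes) can be sketched roughly like this. To be clear, this is not how JRuby implements it: the class names, the string cache key, and the placeholder "compiled" object are all assumptions made for the sake of a short example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a capped JIT with a cache shared between runtimes.
class SharedJitCacheSketch {

    // One instance is shared by all runtimes of an application.
    static final class SharedCache {
        private final ConcurrentMap<String, Object> compiled = new ConcurrentHashMap<>();
        final AtomicInteger compilations = new AtomicInteger();

        // Compiles at most once per key, no matter how many runtimes ask.
        Object getOrCompile(String methodKey) {
            return compiled.computeIfAbsent(methodKey, key -> {
                compilations.incrementAndGet();
                return new Object(); // placeholder for a generated method class
            });
        }
    }

    // Stand-in for one JRuby runtime with its own JIT budget.
    static final class RuntimeSketch {
        private final SharedCache cache;
        private final int maxJitMethods;
        private final Map<String, Object> jitted = new HashMap<>();

        RuntimeSketch(SharedCache cache, int maxJitMethods) {
            this.cache = cache;
            this.maxJitMethods = maxJitMethods;
        }

        // Returns the compiled method, or null meaning "keep interpreting".
        Object tryJit(String methodKey) {
            Object existing = jitted.get(methodKey);
            if (existing != null) {
                return existing;
            }
            if (jitted.size() >= maxJitMethods) {
                return null; // over the cap: this runtime stops compiling
            }
            Object method = cache.getOrCompile(methodKey);
            jitted.put(methodKey, method);
            return method;
        }
    }

    public static void main(String[] args) {
        SharedCache cache = new SharedCache();
        RuntimeSketch first = new RuntimeSketch(cache, 2048);
        RuntimeSketch second = new RuntimeSketch(cache, 2048);

        // Both runtimes end up with the same compiled copy.
        System.out.println(first.tryJit("String#foo") == second.tryJit("String#foo"));
        System.out.println(cache.compilations.get());
    }
}
```

With sharing in place, the permgen cost of a hot method is paid once per cache rather than once per runtime, so adding runtimes no longer multiplies it.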
The figure below shows the effect. Under consideration is a single JRuby/Rails/Goldspike application deployed in GlassFish, with varying numbers of runtimes, right after deployment (cold) and after some warmup to load more Rails code and allow JIT to reach the method threshold. JRuby trunk revision 5545 had the 2K JIT threshold, but not the sharing. Revision 5931 is right before RC2 was released, with the method cache wired up. (We also took some measures to reduce permgen consumption in those 500 revisions, so some of that is visible as well.)
The just-released Goldspike 1.5 will use this shared method cache by default, so all you’ll have to do is upgrade to receive the benefits. An easy way to do that is to install Warbler 0.9.3, an update which bundles Goldspike 1.5 and JRuby 1.1RC2.
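If you want to see how much class metadata space your own deployment uses before and after upgrading, one option is the JVM's standard management API. The sketch below is just one way to do it; it assumes the relevant memory pool has "Perm" in its name (as on the HotSpot VMs of this era) or "Metaspace" (as on later HotSpot versions). Pool names are VM-specific, so treat the name match as an assumption to adjust.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Illustrative helper; the class name and the pool-name matching are assumptions.
class ClassMetadataUsage {

    // Sums the bytes in use across pools that look like permgen or its successor.
    static long usedBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            if (name.contains("Perm") || name.contains("Metaspace")) {
                MemoryUsage usage = pool.getUsage();
                if (usage != null) { // null if the pool is no longer valid
                    total += usage.getUsed();
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("class metadata in use: " + usedBytes() / (1024 * 1024) + " MiB");
    }
}
```

Calling this right after deployment and again after some warmup gives a cold and a warm number for your own application.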
These kinds of techniques to reduce JRuby’s permgen overhead are only going to go so far when the underlying VM still isn’t expecting to be abused in this way. That’s why we’re looking forward to a JRuby that will be able to take advantage of proposed future enhancements like anonymous classes/method handles as part of John Rose’s Da Vinci Machine project. For more information along those lines, head over to Charlie’s discussion of what comes next.
Hello Nick:
Thanks for writing this up and publishing it. Currently our package contains, among a number of plain Java WAR files, three WAR files which are ROR applications.
With an out-of-the-box GlassFish domain (I think the heap there is set to some 300 MB), I am seeing over 700 MB of RAM being used shortly after we start using our apps. I suspect it is a permgen issue, based on reading some of the JRuby IRC logs.
We are using your Warbler plugin as well (Thank you) and I shall soon be upgrading to RC2.
I am curious what your thoughts are on a couple of things.
Specifically, in our scenario we have three ROR apps as WARs in a single GlassFish instance. Do you think having a single ClassCache for the whole GlassFish instance (shared between the three apps) would save a significant amount of memory? Also, I am not sure how big a code change this would be.
In some of my perf. benchmarks, I see the numbers keep improving after several runs (the more warm-up, the faster it gets). I wonder whether the default setting of limiting compiled methods to 2048 will drastically reduce the perf. improvements of the app after warm-up. If not, we would probably leave it at the default.
I enjoy reading your blog on JRuby stuff. I met Charles when he came down to Bangalore, and I love and appreciate all the work you folks do.
PS: I am planning on writing a series of blog topics covering our use of ROR and JRuby as I think we are probably one of the few attempting to use ROR and JRuby in a decent size app and it would be good to share our experience.
Thanks, Venkat.
Venkat, thanks for the kind comments. Regarding your questions:
It might save some memory, but I wouldn’t worry about it unless you’re really memory-constrained. I’d suggest running with ~256MB permgen and probably at least 1G of heap if you can afford to.
We haven’t done a lot of testing here yet either. I suspect performance would be helped by bumping up the threshold, but I’m not sure how much is to be gained by changing it. Certainly there will be a trade-off of memory usage for performance, but I don’t know how large yet.
Cheers, /Nick