RailsConf: Stefan Kaes - Rails Performance
Posted by Nick Sieger Fri, 23 Jun 2006 20:40:00 GMT
Stefan starts by citing a factor of 4-5 improvement in performance in Rails over the last year.
Performance, broken down
- Latency -- how fast
- Throughput -- how many
- Utilization -- how idle is the CPU
- Cost efficiency -- performance per unit cost
For completeness, calculate the min, max, mean, and standard deviation of these metrics, and use the deviation as your guide to how reliable the data is.
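As a rough illustration of that advice, here is a minimal Ruby sketch that summarizes a list of timings. The helper name `summarize` and the sample numbers are made up for this example and are not part of railsbench or any tool mentioned in the talk. It uses the sample standard deviation (dividing by n - 1), which is one reasonable choice among several.

```ruby
# Summary statistics for a list of request timings (in seconds).
# `summarize` is a hypothetical helper name, not part of railsbench.
def summarize(samples)
  raise ArgumentError, "need at least one sample" if samples.empty?

  n = samples.size
  mean = samples.sum.to_f / n
  # Sample variance (n - 1 in the denominator); 0.0 when there is one sample.
  variance = n > 1 ? samples.sum { |x| (x - mean)**2 } / (n - 1) : 0.0

  {
    min: samples.min,
    max: samples.max,
    mean: mean,
    stddev: Math.sqrt(variance)
  }
end

timings = [0.120, 0.118, 0.125, 0.119, 0.310] # made-up latencies
stats = summarize(timings)
puts format("min=%.3f max=%.3f mean=%.3f stddev=%.3f",
            stats[:min], stats[:max], stats[:mean], stats[:stddev])
```

A standard deviation that is large relative to the mean, as the single outlier in this made-up data would produce, suggests the run was noisy and should be repeated before drawing conclusions.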
- Log files (level >
- Rails Analyzer Tools (Eric Hodel)
- Benchmarker (
- DB vendor tools
- Apache bench (
- railsbench (Stefan Kaes)
Railsbench measures the raw performance of Rails request processing.
It’s configured using config/benchmarks.rb. These files let you control which requests get benchmarked, whether to create a new session when benchmarking
- Ruby Profiler
- Zen Profiler
- Rails profiler script
- Ruby Performance Validator
At this point Stefan gave an overview of RPV, which appears to be a nifty tool that lets you get typical hotspot tree views of where time is spent in code. It currently only runs on Windows.
Top Rails Performance Problems
- slow helper methods
- complicated routes
- associations -- navigating and eager loading vs. proxy loading
- retrieving too much data from the DB
- slow session storage (e.g., ActiveRecord store)
Stefan says that in his experience, DB performance is generally not a big factor or bottleneck. Instantiating ActiveRecord objects is expensive, though.
- In memory -- if your server crashes...oops. Also doesn’t scale.
- File system -- easy to set up, scales with NFS, but slower than...
- ActiveRecordStore -- easy to set up since it comes with Rails, but much slower than...
- SQLSessionStore -- which uses the same table structure as ActiveRecordStore, but was written by Stefan to overcome performance issues with ActiveRecordStore. Setup is more involved.
- memcached -- slightly faster than SQLSessionStore, scales best, but setup is also more involved.
- DrbStore -- distributed ruby store
- Full pages -- fastest; complete pages are served from the filesystem, so the web server bypasses the app server for rendering. If you have private pages, you can’t use it.
- Actions -- pages are cached after an action is rendered. The user ID can be used as part of the storage key.
- Fragments -- fragments can be cached in memory, on the file system, in a DrbStore, or in memcached. Memcached scales the best but doesn’t support expiring fragments by regular expression.
- Stefan recommends avoiding components, and replacing them with helpers or partials. He has not found a use for them.
- Don’t create unnecessary instance variables in the controller; creating them in the view with instance_variable_set and accessing with
- pluralize -- don’t use the inflector if you don’t need to, it’s expensive.
- link_to and url_for are among the slowest helpers, since they need to use routes. If you have a page with lots of links, you might consider hard-coding them instead. This reduces the amount of GC by up to 50% and cuts GC time by a few percentage points (from 11.3% to 8.7% of total processing time).
- Use the :include option to prefetch associations; it avoids extra onesy-twosy SQL statements.
- Use the piggy-backing plugin for belongs_to relationships -- it allows you to retrieve extra attributes from additional tables in the same fetch query.
- Field values are retrieved from the DB mostly as strings, so type conversion happens on each access, which can be slow.
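The "onesy-twosy" query problem that :include addresses can be illustrated without Rails. The following is a plain-Ruby simulation, not ActiveRecord code: `FakeDB` and all of its methods are hypothetical stand-ins that simply count how many "queries" are issued. The prefetching variant here batches the lookup into a second query; how Rails itself implements :include is not shown.

```ruby
# Plain-Ruby simulation of per-record lookups versus prefetching.
# FakeDB and its methods are hypothetical stand-ins, not Rails APIs.
class FakeDB
  attr_reader :query_count

  def initialize
    @posts = [
      { id: 1, author_id: 10 },
      { id: 2, author_id: 11 },
      { id: 3, author_id: 10 }
    ]
    @authors = { 10 => "alice", 11 => "bob" }
    @query_count = 0
  end

  # One "query" returning every post.
  def all_posts
    @query_count += 1
    @posts
  end

  # One "query" per single author lookup.
  def author(id)
    @query_count += 1
    @authors[id]
  end

  # One "query" fetching several authors at once.
  def authors(ids)
    @query_count += 1
    @authors.select { |id, _name| ids.include?(id) }
  end
end

# Lazy navigation: one query for the posts plus one per post.
def author_names_lazy(db)
  db.all_posts.map { |post| db.author(post[:author_id]) }
end

# Prefetching: one query for the posts plus one for all their authors.
def author_names_prefetched(db)
  posts = db.all_posts
  authors = db.authors(posts.map { |post| post[:author_id] }.uniq)
  posts.map { |post| authors[post[:author_id]] }
end

lazy_db = FakeDB.new
author_names_lazy(lazy_db)
prefetch_db = FakeDB.new
author_names_prefetched(prefetch_db)
puts "lazy queries: #{lazy_db.query_count}, prefetched queries: #{prefetch_db.query_count}"
```

In the lazy version the query count grows with the number of posts, while the prefetched version stays constant, which is the effect the talk attributes to :include.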
Language-level and miscellaneous issues
- Method calls are the slowest -- don’t needlessly create method abstractions
- Short-circuit intermediate results to improve performance
- Cache results in instance variables or class variables
- Don’t call ObjectSpace.each_object on each request
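The advice about caching results in instance variables can be sketched as follows. `PriceList` is a made-up class for illustration, and the `computations` counter exists only to make the caching visible. One caveat of the `||=` idiom used here: a result of nil or false is not cached and gets recomputed on every call.

```ruby
# Hypothetical example class; not taken from the talk.
class PriceList
  attr_reader :computations

  def initialize(prices)
    @prices = prices
    @computations = 0
  end

  # Computes the sum on the first call, then returns the cached @total.
  def total
    @total ||= begin
      @computations += 1
      @prices.sum
    end
  end
end

list = PriceList.new([5, 10, 20])
3.times { list.total }
puts "total=#{list.total} computed #{list.computations} time(s)"
```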
Ruby Memory Management
- designed for batch scripts, not long-running servers.
- no generational garbage collection.
- this is suboptimal for Rails because ASTs are stored on the heap (biggest portion of non-garbage for Rails apps), and get processed/traversed more often than they need to be
- Railsbench includes a patch to allow one to recompile Ruby and tweak the garbage collector.
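Since the garbage collector's workload depends on how many objects a request creates, one rough way to compare two implementations is to count allocations. The sketch below assumes Ruby 2.2 or later, where `GC.stat(:total_allocated_objects)` is available; that call did not exist in the Ruby of the talk's era. The helper `allocated_objects` and the two path-building methods are hypothetical names for this example, and exact counts vary between Ruby versions.

```ruby
# Count objects allocated while the given block runs.
# Assumes Ruby 2.2+ for GC.stat(:total_allocated_objects).
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Builds the path through a chain of intermediate strings.
def path_with_intermediates(id)
  "/" + "items" + "/" + id.to_s + "/" + "edit"
end

# Builds the same path with a single interpolation.
def path_with_interpolation(id)
  "/items/#{id}/edit"
end

chained = allocated_objects { 1000.times { |i| path_with_intermediates(i) } }
interpolated = allocated_objects { 1000.times { |i| path_with_interpolation(i) } }
puts "chained: #{chained} objects, interpolated: #{interpolated} objects"
```

Fewer allocations per request generally means fewer collections, though allocation counts are only a proxy and should be checked against actual timings.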
Rails Template Optimizer
- Stefan has started a project to “compile” templates.
- The idea is to cache results of some ERb scriptlets and essentially “compile” or replace the template with one that has more expressions expanded or inlined.
- Code forthcoming; I assume you can stay tuned to Rails Express for news.
- There was a question on JRuby -- Stefan replied that it would certainly solve GC issues, but he doesn’t know if it’s in a state to be able to benchmark Rails requests.
- What are your recommendations for a web server? I don’t have any.
- Is horizontal or vertical scaling better? I don’t know, I’ve been focused on making single requests go fast, so I don’t have enough experience.