RailsConf 2007: Chris Wanstrath: Kickin' Ass with Cache-fu

Posted by Nick Sieger Sat, 19 May 2007 20:34:00 GMT

Chris is here to talk about games, since he used to work for Gamespot. He coded PHP, which is like training wheels without the bike. He had to sit in a glass cube and help keep the site running during E3 last year. There were 100 gajillion teenage boys hitting refresh during their lunch break, and it all blew up. They couldn’t even gzip the responses, because the servers heated up too much. They served 50M pages in a day, without downtime. They did it with Memcache.

Memcache is a distributed hash -- multiple daemons running on different servers. It was developed by LiveJournal for their infrastructure; you just put up the servers, and they just work.
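
To make the "distributed hash" idea a bit more concrete, here is a minimal sketch in plain Ruby. It assumes the client picks a server for each key, so the daemons never need to know about each other. The class name, the Hash objects standing in for servers, and the CRC32-modulo scheme are all assumptions for illustration, not how any particular client library does it.

```ruby
require 'zlib'

# Toy stand-in for a memcached cluster: each "server" is just a Hash.
# TinyDistributedHash and the CRC32-modulo scheme are illustrative assumptions.
class TinyDistributedHash
  def initialize(server_count)
    @servers = Array.new(server_count) { {} }
  end

  # The client, not the servers, decides where a key lives.
  def server_index_for(key)
    Zlib.crc32(key.to_s) % @servers.size
  end

  def set(key, value)
    @servers[server_index_for(key)][key] = value
  end

  def get(key)
    @servers[server_index_for(key)][key]
  end
end

cache = TinyDistributedHash.new(3)
cache.set('user:1', 'chris')
puts cache.get('user:1')           # chris
puts cache.get('user:2').inspect   # nil (a miss)
```

The catch with a plain modulo is that changing the number of servers remaps most keys, which is the problem consistent hashing addresses.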

Should you use Memcache? No. YAGNI, UYRDNI (unless you really do need it).

Rails and Memcache

Fragments, Actions, Sessions, Objects, cache it all. You can use:

  • memcache-client (by Robot-coop guys/Eric Hodel). Marshal.load is 40 times faster than Object.new/loading from the database.
  • CachedModel -- integration with ActiveRecord
  • Fragment Cache Store
  • Memcache session store

...or...

cache_fu

Or, acts_as_cached. It knows about all the aforementioned objects, with a single YAML config file (config/memcached.yml). Word to the wise: don’t use hostnames in your server config file. Use IPs, so that you avoid a BIND lookup every time you connect to the servers. Don’t let DNS outages bring down your servers.

  • get_cache
  • expire_cache

This is all you need -- if you’re using set_cache, you probably don’t understand how the plugin works. Expire cache on the “after save” hook, which allows you to cache ID misses as well.

class Presentation < ActiveRecord::Base
  acts_as_cached
  after_save :expire_cache
end
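
Here is a rough sketch of why get_cache plus expire_cache is enough, and what "caching ID misses" could look like. This is not cache_fu's implementation: ToyCachedModel, the MISSING sentinel, and the "presentation:ID" key format are hypothetical, and plain Hashes stand in for memcached and the database.

```ruby
# Hypothetical stand-in: @store plays memcached, @db plays the database.
class ToyCachedModel
  MISSING = :__missing__ # sentinel so "no such record" can be cached too

  attr_reader :db_reads

  def initialize(db)
    @db = db
    @store = {}
    @db_reads = 0
  end

  # Read-through: on a cache miss, load from the database and remember the
  # result, even when the result is "not found".
  def get_cache(id)
    key = "presentation:#{id}"
    unless @store.key?(key)
      @db_reads += 1
      record = @db[id]
      @store[key] = record.nil? ? MISSING : record
    end
    value = @store[key]
    value == MISSING ? nil : value
  end

  def expire_cache(id)
    @store.delete("presentation:#{id}")
  end

  # Plays the role of the "after save" hook: write, then expire.
  def save(id, record)
    @db[id] = record
    expire_cache(id)
  end
end

model = ToyCachedModel.new({ 1 => 'Cache-fu' })
model.get_cache(2)
model.get_cache(2)
puts model.db_reads        # 1 (the second miss came from the cache)
model.save(2, 'Virtual Clusters')
puts model.get_cache(2)    # Virtual Clusters
```

Because saving expires the key, a cached "not found" disappears as soon as the record is created, so nothing ever needs to call a setter directly.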

Example: only cache published items

class Presentation < ActiveRecord::Base
  acts_as_cached :conditions => 'published = 1'
end

Cached-scoped-finders (if somebody thinks of a good name, let Chris know). The idea is to move custom finder logic to a method on your model, and then wrap a cache-scoping thingy around it. cache_fu ties this up nicely by giving you a cached method on AR::Base.

class Topic < ActiveRecord::Base
  def self.weekly_popular
    Topic.find :all, ...
  end
end

Topic.cached(:weekly_popular)
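
A minimal sketch of the idea, assuming cached simply calls the named class method once and stores the result under a key built from the class and method names. ToyCachedFinders, ToyTopic, and the key format are hypothetical, and a Hash stands in for memcached.

```ruby
# Hypothetical module: a Hash plays the part of memcached.
module ToyCachedFinders
  def finder_cache
    @finder_cache ||= {}
  end

  # Run the named finder once and reuse its result afterwards.
  def cached(method_name)
    key = "#{name}:#{method_name}"
    finder_cache.fetch(key) { finder_cache[key] = public_send(method_name) }
  end
end

class ToyTopic
  extend ToyCachedFinders
  @calls = 0

  class << self
    attr_reader :calls

    # Stand-in for an expensive custom finder.
    def weekly_popular
      @calls += 1
      ['Cache-fu', 'Virtual Clusters']
    end
  end
end

ToyTopic.cached(:weekly_popular)
ToyTopic.cached(:weekly_popular)
puts ToyTopic.calls   # 1
```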

Adding date to cache key with alias_method_chain:

def self.cache_key_with_date(id)
  ...
end

class << self
  alias_method_chain :cache_key, :date
end
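
For anyone who hasn't met alias_method_chain: it amounts to two alias_method calls. The sketch below spells them out in plain Ruby so it runs without ActiveSupport. ToyModel, its base cache_key, and the date format are assumptions for illustration; the body elided above is not reproduced here.

```ruby
require 'date'

class ToyModel
  # Hypothetical base key; the real plugin's key format may differ.
  def self.cache_key(id)
    "ToyModel:#{id}"
  end

  # Wraps the original key and appends today's date.
  def self.cache_key_with_date(id)
    "#{cache_key_without_date(id)}:#{Date.today.strftime('%Y%m%d')}"
  end

  class << self
    # What alias_method_chain :cache_key, :date does, written out by hand:
    alias_method :cache_key_without_date, :cache_key
    alias_method :cache_key, :cache_key_with_date
  end
end

puts ToyModel.cache_key_without_date(1)   # ToyModel:1
puts ToyModel.cache_key(1)                # ToyModel:1: followed by today's date
```

With the date in the key, yesterday's entries are simply never asked for again.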

Cached loads by ID: Topic.find(1, 2, 3) moves to Topic.get_cache(1, 2, 3), which can parallelize calls to memcached and bring them back as they’re ready.

user_ids = @topic.posts.map(&:user_id).uniq
@users = User.get_cache(user_ids)
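
A rough sketch of a multi-ID lookup, assuming it answers what it can from the cache and fetches the rest in a single database round trip. ToyUserCache is hypothetical, Hashes stand in for memcached and the database, and nothing here is actually parallel.

```ruby
# Hypothetical stand-in: @store plays memcached, @db plays the users table.
class ToyUserCache
  attr_reader :db_queries

  def initialize(db)
    @db = db
    @store = {}
    @db_queries = 0
  end

  # Returns a Hash of id => record for every id that exists.
  def get_cache(ids)
    found = {}
    ids.each { |id| found[id] = @store[id] if @store.key?(id) }

    missing = ids - found.keys
    unless missing.empty?
      @db_queries += 1 # one round trip for all the misses
      missing.each do |id|
        next unless @db.key?(id)
        found[id] = @store[id] = @db[id]
      end
    end
    found
  end
end

users = ToyUserCache.new({ 1 => 'chris', 2 => 'evan', 3 => 'nick' })
users.get_cache([1, 2])
users.get_cache([1, 2, 3])
puts users.db_queries   # 2
```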

You can also cache associations, so that you’re navigating associations via Memcache.

Cache overrides

class ApplicationController < ActionController::Base
  before_filter :set_cache_override
  def set_cache_override
    ActsAsCached.skip_cache_gets = !!params[:skip_cache]
  end
end

reset_cache: Slow, uncached operations can sometimes queue up and wedge a site. Instead, issue cache resets on completion of a request, rather than expiring beforehand. That way, requests that continue to pile up will still use the cached copy until the rebuild is complete.

class Presentation < ActiveRecord::Base
  after_save :reset_cache
end
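
The difference between expiring and resetting is easy to see in a toy version. This sketch assumes reset means "rebuild first, then overwrite"; ToyResettableCache is hypothetical and a Hash stands in for memcached.

```ruby
# Hypothetical stand-in for a cache that supports both strategies.
class ToyResettableCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end

  # expire: drop the entry first; every reader misses until it is rebuilt.
  def expire(key)
    @store.delete(key)
  end

  # reset: run the slow rebuild first, then overwrite; readers keep getting
  # the old copy in the meantime.
  def reset(key)
    fresh = yield
    @store[key] = fresh
  end
end

cache = ToyResettableCache.new
cache.set('front_page', 'old')
seen_during_rebuild = nil
cache.reset('front_page') do
  seen_during_rebuild = cache.get('front_page') # what a concurrent reader would see
  'new'
end
puts seen_during_rebuild       # old
puts cache.get('front_page')   # new
```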

Versioning: a way to expire cache on new code releases

class Presentation < ActiveRecord::Base
  acts_as_cached :version => 1
end
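
One way to picture versioning, as a sketch: if the version number is part of the cache key, bumping it makes every old entry unreachable without deleting anything. The helper name and the key format below are assumptions, not the plugin's actual scheme.

```ruby
# Hypothetical key builder; the real plugin's key layout may differ.
def versioned_cache_key(model_name, version, id)
  "#{model_name}:v#{version}:#{id}"
end

store = {} # stand-in for memcached
store[versioned_cache_key('Presentation', 1, 42)] = 'cached by the old release'

# After deploying code with :version => 2, lookups use a new key and miss.
puts store[versioned_cache_key('Presentation', 2, 42)].inspect   # nil
puts store[versioned_cache_key('Presentation', 1, 42)]           # cached by the old release
```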

Deployment: Chris recommends using Monit to ensure your Memcache servers are up.

libketama: consistent hashing that gives you the ability to redeploy Memcache servers without invalidating all the keys.
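
A minimal sketch of consistent hashing in general, not libketama's actual algorithm. Each server is placed at many points on a ring, and a key belongs to the first server point at or after the key's own point. ToyHashRing, the use of MD5, and 100 points per server are assumptions for illustration.

```ruby
require 'digest/md5'

# Hypothetical hash ring; libketama's real point placement differs.
class ToyHashRing
  def initialize(servers, points_per_server = 100)
    @ring = {}
    servers.each do |server|
      points_per_server.times do |i|
        @ring[point_for("#{server}-#{i}")] = server
      end
    end
    @sorted_points = @ring.keys.sort
  end

  def server_for(key)
    point = point_for(key)
    # First ring point at or after the key's point, wrapping to the start.
    target = @sorted_points.bsearch { |p| p >= point } || @sorted_points.first
    @ring[target]
  end

  private

  # Map a string to an integer position on the ring.
  def point_for(string)
    Digest::MD5.hexdigest(string)[0, 8].to_i(16)
  end
end

before = ToyHashRing.new(%w[10.0.0.1 10.0.0.2 10.0.0.3])
after  = ToyHashRing.new(%w[10.0.0.1 10.0.0.2 10.0.0.3 10.0.0.4])
keys   = (1..1000).map { |i| "topic:#{i}" }
moved  = keys.count { |k| before.server_for(k) != after.server_for(k) }
puts "#{moved} of #{keys.size} keys moved"
```

Only the keys that now fall to the new server should move; everything else stays where it was, which is what keeps the rest of the cache warm across a redeploy.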

Q: Page caching? A: Nginx with native Memcache page caching, but outside of Rails domains.

Lots of other questions, but dude, Chris talks too fast!


RailsConf 2007: Bradley Taylor: Virtual Clusters

Posted by Nick Sieger Sat, 19 May 2007 19:30:06 GMT

How does Rails figure into virtualization? Bradley will cover this topic with examples and case studies. Along the way, hardware items may be mentioned, but are not critical. Really, it’s about the design of the clusters, not the bits of plumbing you use to connect them up.

Virtualization is the partitioning of a physical server so that you can run multiple servers on it. Xen, Virtuozzo, VMWare, Solaris containers, KVM, etc. Bradley uses Xen. The virtual servers share the same processor (hopefully multi-core), memory, storage, network cards (but with independent IP addresses), etc., but run independently of each other. VPS, slice, container, accelerator, VM, it’s all the same. Memory, storage, and CPU can be guaranteed with the virtualization layer.

Why would you do this? Consolidate servers for less hardware and cost; Isolate applications -- bad apps don’t drag the server down, contain intrusions, use different software stacks; Replicate -- easily create new servers and deploy in a standardized and automated way; Utilize -- take advantage of all CPU, memory, storage, resources; Allocate resources, give a server exactly what it requires, grow/shrink up and down, and balance them. Bradley says, “Once you go to virtualization you won’t want to go back. Do the simplest thing that could possibly work.”

Virtual clusters, then, are a bunch of servers cooperating toward a common goal -- if you have many versions or copies of one thing. More than one customer, more than one version of software, etc.

For Rails, this means a lot of things: you can have many development environments and stages, take advantage of memory isolation, protect against PHP/Java, and make multiple-server scaling accessible.

Examples

  • Two servers for production and staging
  • Three for web/db/staging
  • Mixed languages -- instead of 1x1GB server use 3x300MB servers
  • High availability applications with fewer servers
  • Multiple applications -- one server per application
  • Standardized roles/appliances -- mail, ftp, dns, web, db

EastMedia

  • They can incubate customers in separate images
  • Dev/staging/production servers
  • Shared SVN/trac
  • 2 physical servers => 8 virtual servers

Boom Design

  • Again, multiple stages
  • Customer staging, with lower uptime requirements
  • Low-traffic apps on a single server, but everything else gets its own dedicated server
  • 2GB memory spread across 9 virtual servers


RailsConf 2007: Saturday Morning Keynotes

Posted by Nick Sieger Sat, 19 May 2007 17:22:42 GMT

Cyndi Mitchell -- ThoughtWorks Studios

Enterprise (the “e” word)

Before IT got involved, “enterprise” was a bold new venture. Toyota manufacturing, Skype disruption of telephony.

Enterprise in terms of IT has come to mean bloatware, incompetence, corruption, waste of time, no value.

So this is the battle: the enterprise we need to reclaim (to boldly go where no man has gone before) vs. the bloatware/incompetence/corruption/fear-based selling, etc.

RubyWorks -- package stack with haproxy, mongrel, monit through an RPM repository

For JRuby support, call Ola.

Tim Bray -- Web Guy from Sun Microsystems

Ways to change the world that are better than just using a cool web framework: http://pragmaticstudio.com/donate/

Sun loves Ruby. Ruby and Rails, that is. The impact of the Ruby language is going to be at least as big as Rails is for web development.

Sun provided servers for Ruby 2.0 development, and can provide servers for your potentially cool, worthy, open source project, just drop Tim an email.

A few more obligatory plugs for NetBeans and Sun sponsoring the conference. “Pre-alpha,” he says. Hmm, I wonder what Tor would say about that!

JRuby: when would you use JRuby vs. Ruby? If you have no pain, keep using C Ruby. But if you have management concerns, deployment concerns, etc. then by all means do try it!

Obligatory handshake/sandal connection with ThoughtWorks and Cyndi -- running Mingle (and cruisecontrol.rb) with JRuby.

Sun: “Hi, the answer is Java, what was the question?” So why would Sun want to support Ruby? Well, you guys are programmers. Programmers who deliver quality software fast. And those programmers need computers, and OSes, and web servers, and support and services, etc. Plug, plug, plug.

How do you make money on free products? Sun is open-sourcing Java, Solaris, even Sparc. Joyent is open-sourcing their stuff. Where does the money come from? 1. Adoption 2. Deployment 3. Monetization at the point of value

What if we win? Are our problems over? No, we’ll have to deal with Java. And .NET. And PHP. From the audience: And COBOL. The Network Is The Computer. The Network Is Heterogeneous. Deal with it. So how do we interoperate?

  • Just Run Java (and JRuby, of course!, and JavaScript, and PHP, etc.)
  • Use Atom/REST. Everything should have a publish button. Don’t use WS-DeathStar or WCF or WSIT.

Developer issues: Scaling, Static vs. Dynamic, Maintainability, Concurrency, Tooling, Integration, Time to Market. Which two of these matter the most?

Tim’s final assertion: Maintainability and Time to Market, and that’s why we’re all at RailsConf.


RailsConf 2007: Evan Weaver: Going Off Grid

Posted by Nick Sieger Fri, 18 May 2007 19:33:31 GMT

Evan is talking about leaving Rails as a full-stack framework and remixing bits and pieces for integration projects. He’s doing it in the context of a case study on Bio: a project at the University of Delaware working with DNA data in large SQL databases. Evan states that all of bioinformatics is an integration problem. (Me: That’s probably true of any research project where data is coming from multiple, varied sources. So where does Rails fit in this?)

So how do you cope with this? Use the Rails console as an admin interface, mapping AR onto the legacy schema.

Shadow (gem install shadow) is a REST-ful record server -- a small Mongrel handler that allows you to manipulate the database remotely. It uses dynamic ActiveRecord classes that are created and trashed for each request.

Parallelization -- uses the Sun 1 grid engine that distributes shell scripts across 128 nodes. Used for job and backend processing.

bioruby/bioperl/biopython -- bioinformatics libraries in other languages -- bioruby is not complete, but we still want to use Ruby, so he looked at ways of integrating Ruby with other languages. No RubyInline for Perl or Python, no up-to-date direct/C bindings. He ended up building a socket-level interface into Python.

Admin tools to consider -- streamlined, active_scaffold, autoadmin, Django (manage.py inspectdb; manage.py syncdb; manage.py runserver). (Wow, come to RailsConf, get a Django demo. Unexpected surprise!)

Extending Rails -- has_many_polymorphs for easy creation of directed graphs

Frustrating AR tidbits: has_many_through has a huge case statement, with sql strings everywhere, and tightly intertwined classes. Ugh.

Scaling big webapps: AR/SQL is not the way. Instead, go to a hyper-denormalized model, where the DB is just a big hash. This leads to things like berkeleydb, memcached, madeleine, etc. and MySQL just becomes a persistence store for memcache. One key is moving joins to write time, so that reads don’t need to re-join associations. You’re essentially duplicating/caching the data out to each association, but this makes sharding/splitting of data easier. Example: Flickr user photos vs. photos placed in a group.
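
A small sketch of the write-time join idea, using the user photos vs. group photos example. ToyPhotoStore and its key names are hypothetical, and one big Hash stands in for the key/value store.

```ruby
# Hypothetical store: one big Hash, no joins at read time.
class ToyPhotoStore
  def initialize
    @kv = Hash.new { |hash, key| hash[key] = [] }
  end

  # Do the "join" once, at write time: copy the photo into every list that
  # will later be read.
  def add_photo(photo, user_id, group_ids = [])
    @kv["user:#{user_id}:photos"] << photo
    group_ids.each { |group_id| @kv["group:#{group_id}:photos"] << photo }
  end

  # Each read is a single key lookup.
  def photos_for_user(user_id)
    @kv.fetch("user:#{user_id}:photos", [])
  end

  def photos_for_group(group_id)
    @kv.fetch("group:#{group_id}:photos", [])
  end
end

store = ToyPhotoStore.new
store.add_photo('sunset.jpg', 1, [10, 20])
store.add_photo('keynote.jpg', 1)
puts store.photos_for_user(1).inspect     # ["sunset.jpg", "keynote.jpg"]
puts store.photos_for_group(20).inspect   # ["sunset.jpg"]
```

The price is duplicated data and more work per write, but each list can live on whatever shard its key hashes to.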

Evan doesn’t believe that SQL is a viable data store for webapps -- I think he means large-scale webapps. Not everyone who’s trying to build a web application will run into these kinds of issues, so your mileage may vary. Still, it’s refreshing to see more people rebel against the incumbent 30-year gorilla of SQL.


RailsConf 2007 Opening Keynote: David Heinemeier Hansson

Posted by Nick Sieger Fri, 18 May 2007 17:10:39 GMT

Rails 2.0

Where we’ve been

David is surprised and proud of the community that we already have, and wants us to be comfortable with where we are, and not always looking toward the future. We have:

  • Million gem downloads
  • Hundreds of plugins
  • 10k users on the rubyonrails-talk mailing list
  • Ruby job descriptions (asking for 3 years RoR experience, longer than David)
  • Books, books, books (and not just English books, but non-English titles as well), surpassing VBA, Perl, and Python in book sales
  • IDEs from NetBeans, Borland, Aptana, etc.

Rails 2.0 is not going to be the “Unicorn”. It’s not going to be a total rewrite, it actually has a release schedule, it will not break backwards-compatibility. Instead, it will build upon what we already have, and continue the philosophy of building on what is useful and needed. In fact, 95% of what’s in 2.0 already works today, in the edge. Example, a simple controller that handles three formats of input/output, with a person resource for accessing the data from a remote server.

class PeopleController < ApplicationController
  ...
  def create
    @person = Person.create(...)
    respond_to do |format|
      format.html { redirect_to person_url(@person) }
      format.xml  { render :status => :created, :location => person_url(@person), ... }
      format.js {
        render :update do |js|
          ...
        end
      }
    end
  end
end

class Person < ActiveResource::Base
  self.site = "http://example.com/"
end

David then goes into a live demo of the new scaffold resource, which by appearance is identical to the old scaffolding, except it comes pre-baked with a REST-ful XML interface. He then adds support for a text format with a couple of lines of code, jumps into IRB, defines an active resource, and proceeds to change the data remotely.

If you want to add search to your controller, you can do it in a DRY way, and all the format/view work you’ve done will benefit:

class PeopleController < ApplicationController
  def index
    if params[:name]
      @people = Person.find :all, :conditions => ["name like ?", "#{params[:name]}%"]
    else
      @people = Person.find :all
    end
    ...
  end
end

David points out that 37signals, Shopify, Fluxiom, et al. are real sites, with non-trivial domains that are still well executed in Rails, so it’s not just about simple scaffolding demos.

In Rails 2.0, ActiveResource will be bundled with Rails, and ActionWebService will not.

Friends of Rails

  • AJAX!
  • REST!
  • Atom? -- Atom should be more native to Rails
  • Openid? -- Openid is not necessarily something that needs to be used by all, but still a strong ally.

9 other things I like about Rails 2

  • Breakpoints are back -- no longer depends on Binding.of_caller; instead Rails depends and builds upon ruby-debug by Kent Sibilev.
  • HTTP Performance -- streamlining .js and .css, even though it feels better to break up Javascript and CSS into many little pieces, and gzip them
<%= javascript_include_tag :all, :cache => true %>
<%= stylesheet_link_tag :all, :cache => true %>

We can also fake out the browser by configuring multiple asset hosts (4), so you can maximize browser connections:

config.action_controller.asset_host = 'assets%d.highrisehq.com'
  • Query cache
ActiveRecord::Base.cache do
  # actions here are cached
end
  • Rendering and MIME types -- bake the MIME convention into the template, and separate from the rendering mechanism: people/index.html.erb, people/index.xml.builder, people/index.rss.erb, people/index.atom.builder
  • config/initializers replacing config/environment. Initializers are .rb files in the config/initializers directory of your app that are automatically loaded during initialization time.
  • Sexy migrations
create_table :people do |t|
  t.integer :account_id
  t.string :first_name, :last_name, :null => false
  t.text :description
  t.timestamps
end
  • HTTP authentication (authenticate_or_request_with_http_basic, authenticate_with_http_basic)
  • The MIT assumption -- the licensing question -- make it easier to understand
  • Spring cleaning -- getting rid of the cruft -- stay tuned!
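
To illustrate the asset host item above, here is a sketch of how a '%d' template might be expanded so that a given file always maps to the same one of four hosts. The helper name and the CRC32-based choice are assumptions; Rails' own selection logic may differ.

```ruby
require 'zlib'

# Hypothetical helper: expands a '%d' host template to one of host_count hosts.
# CRC32 is used here only because it is stable from run to run.
def toy_asset_host(template, asset_path, host_count = 4)
  format(template, Zlib.crc32(asset_path) % host_count)
end

%w[/javascripts/all.js /stylesheets/all.css /images/logo.png].each do |path|
  puts "#{path} -> #{toy_asset_host('assets%d.highrisehq.com', path)}"
end
```

Keeping the mapping stable per file matters: if the same asset bounced between hostnames, the browser would cache it once per host.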


RailsConf Releases

Posted by Nick Sieger Thu, 17 May 2007 17:35:00 GMT

Just a quick update. Firstly, I just released ci_reporter 1.3; it should be available in the gem index shortly. Thanks to Bret Pettichord, Jeremy Beheler, and Charlie Kunz for reporting issues and prodding me to fix a couple of bugs. The two new items in this release are:

  • RSpec 0.9/trunk-compatible. You can now describe/it all you want with ci_reporter.
  • Errors and failure stack traces now include the full error message and exception type.

Secondly, JRuby 1.0RC2 has been released. Although there is no official release announcement at the moment, it is available for download and has been propagated to the central Maven repository also. Please do check it out and let us know on the mailing lists or in JIRA if you come across any blocker issues or regressions. Just a couple more weeks of stabilization; expect a rockin’ 1.0 release in June!

Lastly, expect an ActiveRecord-JDBC 0.3.2 release Real Soon Now.
