Nick Sieger: RailsConf 2007: Evan Weaver: Going Off Grid
Nick Sieger, 2007-05-18T19:33:31+00:00 (updated 2007-07-13T09:45:27+00:00)
<p>Evan is talking about leaving behind Rails as a full-stack framework and remixing its bits and pieces for integration projects. He&#8217;s doing it in the context of a case study on Bio: a project at the University of Delaware working with DNA data in large SQL databases. Evan states that all of bioinformatics is an integration problem. (Me: That&#8217;s probably true of any research project where data is coming from multiple, varied sources. So where does Rails fit in this?)</p>
<p>So how do you cope with this? Use the Rails console as an admin interface, mapping ActiveRecord (AR) onto the legacy schema.</p>
<p>Shadow (<code>gem install shadow</code>) is a RESTful record server &#8211; a small Mongrel handler that allows you to manipulate the database remotely. It uses dynamic ActiveRecord classes that are created and trashed for each request.</p>
<p>Parallelization &#8211; uses the Sun N1 Grid Engine, which distributes shell scripts across 128 nodes. Used for job and backend processing.</p>
<p>BioRuby/BioPerl/Biopython &#8211; bioinformatics libraries for different languages &#8211; BioRuby is not complete, but we still want to use Ruby, so he looked at ways of integrating Ruby with other languages. There is no RubyInline for Perl or Python, and no up-to-date direct/C bindings. He ended up building a socket-level interface to Python.</p>
<p>Admin tools to consider &#8211; Streamlined, active_scaffold, autoadmin, Django (<code>manage.py inspectdb; manage.py syncdb; manage.py runserver</code>). (Wow, come to RailsConf, get a Django demo.
Unexpected surprise!)</p>
<p>Extending Rails &#8211; <code>has_many_polymorphs</code> for easy creation of directed graphs.</p>
<p>Frustrating AR tidbits: <code>has_many :through</code> has a huge case statement, with SQL strings everywhere, and tightly intertwined classes. Ugh.</p>
<p>Scaling big webapps: AR/SQL is not the way. Instead, go to a hyper-denormalized model, where the DB is just a big hash. This leads to things like Berkeley DB, memcached, Madeleine, etc., and MySQL just becomes a persistence store for memcached. One key is moving joins to write time, so that reads don&#8217;t need to re-join associations. You&#8217;re essentially duplicating/caching the data out to each association, but this makes sharding/splitting of data easier. Example: Flickr user photos vs. photos placed in a group.</p>
<p>Evan doesn&#8217;t believe that SQL is a viable data store for webapps &#8211; I think he means large-scale webapps. Not everyone who&#8217;s trying to build a web application will run into these kinds of issues, so your mileage may vary. Still, it&#8217;s refreshing to see more people rebel against the incumbent 30-year-old gorilla of SQL.</p>
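<p>To make the &#8220;joins at write time&#8221; idea concrete, here is a minimal sketch of the Flickr-style example. It is my own illustration, not code from the talk: the <code>DenormalizedPhotoStore</code> class, its method names, and the key format are all hypothetical, and a plain Ruby <code>Hash</code> stands in for memcached or Berkeley DB.</p>

```ruby
# Hypothetical sketch: a plain Hash stands in for a key-value store
# such as memcached or Berkeley DB. Names and key layout are assumptions.
class DenormalizedPhotoStore
  def initialize
    # Missing keys start out as empty lists when written to.
    @store = Hash.new { |hash, key| hash[key] = [] }
  end

  # Write path: do the "join" once, by copying the photo record
  # under every key that a later read will ask for.
  def add_photo(photo_id:, user_id:, title:, group_ids: [])
    record = { id: photo_id, user_id: user_id, title: title }.freeze
    @store["user:#{user_id}:photos"] << record
    group_ids.each { |group_id| @store["group:#{group_id}:photos"] << record }
    record
  end

  # Read path: a single key lookup, no join.
  def photos_for_user(user_id)
    @store.fetch("user:#{user_id}:photos", [])
  end

  def photos_for_group(group_id)
    @store.fetch("group:#{group_id}:photos", [])
  end
end

store = DenormalizedPhotoStore.new
store.add_photo(photo_id: 1, user_id: 7, title: "Sunset", group_ids: [42])
store.add_photo(photo_id: 2, user_id: 7, title: "Harbor")

user_photo_ids = store.photos_for_user(7).map { |photo| photo[:id] }
group_titles = store.photos_for_group(42).map { |photo| photo[:title] }
```

<p>The trade-off is the one described above: each photo is duplicated once per association, so writes cost more, but every read is one lookup and each key can live on a different shard.</p>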