Nick Sieger: RailsConf 2007: Evan Weaver: Going Off Grid http://blog.nicksieger.com/articles/2007/05/18/railsconf-2007-evan-weaver-going-off-grid
<p>Evan is talking about leaving Rails as a full-stack framework and remixing bits and pieces of it for integration projects. He's doing it in the context of a case study on Bio: a project at the University of Delaware working with DNA data in large SQL databases. Evan states that all of bioinformatics is an integration problem. (Me: That's probably true of any research project where data is coming from multiple, varied sources. So where does Rails fit in this?)</p>
<p>So how do you cope with this? Use the Rails console as an admin interface, mapping ActiveRecord (AR) onto the legacy schema.</p>
<p>Shadow (<code>gem install shadow</code>) is a RESTful record server -- a small Mongrel handler that allows you to manipulate the database remotely. It uses dynamic ActiveRecord classes that are created and trashed for each request.</p>
<p>Parallelization -- uses the Sun Grid Engine, which distributes shell scripts across 128 nodes. Used for job and backend processing.</p>
<p>bioruby/bioperl/biopython -- bioinformatics libraries in other languages -- bioruby is not complete, but we still want to use Ruby, so he looked at ways of integrating Ruby with other languages. There's no RubyInline for Perl or Python, and no up-to-date direct/C bindings. He ended up building a socket-level interface into Python.</p>
<p>Admin tools to consider -- streamlined, active_scaffold, autoadmin, Django (<code>manage.py inspectdb; manage.py syncdb; manage.py runserver</code>). (Wow, come to RailsConf, get a Django demo. Unexpected surprise!)</p>
<p>Extending Rails -- <code>has_many_polymorphs</code> for easy creation of directed graphs.</p>
<p>Frustrating AR tidbits: <code>has_many :through</code>
has a huge case statement, with SQL strings everywhere, and tightly intertwined classes. Ugh.</p>
<p>Scaling big webapps: AR/SQL is not the way. Instead, go to a hyper-denormalized model, where the DB is just a big hash. This leads to things like Berkeley DB, memcached, Madeleine, etc., and MySQL just becomes a persistence store for memcached. One key is moving joins to write-time, so that reads don't need to re-join associations. You're essentially duplicating/caching the data out to each association, but this makes sharding/splitting of data easier. Example: Flickr user photos vs. photos placed in a group.</p>
<p>Evan doesn't believe that SQL is a viable data store for webapps -- I think he means large-scale webapps. Not everyone who's trying to build a web application will run into these kinds of issues, so your mileage may vary. Still, it's refreshing to see more people rebel against the incumbent 30-year-old gorilla of SQL.</p>
Posted Fri, 18 May 2007 19:33:31 +0000 by Nick Sieger. Tags: railsconf, railsconf2007
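<p>To make the "joins at write-time" idea from the scaling notes a bit more concrete, here is a minimal Ruby sketch. It is not code from the talk: the <code>PhotoStore</code> class, its method names, and the key layout are all hypothetical, and a plain Ruby <code>Hash</code> stands in for a key/value store such as memcached.</p>

```ruby
# Hypothetical illustration only: an in-memory Hash plays the role of a
# key/value store (memcached, Berkeley DB, ...). Nothing here is from the talk.
class PhotoStore
  def initialize
    # Each key maps to a list of photos; missing keys start as an empty list.
    @store = Hash.new { |hash, key| hash[key] = [] }
  end

  # Write path: copy the photo into every list that should show it.
  # The "join" between users, groups and photos happens once, here.
  def add_photo(photo, user_id:, group_ids: [])
    @store["user:#{user_id}:photos"] << photo
    group_ids.each { |gid| @store["group:#{gid}:photos"] << photo }
    photo
  end

  # Read path: a single key lookup, no join needed.
  def user_photos(user_id)
    @store.fetch("user:#{user_id}:photos", [])
  end

  def group_photos(group_id)
    @store.fetch("group:#{group_id}:photos", [])
  end
end
```

<p>For example, <code>store.add_photo({ id: 1, title: "Sunset" }, user_id: 7, group_ids: [3, 4])</code> writes the photo under three keys, after which <code>store.user_photos(7)</code> and <code>store.group_photos(3)</code> each need only one lookup. The trade-off is the one described above: the data is duplicated out to each association, but every list lives under its own key and can be placed on a different shard.</p>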