Nick Sieger: RubyConf: Rinda in the Real World (2006-10-21) <p><em>Speaker: <a href="http://www.vanderburg.org/">Glenn Vanderburg</a></em></p> <p>Rinda is a distributed coordination system, by Masatoshi SEKI, based on work by David Gelernter (called Linda). It&#8217;s similar to JavaSpaces, but more in the Ruby spirit. It uses DRb for communication (also by Seki-san).</p> <p>There are several existing tutorials on Rinda, but none with broad, real-world applicability. As Glenn started to go deeper with Rinda, flaws began to be exposed.</p> <h2>Rinda Basics</h2> <ul> <li>The <strong>TupleSpace</strong> is in the conceptual middle, as a bag of tuples (arrays). Participants can write tuples into the space, or take them out (usually according to a set of conditions).</li> <li><strong>Tuples</strong> are usually requests to do some work by some unknown requester, or responses to that work.</li> <li>A <strong>RingServer</strong> is a broadcast/multicast lookup service for finding Rinda tuplespaces.</li> <li><strong>Templates</strong> are used to find tuples to take. <code>Template#match</code> requires that the template and the tuple have the same length, and that every non-nil template element matches the corresponding tuple element via triple-equal (<code>===</code>).</li> <li>In practice arrays are always used in the <code>#take</code> and <code>#write</code> methods, and they&#8217;re implicitly converted to the appropriate object.</li> </ul> <h2>Protocol design</h2> <p>When you&#8217;re deciding what to store/read in the tuples, you&#8217;re essentially designing a communication protocol. So you need to take the same precautions as you would when designing any API or protocol, including documentation, evolution of changes, process workflow, 
etc.</p> <ul> <li><em>Parts of tuples:</em> command/request, identifiers, and associated parameters/data. Usually templates will match on the command, sometimes on the ID, but never on the data. </li> <li>Strings work well in tuples, because you can use Regexps to select them.</li> <li>Numbers work well, because you can match them with Ranges.</li> <li>Symbols don&#8217;t work so well (at least until <a href="http://redhanded.hobix.com/inspect/SymbolIs_aString.html"><code>Symbol &lt; String</code></a>).</li> <li>Communication patterns -- synchronous communication is not a good fit for a Rinda architecture (e.g., a request for status).</li> </ul> <h2>Deployment</h2> <p>DRb does not marshal code/behavior (unlike Java and RMI). This is a limitation that forces you to consider how to use custom objects in the tuples, because the code must be shared (and thus jointly upgraded) across all participants.</p> <p>Running the various components of the architecture (RingServer, TupleSpace, clients) as separate processes is probably more reliable than using green threads, although this is just a hypothesis.</p> <p>The TupleSpace is the single point of failure -- if the process hosting it crashes, it&#8217;s likely to have a new DRb URL when it starts back up, so it&#8217;s helpful for clients to wrap the tuple space in a proxy that can rediscover it after a crash.</p> <p>Be wary of multiple ring servers on the same subnet! 
Behavior may be unexpected. You may not be using the tuple space that you wanted, and when the other ring server goes away, so does your tuple space.</p> <p>DRb does not have any security built into it, such as unforgeable object IDs, encrypted transport, authentication, etc. This can be a problem in some situations.</p> <p>There is no persistence by default, so consider adding some for crash resistance, or deal with occasional loss of tuples.</p> <p>There are no transactions, so the requester can never know whether a request is still being processed or was lost.</p> <p>As Ruby matures and gains exposure, some libraries that have been good enough for a while may need reconsideration. As a case in point, Rinda and DRb haven&#8217;t been updated since February of 2004.</p> <h2>Q &amp; A</h2> <p><em>Q. Are write/take atomic?</em> Yes; if you time out, the tuple will still be there. Two workers cannot get the same tuple.</p> <p><em>Q. Where would you use this instead of traditional message queues (reinvent the wheel)?</em> There&#8217;s more flexibility in this architecture. (Justin Gehtland) In this particular case, they have a JMS backend, but the Ruby code could not assume connectivity to that backend.</p> <p><em>Q. Is there a common correlator design pattern?</em> Tie response types to request types to limit the pool of potential matches. In the situation where there is a single requester, it&#8217;s simple to just generate sequential numbers. In a P2P situation, you may need to use a hostname or PID to help distinguish. [Sounds like a GUID/UUID system to me.]</p> <p><em>Q. Have you used expiry dates to grab abandoned tuples?</em> No, but it sounds like an interesting possibility.</p>
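<p>The template matching and correlator ideas in these notes can be sketched in a few lines of Ruby. This is only an illustration, not code from the talk: the <code>[command, id, data]</code> layout, the <code>resize</code> and <code>status</code> commands, and the <code>-result</code> suffix are all hypothetical, and the tuple space lives in a single process rather than being shared over DRb. It assumes the <code>rinda</code> library is available (it ships with Ruby, as a bundled gem in recent versions).</p>

```ruby
require 'rinda/tuplespace'

# In-process tuple space; a real deployment would expose it over DRb.
space = Rinda::TupleSpace.new

# Hypothetical protocol: requests are [command, id, data].
space.write(['resize', 1,  { 'width' => 100 }])
space.write(['resize', 42, { 'width' => 200 }])
space.write(['status', 2,  nil])

# Worker side: match the command with a Regexp and the ID with a Range.
# A nil element in the template is a wildcard, so the data is never matched on.
cmd, id, _data = space.take([/\Aresize\z/, 40..50, nil])

# Tie the response type to the request type and reuse the ID as the correlator.
space.write(["#{cmd}-result", id, { 'ok' => true }])

# Requester side: take only the response that belongs to request 42.
response = space.take(['resize-result', 42, nil])

# take removes the tuple, so a second worker cannot get the same request.
# A timeout of 0 raises instead of blocking when nothing matches.
begin
  space.take([/\Aresize\z/, 40..50, nil], 0)
  leftover = true
rescue Rinda::RequestExpiredError
  leftover = false
end

# Tuples still in the space: the untouched 'resize' 1 and 'status' 2 requests.
remaining = space.read_all([nil, nil, nil])
```

<p>Matching on the command and ID while leaving the data slot as <code>nil</code> follows the advice under Protocol design; with several requesters, the sequential ID would need something like a hostname or PID alongside it.</p>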