RubyConf: Rinda in the Real World

Posted by Nick Sieger Sat, 21 Oct 2006 19:01:35 GMT

Speaker: Glenn Vanderburg

Rinda is a distributed coordination system by Masatoshi SEKI, based on David Gelernter’s Linda. It’s similar to JavaSpaces, but more in the Ruby spirit. It uses DRb (also by Seki-san) for communication.

There are several existing tutorials on Rinda, but none with broad, real-world applicability. As Glenn went deeper with Rinda, its flaws began to show.

Rinda Basics

  • The TupleSpace sits in the conceptual middle, as a bag of tuples (arrays). Participants can write tuples into the space, or take them out (usually according to a set of conditions).
  • Tuples are usually requests to do some work by some unknown requester, or responses to that work.
  • A RingServer is a broadcast/multicast lookup service for finding Rinda tuplespaces.
  • Templates are used to find tuples to take. Template#match requires that the template and the tuple have the same length, and that every non-nil template element matches the corresponding tuple element by triple-equal (===); nil acts as a wildcard.
  • In practice arrays are always used in the #take and #write methods, and they’re implicitly converted to the appropriate object.
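The basics above can be sketched in a few lines. This is a minimal illustration that uses an in-process Rinda::TupleSpace instead of one shared over DRb; the [command, id, payload] tuple layout and the file names are made up for the example.

```ruby
require 'rinda/tuplespace'

# An in-process tuple space; in a real deployment it would be shared over DRb.
space = Rinda::TupleSpace.new

# Write two request tuples. The [command, id, payload] layout is hypothetical.
space.write(['resize', 1, 'cat.png'])
space.write(['rotate', 2, 'dog.png'])

# The plain array is converted to a template: nil is a wildcard,
# and the other elements must match.
request = space.take(['rotate', nil, nil])
puts request.inspect    # => ["rotate", 2, "dog.png"]

# read returns a matching tuple without removing it from the space.
remaining = space.read([nil, nil, nil])
puts remaining.inspect  # => ["resize", 1, "cat.png"]
```

Note that take and read block until a matching tuple appears unless a timeout is passed as the second argument.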

Protocol design

When you’re deciding what to store in and read from the tuples, you’re essentially designing a communication protocol. So you need to take the same care you would when designing any API or protocol: documentation, managing changes over time, process workflow, etc.

  • Parts of tuples: command/request, identifiers, and associated parameters/data. Usually templates will match on the command, sometimes on the ID, but never on the data.
  • Strings work well in tuples, because you can use Regexps to select them.
  • Numbers work well, because you can match them with Ranges.
  • Symbols don’t work so well (at least until Symbol < String)
  • Communication patterns -- synchronous communication is not a good fit for a Rinda architecture (e.g., a request for status)
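As a rough sketch of the matching guidelines above: a Regexp picks out a family of string commands and a Range picks out a band of numbers. The [command, priority, payload] layout and the command names are assumptions for illustration, not something from the talk.

```ruby
require 'rinda/tuplespace'

space = Rinda::TupleSpace.new

# Hypothetical protocol: [command, priority, payload].
space.write(['image.resize', 3, 'cat.png'])
space.write(['image.rotate', 8, 'dog.png'])
space.write(['mail.send',    5, 'ops@example.com'])

# Regexp on the command, Range on the number, wildcard on the data.
urgent_image_job = space.take([/\Aimage\./, 6..10, nil])
puts urgent_image_job.inspect  # => ["image.rotate", 8, "dog.png"]

# A literal command with wildcards; the payload is never matched on.
mail_job = space.take(['mail.send', nil, nil])
puts mail_job.inspect          # => ["mail.send", 5, "ops@example.com"]
```

Both takes succeed immediately here because matching tuples are already in the space; the low-priority 'image.resize' tuple is left behind.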

Deployment

DRb does not marshal code/behavior (unlike Java and RMI). This is a limitation that forces you to consider how to use custom objects in the tuples, because the code must be shared (and thus jointly upgraded) across all participants.

Running the various components of the architecture (RingServer, TupleSpace, clients) as separate processes is probably more reliable than running them as green threads in one process, although this is just a hypothesis.

The TupleSpace is the single point of failure. If its process crashes, it is likely to come back up with a new DRb URL, so it’s helpful for clients to wrap the tuple space in a proxy that can rediscover it after a crash.
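One way such a client-side proxy might look is sketched below. ReconnectingSpace and its retries: option are hypothetical names, not part of Rinda; the block passed to the constructor is whatever lookup the client already uses to find the tuple space. The sketch assumes a lost space shows up as DRb::DRbConnError.

```ruby
require 'drb/drb'

# Hypothetical wrapper: looks the tuple space up again when the connection drops.
class ReconnectingSpace
  def initialize(retries: 3, &finder)
    @finder  = finder   # block that returns a (possibly new) tuple space
    @retries = retries
    @space   = nil
  end

  def write(tuple, sec = nil)
    with_space { |space| space.write(tuple, sec) }
  end

  def take(template, sec = nil)
    with_space { |space| space.take(template, sec) }
  end

  def read(template, sec = nil)
    with_space { |space| space.read(template, sec) }
  end

  private

  def with_space
    attempts = 0
    begin
      @space ||= @finder.call
      yield @space
    rescue DRb::DRbConnError
      @space = nil               # forget the dead reference
      attempts += 1
      retry if attempts <= @retries
      raise
    end
  end
end

# Assumed usage with a ring server (needs rinda/ring and a running DRb service):
#   space = ReconnectingSpace.new do
#     Rinda::TupleSpaceProxy.new(Rinda::RingFinger.primary)
#   end
```

Blind retries are a simplification: a write whose connection failed after delivery could end up in the space twice, so a real proxy may want to be more careful about which operations it repeats.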

Be wary of multiple ring servers on the same subnet! Behavior may be unexpected. You may not be using the tuple space that you wanted, and when the other ring server goes away, so does your tuple space.

DRb does not have any security built into it, such as unforgeable object IDs, encrypted transport, authentication, etc. This can be a problem in some situations.

There is no persistence by default, so consider adding some for crash resistance, or deal with occasional loss of tuples.
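A very simple form of persistence is to snapshot the tuples to disk periodically and reload them on startup. The sketch below is one possible approach, not something Rinda provides: snapshot and restore are hypothetical helpers, the templates should not overlap (or tuples are saved twice), and the tuples must contain only data that Marshal can dump.

```ruby
require 'rinda/tuplespace'

# Save every tuple matching one of the templates. Returns the number saved.
def snapshot(space, templates, path)
  tuples = templates.flat_map { |template| space.read_all(template) }
  File.binwrite(path, Marshal.dump(tuples))
  tuples.size
end

# Write previously saved tuples back into a fresh space. Returns the number restored.
# Only load snapshot files you trust; Marshal.load is not safe on untrusted data.
def restore(space, path)
  return 0 unless File.exist?(path)
  tuples = Marshal.load(File.binread(path))
  tuples.each { |tuple| space.write(tuple) }
  tuples.size
end
```

Tuples written or taken between snapshots are still lost or resurrected after a crash, so this only narrows the window rather than closing it.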

There are no transactions, so the requester will never know whether its request is still being processed or was lost.

As Ruby matures and gains exposure, some libraries that have been good enough for a while may need reconsideration. As a case in point, Rinda and DRb haven’t been updated since February of 2004.

Q & A

Q. Are write/take atomic? Yes. If a take times out, the tuple will still be there, and two workers cannot get the same tuple.
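A small in-process experiment illustrates that answer. try_take is a hypothetical helper; it passes a timeout of 0 so that take gives up immediately instead of waiting, and it turns Rinda::RequestExpiredError into nil.

```ruby
require 'rinda/tuplespace'

# Hypothetical helper: take without waiting, returning nil when nothing matches.
def try_take(space, template, timeout = 0)
  space.take(template, timeout)
rescue Rinda::RequestExpiredError
  nil
end

space = Rinda::TupleSpace.new

# Nothing to take yet: the request expires and the space is unchanged.
puts try_take(space, ['job', nil]).inspect  # => nil

space.write(['job', 1])

# Two workers race for the same tuple; exactly one of them gets it.
winners = Array.new(2) { Thread.new { try_take(space, ['job', nil]) } }
               .map(&:value)
               .compact
puts winners.inspect                        # => [["job", 1]]
```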

Q. Where would you use this instead of traditional message queues (reinvent the wheel)? There’s more flexibility in this architecture. (Justin Gehtland) In this particular case, they have a JMS backend, but the Ruby code could not assume connectivity to that backend.

Q. Is there a common correlator design pattern? Tie response types to request types to limit the pool of potential matches. In the situation where there is a single requester, it’s simple to just generate sequential numbers. In a P2P situation, you may need to use a hostname or PID to help distinguish. [Sounds like a GUID/UUID system to me.]
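A minimal sketch of that correlator idea follows. The CorrelationIds class, the hostname:PID:counter ID format, and the ['request', ...] / ['response', ...] tuple layout are all assumptions for illustration; a UUID would serve the same purpose.

```ruby
require 'rinda/tuplespace'
require 'socket'

# Hypothetical ID scheme: hostname, PID, and a per-process counter.
class CorrelationIds
  def initialize
    @prefix  = "#{Socket.gethostname}:#{Process.pid}"
    @counter = 0
    @lock    = Mutex.new
  end

  def next_id
    @lock.synchronize { "#{@prefix}:#{@counter += 1}" }
  end
end

space = Rinda::TupleSpace.new
ids   = CorrelationIds.new

# Worker: takes any 'resize' request and answers with the same command and ID.
worker = Thread.new do
  _, command, id, payload = space.take(['request', 'resize', nil, nil])
  space.write(['response', command, id, "resized #{payload}"])
end

# Requester: the response template matches on type, command, and its own ID.
id = ids.next_id
space.write(['request', 'resize', id, 'cat.png'])
response = space.take(['response', 'resize', id, nil])
worker.join
puts response.last  # => resized cat.png
```

Matching the response on the command as well as the ID keeps the pool of candidate tuples small, which is the point made in the answer above.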

Q. Have you used expiry dates to grab abandoned tuples? No, sounds like an interesting possibility.
