Visualization of Ruby's Grammar
Posted by Nick Sieger Fri, 27 Oct 2006 16:48:00 GMT
As part of the momentum surrounding the Ruby implementer’s summit, I have decided to take on a pet project to understand Ruby’s grammar better, with the goal of contributing to an implementation-independent specification of the grammar. Matz mentioned during his keynote how parse.y was one of the uglier parts of Ruby, but just how ugly?
Well, judge for yourself. Below is a grammar dependency graph generated using ANTLRWorks and GraphViz. The steps I took are as follows. I took parse.y, stripped all C definitions, code and actions from it to give a bare YACC definition. Next, I did the equivalent of
I haven’t even begun to absorb all the meanings from this picture, but one stark difference between Ruby and the other two (Java and ECMAScript) is the node in the middle of the picture with a high concentration of outgoing edges. That node is called primary in the grammar definition, and it is probably one of the reasons that Ruby syntax is so flexible and forgiving. A primary node’s direct children apparently represent a large portion of the syntax, which explains why in Ruby a single statement can be a literal, a method invocation (or a series of them), a standalone expression (such as a < b), or a larger syntactic grouping such as if ... else ... end or begin ... rescue ... end, among many others.
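To make the "high concentration of outgoing edges" idea concrete, here is a minimal sketch of how a rule-dependency graph can be derived from a bare YACC definition and how the busiest node can be picked out. It is not what ANTLRWorks does internally. The grammar below is a tiny hand-written toy for illustration, not an excerpt from parse.y, and the helper names dependency_graph and to_dot are made up for this example.

```ruby
# Toy grammar in bare YACC style. Hypothetical: written for this example,
# NOT copied from parse.y. Lowercase names are rules; k*/t* names are tokens.
TOY_GRAMMAR = <<~YACC
  stmt        : expr
              | stmt kIF expr
              ;
  expr        : arg
              | kNOT expr
              ;
  arg         : primary
              | arg '+' arg
              ;
  primary     : literal
              | method_call
              | tLPAREN expr ')'
              | kIF expr stmt kEND
              | kBEGIN stmt kEND
              ;
  literal     : tINTEGER
              ;
  method_call : primary '.' tIDENTIFIER
              ;
YACC

# Map each rule name to the sorted list of other rules it refers to.
# Assumes actions and C code are already stripped, as described in the post.
def dependency_graph(source)
  # Drop character literals such as '+' so they cannot confuse the splitting.
  bare = source.gsub(/'(?:\\.|[^'\\])*'/, " ")

  bodies = {}
  bare.split(";").each do |chunk|
    name, body = chunk.split(":", 2)
    next if body.nil? # trailing whitespace after the last rule
    bodies[name.strip] = body.scan(/[A-Za-z_]\w*/)
  end

  # Keep only symbols that are rule names themselves, minus self-references.
  bodies.each_with_object({}) do |(name, symbols), graph|
    graph[name] = symbols.select { |s| bodies.key?(s) && s != name }.uniq.sort
  end
end

# Render the graph in GraphViz DOT syntax, one edge per line.
def to_dot(graph)
  edges = graph.flat_map { |from, tos| tos.map { |to| "  #{from} -> #{to};" } }
  (["digraph grammar {"] + edges + ["}"]).join("\n")
end

graph = dependency_graph(TOY_GRAMMAR)
hub, deps = graph.max_by { |_, targets| targets.size }
puts "busiest rule: #{hub} (#{deps.size} outgoing edges)"
puts to_dot(graph)
```

Even in this toy, primary ends up with the most outgoing edges, because its alternatives reach back up into expr and stmt as well as down into literal and method_call. The DOT text can be saved to a file and rendered with GraphViz (for example, dot -Tpng).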
Generated from Java 1.5 grammar on antlr.org
Generated from ECMAScript grammar on antlr.org