[antlr-interest] ANTLR v3 ruby grammar

Sun Nov 27 13:07:40 PST 2005

On 11/27/05, Terence Parr <parrt at cs.usfca.edu> wrote:
> BTW, I'm working on a Ruby grammar as a means of testing and making
> ANTLR more friendly; right now it crashes with any bad input.  If
> anybody wants to help with the ruby grammar, you can lurk/join-in on
> the list:
>
> http://rubyforge.org/mailman/listinfo/rubygrammar-grammarians
>
> and here is the rubyforge project:
>
> http://rubyforge.org/projects/rubygrammar/
>
> Ter
>

I am quite interested.  I'm also trying to get a Ruby parser working
in an LL parser (my grammar project).  I think it would be quite
useful to collaborate.  Here are a few questions I have:

- What are you using as a spec?  parse.y from ruby 1.8?  This is what I'm using.

- Are you going to have the parser control the lexer state as it is
done in parse.y?  I'm doing this and possibly refactoring a little.
The big downside is that the parser and lexer have to stay
synchronized, which makes a multi-threaded solution (like mine) slow.
I'll redo mine at some point so that it is not multi-threaded.
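
A minimal single-threaded sketch of the idea, in Ruby: the parser sets
a state on the lexer before pulling each token, so the two stay in
step without threads.  The class name and the :beg/:end states are
hypothetical simplifications (loosely modelled on the EXPR_BEG and
EXPR_END lexer states in parse.y), and only words, "/" and /regexp/
literals are handled.

```ruby
# Hypothetical pull-style lexer: the parser sets #state before asking
# for the next token, so no threads or queues are needed to stay in sync.
class StatefulLexer
  attr_accessor :state   # :beg (operand expected) or :end (operator expected)

  def initialize(src)
    @src = src
    @pos = 0
    @state = :beg
  end

  # Returns [type, text], or nil at end of input.
  def next_token
    @pos += 1 while @src[@pos] == " "
    c = @src[@pos]
    return nil if c.nil?
    if c == "/" && @state == :beg
      # An operand is expected, so "/" starts a regexp literal.
      close = @src.index("/", @pos + 1) or raise "unterminated regexp"
      tok = [:regexp, @src[@pos..close]]
      @pos = close + 1
    elsif c == "/"
      # An operator is expected, so "/" is division.
      tok = [:op, "/"]
      @pos += 1
    else
      word = @src[@pos..-1][/\A\w+/] or raise "unexpected character #{c}"
      tok = [:word, word]
      @pos += word.length
    end
    tok
  end
end

lexer = StatefulLexer.new("a / b")
lexer.next_token       # [:word, "a"]
lexer.state = :end     # the parser has just seen an operand
lexer.next_token       # [:op, "/"]
```

The same "/" comes back as the start of a regexp if the parser leaves
the state at :beg, which is the kind of decision parse.y makes through
its lexer state.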

- Do heredocs look feasible?  I think this is the toughest lexer
feature.  My current plan is to read/store the rest of the line, lex
the heredoc, and push ("unread") the stored rest of line back to the
input cursor so that it reads it next.  I'll need to use an
appropriate buffering cursor to be able to do this (usually my lexers
go almost directly from the IO - just needing IO#getc).
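
The read/store/unread plan above can be sketched in Ruby roughly as
follows.  PushbackReader and lex_heredoc are hypothetical names, not
taken from any existing lexer, and the sketch assumes the plain
<<TERMINATOR form only (no <<- indentation, no quoting or
interpolation).

```ruby
require "stringio"

# Hypothetical buffering cursor: wraps anything that responds to #getc
# and adds an "unread" stack on top of it.
class PushbackReader
  def initialize(io)
    @io = io
    @pushed = []   # characters to hand out before reading @io again
  end

  def getc
    @pushed.empty? ? @io.getc : @pushed.pop
  end

  # Push a string back so that its first character is read next.
  def unread(str)
    @pushed.concat(str.chars.reverse)
  end

  # Read up to and including the next newline (or to end of input).
  def read_line
    line = +""
    while (c = getc)
      line << c
      break if c == "\n"
    end
    line
  end
end

# Called right after the lexer has consumed "<<TERMINATOR".
# Returns the heredoc body and pushes the rest of the opening line back.
def lex_heredoc(reader, terminator)
  rest_of_line = reader.read_line        # store the rest of the line
  body = +""
  loop do
    line = reader.read_line
    break if line.empty? || line.chomp == terminator
    body << line
  end
  reader.unread(rest_of_line)            # it is lexed next
  body
end

reader = PushbackReader.new(StringIO.new("puts(<<EOS.upcase)\nhi\nEOS\nputs 1\n"))
10.times { reader.getc }            # pretend the lexer consumed "puts(<<EOS"
body = lex_heredoc(reader, "EOS")   # body == "hi\n"
rest = reader.read_line             # rest == ".upcase)\n"
```

Because only #getc is required of the wrapped object, the same cursor
should work over a File or a socket as well as a StringIO.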

- Are you planning on doing a complete LR to LL conversion or doing
something else where it gets messy?  I believe in some places doing a
complete conversion may make portions of the grammar grow unwieldy.
An example is an LHS vs an expression.  With LR parsing, you can figure
out whether something is assignable by looking at the right side of
the expression.  Currently, my plan is to pass around some parser
state where necessary.  For example, when parsing an expression, I'll
return whether the expression is assignable or not.
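
A minimal sketch of passing that state around in a hand-written
recursive-descent parser, in Ruby.  Everything here is hypothetical
(the ToyParser class, the Node struct, and a tiny grammar with only
identifiers, integers, indexing, "+" and "="); the point is just that
each parse method returns an assignable flag with its node, so "=" can
be checked after the left side has already been parsed as an ordinary
expression.

```ruby
# Each node records whether it may appear on the left of "=".
Node = Struct.new(:type, :children, :assignable)

class ToyParser
  def initialize(tokens)
    @tokens = tokens   # tokens are pre-split strings
    @pos = 0
  end

  def peek
    @tokens[@pos]
  end

  def advance
    tok = @tokens[@pos]
    @pos += 1
    tok
  end

  # statement : expression ('=' statement)?
  def parse_statement
    left = parse_expression
    return left unless peek == "="
    raise "cannot assign to #{left.type}" unless left.assignable
    advance
    Node.new(:assign, [left, parse_statement], false)
  end

  # expression : primary ('+' primary)*
  def parse_expression
    node = parse_primary
    while peek == "+"
      advance
      node = Node.new(:add, [node, parse_primary], false)
    end
    node
  end

  # primary : IDENT ('[' expression ']')* | INT
  def parse_primary
    tok = advance
    raise "unexpected end of input" if tok.nil?
    if tok =~ /\A[a-z_]\w*\z/
      node = Node.new(:ident, [tok], true)
      while peek == "["
        advance
        index = parse_expression
        raise "expected ]" unless advance == "]"
        node = Node.new(:index, [node, index], true)
      end
      node
    elsif tok =~ /\A\d+\z/
      Node.new(:int, [tok], false)
    else
      raise "unexpected token #{tok}"
    end
  end
end

ToyParser.new(["a", "[", "0", "]", "=", "b", "+", "1"]).parse_statement
# parses as an :assign node; ["a", "+", "1", "=", "2"] raises instead
```

This keeps a single expression rule instead of separate LHS and
expression rules, at the cost of an extra check when "=" is seen.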

- What's the status?  With mine, I have a lexer with all the states
(but the parser isn't manipulating them yet).  The lexer is fairly
complete with the exception of various quoting mechanisms (heredocs
are missing completely).  I'm still working through the parser to
convert from LR to LL.

- Are you planning on doing ruby code generation for ANTLR?

Good luck!  Ruby is a very tough thing to parse in my opinion.

Eric