[antlr-interest] Lexer for Ruby's heredoc syntax

Mon Apr 23 09:30:37 PDT 2012

You have to use some member variables in the lexer and hand-craft a method
to do what you need. You will also need to add code at the newline rule
that notices you have recorded a span of text to skip, and skips it before
the next LA call. That means you need to mark where the last heredoc body
you processed ends. You will also need to report an error when the
delimiter is missing, and so on.

It is easy enough with a little thought, but to me it is another example
of Ruby's arbitrary syntax and functionality.

HEREDOC: '<<' { setText(processHere()); } ;
NL: '\n' { skipHereDoc(); } ;

You need to look at input.mark(), skip() and rewind() (or is it release()?
I cannot remember).
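
As a rough illustration of the idea above, here is a minimal standalone
sketch in Java. It does not use any ANTLR classes: a plain String index
stands in for the input stream, and saving and restoring that index stands
in for mark and rewind. The method names processHere and skipHereDoc come
from the two rules above; the class name HereDocSketch, the fields pos,
hereDocEnd and bodies, and the tokenize loop are all hypothetical, made up
for this example. It assumes the plain <<TAG form, where the terminator must
be alone on its line with no indentation.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch: 'pos' plays the role of the lexer's input index.
class HereDocSketch {
    private final String input;
    private int pos = 0;
    private int hereDocEnd = -1;  // end of the heredoc text already read, or -1
    final List<String> bodies = new ArrayList<>();

    HereDocSketch(String input) { this.input = input; }

    // Called right after "<<" has been consumed; returns the heredoc body.
    String processHere() {
        int start = pos;
        while (pos < input.length() && isWordChar(input.charAt(pos))) pos++;
        String delim = input.substring(start, pos);
        if (delim.isEmpty()) throw new IllegalStateException("missing heredoc tag");
        int mark = pos;  // lexing of the current line resumes here
        // The body starts after any earlier heredoc from this line,
        // otherwise at the start of the next line.
        int lineStart;
        if (hereDocEnd >= 0) {
            lineStart = hereDocEnd;
        } else {
            int nl = input.indexOf('\n', pos);
            lineStart = nl < 0 ? input.length() : nl + 1;
        }
        StringBuilder body = new StringBuilder();
        while (lineStart < input.length()) {
            int nl = input.indexOf('\n', lineStart);
            int lineEnd = nl < 0 ? input.length() : nl;
            int next = nl < 0 ? input.length() : nl + 1;
            String line = input.substring(lineStart, lineEnd);
            if (line.equals(delim)) {
                hereDocEnd = next;  // remember what the newline rule must skip
                pos = mark;         // "rewind" to the rest of the current line
                return body.toString();
            }
            body.append(line).append('\n');
            lineStart = next;
        }
        throw new IllegalStateException("unterminated heredoc: " + delim);
    }

    // Called right after '\n' has been consumed: jump over heredoc text
    // that processHere() has already read.
    void skipHereDoc() {
        if (hereDocEnd >= 0) {
            pos = hereDocEnd;
            hereDocEnd = -1;
        }
    }

    // Toy driver loop that produces token type names only.
    List<String> tokenize() {
        List<String> tokens = new ArrayList<>();
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (input.startsWith("<<", pos)) {
                pos += 2;
                bodies.add(processHere());
                tokens.add("HEREDOC");
            } else if (c == '\n') {
                pos++;
                skipHereDoc();
                tokens.add("EOL");
            } else if (c == '(') { pos++; tokens.add("LPAREN");
            } else if (c == ')') { pos++; tokens.add("RPAREN");
            } else if (c == ',') { pos++; tokens.add("COMMA");
            } else if (Character.isDigit(c)) {
                while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
                tokens.add("INTEGER");
            } else if (isWordChar(c)) {
                while (pos < input.length() && isWordChar(input.charAt(pos))) pos++;
                tokens.add("IDENTIFIER");
            } else {
                pos++;  // whitespace and anything else is ignored in this sketch
            }
        }
        return tokens;
    }

    private static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    public static void main(String[] args) {
        String src = "method_name(<<DELIM, 123, <<OTHERDELIM)\n"
                   + "    first\nDELIM\n   third\nOTHERDELIM\notherMethod()\n";
        HereDocSketch lexer = new HereDocSketch(src);
        System.out.println(String.join(" ", lexer.tokenize()));
    }
}
```

The second heredoc on a line starts reading where the first one ended, which
is why hereDocEnd is kept until the newline rule clears it. In a real ANTLR
lexer the same bookkeeping would live in lexer members and operate on the
input stream rather than on a String.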

Having looked at parsing Ruby before, this is the least of your worries to
be honest - you are trying to be bug compatible with an interpreter and
there is almost always something that is impossible to determine at parse
time vs runtime.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Andrea Polci
> Sent: Monday, April 23, 2012 9:12 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Lexer for Ruby's heredoc syntax
>
> I'm trying to write a parser that supports Ruby's heredoc syntax, which
> allows you to write something like this:
>
> method_name(<<DELIM, 123, <<OTHERDELIM)
>     This is the content for the
>     first argument of the method
> DELIM
>    This is the content for the third argument of the method
> OTHERDELIM
> otherMethod()
> ....
>
>
> The lexer should produce the following token stream:
> IDENTIFIER LPAREN HEREDOC COMMA INTEGER COMMA HEREDOC RPAREN EOL
> IDENTIFIER ...
>
> What the lexer should do when it finds a heredoc tag (<<XXXX) is to
> mark the current position, skip to the following line, then consume all
> the characters until it finds the delimiter matching the tag. After
> that it should rewind to the mark previously set.
> The problem is that it should then skip all the lines that are part of
> the heredoc already analysed.
>
> Is there a way to do something similar?
> All I can think of is to wrap the input CharStream so that I can mark
> lines to be ignored by calls to input.consume() (and other
> methods of the CharStream interface).
>
> Thanks for any help.
>
> Andrea