[antlr-interest] Look-ahead problem parsing phrase?

Sun Jun 28 13:52:26 PDT 2009

Hi Sean,

I am not sure about this, but I think you have to rearrange the order of the
rules. More specifically, ANTLR will use the rules in the order they appear,
so WS should probably be near the end of your list.
Maybe something like this would work:

grammar Phrase;

line : WS? PHRASE EOL?;

PHRASE : WORD (WS WORD)*;

WORD : (LETTER|DIGIT)+;

EOL : WS* NEWLINE*;
LETTER : ('a'..'z'|'A'..'Z');
DIGIT : ('0'..'9');
NEWLINE : '\r'? '\n';
WS : (' '|'\t')+;

Further, you should probably match EOF explicitly at the end of the line
rule, maybe something like

line : WS? PHRASE EOL? EOF?;
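
Another thing that might be worth trying: I suspect that because PHRASE is a
lexer rule, the lexer commits to another pass of the (WS WORD)* loop as soon
as it sees the trailing whitespace, and then fails when no WORD follows. That
would fit the "(...)+ loop did not match anything" message at '<EOF>'. A
rough sketch that moves the phrase into a parser rule, where ANTLR can look
ahead over whole tokens, is below. The parser rule name "phrase" and the
"fragment" markers on LETTER and DIGIT are my own assumptions, not something
taken from your grammar:

    grammar Phrase;

    // Parser rules: optional whitespace around the phrase, then an
    // optional newline, then end of input.
    line   : WS? phrase WS? NEWLINE? EOF ;

    // A phrase is one or more words separated by whitespace. The parser
    // can see whether a WORD follows the WS before taking the loop again.
    phrase : WORD (WS WORD)* ;

    // Lexer rules: only tokens the parser actually refers to.
    WORD    : (LETTER | DIGIT)+ ;
    NEWLINE : '\r'? '\n' ;
    WS      : (' ' | '\t')+ ;

    // Fragments are helpers for other lexer rules and never become
    // tokens on their own, so they cannot compete with WORD.
    fragment LETTER : 'a'..'z' | 'A'..'Z' ;
    fragment DIGIT  : '0'..'9' ;

With this layout, the input " This is a phrase " should come out of the lexer
as WS WORD WS WORD WS WORD WS WORD WS, and the last WS would be picked up by
the second WS? in the line rule instead of being pulled into the phrase.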

(Note: I haven't tested what I suggest above)
Hope this helps.

Nikos

On Sun, Jun 28, 2009 at 6:54 PM, Sean O'Dell <sean at celsoft.com> wrote:

> Hi everyone,
>
> I'm new to the mailing list and am just getting started with ANTLR (day 2)
> and I've run into an issue I'm having some trouble wrapping my head around.
> I think it's related to how ANTLR looks ahead to predict tokens, but I think
> I'm overlaying my familiarity with regular expressions onto ANTLR in a way
> that is clouding my understanding of what ANTLR does and needs in order
> to do this right.
>
> I'm trying to parse out a collection of "words" on a single line as a
> "phrase", ignoring whitespace at the beginning and end of the lines, but
> I'm getting an error while parsing what I think is a line that should match
> the grammar.
>
> My grammar:
>
>     grammar Phrase;
>
>     WS : (' '|'\t')+;
>     DIGIT : ('0'..'9');
>     LETTER : ('a'..'z'|'A'..'Z');
>     NEWLINE : '\r'? '\n';
>
>     WORD : (LETTER|DIGIT)+;
>
>     EOL : WS? NEWLINE?;
>
>     PHRASE : WORD (WS WORD)*;
>
>     line : WS? PHRASE EOL?;
>
> The line of text I am parsing (note whitespace at the ends):
> " This is a phrase "
>
> Error I get during parse:
>
>     line 1:18 required (...)+ loop did not match anything at character '<EOF>'
>     line 0:-1 missing PHRASE at '<EOF>'
>
> What might be causing this error, and what might be a good, clean way to
> parse out the "phrase" in the input text?
>
> Sean O'Dell