[antlr-interest] Look-ahead problem parsing phrase?

Sun Jun 28 14:13:58 PDT 2009

Thanks for the reply Nick!

I tried moving the order around, but everything I've tried so far has resulted
in similar errors at parse time.  If I move things around too much, I get
errors generating the lexer and parser.  I've come across so many errors
from attempting so many combinations that I'm not sure whether I'm moving
forward or back on the matter.

I may have simply overlooked some key documentation that would help me
understand what the issue is.  I'm just not sure what I've missed.  Any
suggestions?
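
One direction that might be worth trying, given here only as an untested
sketch: as far as I understand it, an ANTLR 3 lexer does not backtrack the way
a regular expression engine does. Once PHRASE sees the trailing whitespace it
would commit to another (WS WORD) iteration and then fail when no WORD follows,
which would fit the "required (...)+ loop did not match anything" message at
1:18. The sketch below assumes ANTLR 3 with the Java target (the
$channel = HIDDEN action is specific to that setup), keeps the lexer down to
simple tokens, and moves the phrase into a parser rule. The "phrase" rule name
and the use of "fragment" are illustrative choices, not something taken from
Nick's version.

```antlr
grammar Phrase;

// Parser rules: a line is a phrase, an optional newline, then end of input.
line   : phrase NEWLINE? EOF ;

// A phrase is one or more words; whitespace never reaches the parser.
phrase : WORD+ ;

// Lexer rules: only WORD, NEWLINE and WS produce tokens.
WORD    : (LETTER | DIGIT)+ ;
NEWLINE : '\r'? '\n' ;

// Assumption: Java target. This action routes whitespace to the hidden channel.
WS      : (' ' | '\t')+ { $channel = HIDDEN; } ;

// Fragments are helper rules and never become tokens on their own.
fragment LETTER : 'a'..'z' | 'A'..'Z' ;
fragment DIGIT  : '0'..'9' ;
```

If this works as intended, leading and trailing blanks would need no special
handling, and the words of the phrase could be read from the WORD tokens
matched by the phrase rule.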

Sean

On Sun, Jun 28, 2009 at 1:52 PM, Nick Vlassopoulos
<nvlassopoulos at gmail.com> wrote:

> Hi Sean,
>
> I am not sure about this, but I think you have to rearrange the order of
> the rules. More specifically,
> ANTLR will use rules in the order they appear; therefore, WS should
> probably be near the end of your list.
> Maybe something like this would work:
>
> grammar Phrase;
>
> line : WS? PHRASE EOL?;
>
> PHRASE : WORD (WS WORD)*;
>
> WORD : (LETTER|DIGIT)+;
>
> EOL : WS* NEWLINE*;
> LETTER : ('a'..'z'|'A'..'Z');
> DIGIT : ('0'..'9');
> NEWLINE : '\r'? '\n';
> WS : (' '|'\t')+;
>
> Further, you should probably add a rule for EOF, maybe something like
>
> line : WS? PHRASE EOL? EOF?;
>
> (Note: I haven't tested what I suggest above)
> Hope this helps.
>
> Nikos
>
> On Sun, Jun 28, 2009 at 6:54 PM, Sean O'Dell <sean at celsoft.com> wrote:
>
>> Hi everyone,
>>
>> I'm new to the mailing list and am just getting started with ANTLR (day
>> 2), and I've run into an issue I'm having some trouble wrapping my head
>> around.  I think it's related to how ANTLR looks ahead to predict tokens,
>> but I think I'm overlaying my familiarity with regular expressions onto
>> ANTLR in a way that is clouding my understanding of what ANTLR does and
>> needs in order to do this right.
>>
>> I'm trying to parse out a collection of "words" on a single line as a
>> "phrase", ignoring whitespace at the beginning and end of the lines, but
>> I'm getting an error while parsing what I think is a line that should match
>> the grammar.
>>
>> My grammar:
>>
>>     grammar Phrase;
>>
>>     WS : (' '|'\t')+;
>>     DIGIT : ('0'..'9');
>>     LETTER : ('a'..'z'|'A'..'Z');
>>     NEWLINE : '\r'? '\n';
>>
>>     WORD : (LETTER|DIGIT)+;
>>
>>     EOL : WS? NEWLINE?;
>>
>>     PHRASE : WORD (WS WORD)*;
>>
>>     line : WS? PHRASE EOL?;
>>
>> The line of text I am parsing (note whitespace at the ends):
>> " This is a phrase "
>>
>> Error I get during parse:
>>
>>     line 1:18 required (...)+ loop did not match anything at character '<EOF>'
>>     line 0:-1 missing PHRASE at '<EOF>'
>>
>> What might be causing this error, and what might be a good, clean way to
>> parse out the "phrase" in the input text?
>>
>> Sean O'Dell
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe:
>> http://www.antlr.org/mailman/options/antlr-interest/your-email-address