[antlr-interest] Integer literal ending problem
Justin Murray
jmurray at aerotech.com
Tue Dec 13 05:50:05 PST 2011
Hello Anton,
Why are two tokens a problem in that case? That is exactly what your
lexer grammar dictates. If you want "123A" to be an error, reject it in
the parsing stage (not during lexing) by simply making sure that you
don't have a rule like:
myrule: INT ID;
If for some reason "123A" should be invalid, but "123 A" is ok, then you
will need to use whitespace as part of your grammar (which means the WS
rule can no longer call skip(), or the parser will never see WS tokens):
myrule: INT WS ID;
This is not how most languages work, though; it is usually better if
whitespace can be ignored. Typically some other delimiter, such as an
operator or a comma, comes between an INT and an ID.
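A minimal Python sketch of the behaviour described above, not ANTLR-generated code. The regular expressions stand in for the INT, ID and WS rules from the quoted grammar; the COMMA token, the tokenize and matches helpers, and the keep_ws flag are assumptions made purely for illustration, and regex alternation only approximates how an ANTLR lexer chooses a rule. It shows that "123A" lexes as two tokens and is rejected only because no rule accepts an INT directly followed by an ID.

```python
import re

# Hypothetical stand-ins for the lexer rules in the quoted grammar.
# COMMA is an assumed delimiter token that the original grammar does not have.
TOKEN_SPEC = [
    ("INT", r"[0-9]+"),
    ("ID", r"[A-Z][A-Z0-9]*"),
    ("COMMA", r","),
    ("WS", r"[ \r\n]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text, keep_ws=False):
    """Split text into (type, text) pairs; WS is dropped unless keep_ws is set."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "WS" or keep_ws:
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def matches(rule, tokens):
    """Return True if the token types are exactly the sequence named by rule."""
    return [kind for kind, _ in tokens] == rule


if __name__ == "__main__":
    # The lexer is doing its job: "123A" is an INT followed by an ID.
    print(tokenize("123A"))  # [('INT', '123'), ('ID', 'A')]

    # Whitespace skipped, delimiter required: the rule decides what is valid.
    print(matches(["INT", "COMMA", "ID"], tokenize("123A")))     # False
    print(matches(["INT", "COMMA", "ID"], tokenize("123 , A")))  # True

    # Whitespace kept as a token, as in "myrule: INT WS ID;".
    print(matches(["INT", "WS", "ID"], tokenize("123 A", keep_ws=True)))  # True
    print(matches(["INT", "WS", "ID"], tokenize("123A", keep_ws=True)))   # False
```

In both variants the lexer itself never fails on "123A"; the input is rejected only because the token sequence INT ID matches no rule.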
- Justin
On 12/13/2011 6:46 AM, Shevchenko A wrote:
> Hello,
>
> I am trying to write some tests for the lexer generated by ANTLR.
> My grammar is simple:
> INT: ('0'..'9')+;
> ID: ('A'..'Z') ('A'..'Z' | '0'..'9')* ;
> WS: (' ' | '\r' | '\n')+ { skip(); };
>
> With this grammar the lexer splits the string "123A" into 2 tokens,
> and this is undesirable.
> If I require that an integer be followed by whitespace, another problem
> comes up: an integer can be ended not only by whitespace but also by any
> special character.
>
> So, my question is: what is the best practice for solving this problem?
> Thanks in advance.
>
> --
> Regards,
> Anton Shevchenko,
> 1C Company, Moscow.