[antlr-interest] Invalid parser generation
Stefan Mätje
Stefan.Maetje at esd-electronics.com
Tue Sep 4 06:38:55 PDT 2012
On 04.09.2012 14:35, mark4 at voila.fr wrote:
> Hi Stefan,
>
> Thanks for your reply. I didn't understand the difference between
> lexer rules and parser rules because,
> in fine, a parser rule will always resolve in a series of lexer
> rules...
Please don't mix up the lexer phase and the parser phase. The lexer
deals with single characters and groups them into tokens.
The parser doesn't know anything about single characters and deals only
with tokens.
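To make the separation concrete, here is a minimal Python sketch. It is
not ANTLR's generated code; the names tokenize and parse_assignment and
the EQ and WS rules are made up for illustration. The lexer function is
the only place that looks at characters, and the parser function only
ever sees token types.

```python
import re

# Lexer rules: (token type, regex over characters).
# EQ and WS are hypothetical extra rules added for this example.
LEXER_RULES = [
    ("COMPTE", r"[0-9]+"),
    ("ID", r"[a-zA-Z_][a-zA-Z0-9_]*"),
    ("EQ", r"="),
    ("WS", r"\s+"),
]

def tokenize(text):
    """Lexer phase: group single characters into (type, text) tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        for name, pattern in LEXER_RULES:
            m = re.compile(pattern).match(text, pos)
            if m:
                if name != "WS":  # whitespace is matched but not emitted
                    tokens.append((name, m.group()))
                pos = m.end()
                break
        else:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
    return tokens

def parse_assignment(tokens):
    """Parser phase: checks the token sequence ID EQ COMPTE, never characters."""
    types = [token_type for token_type, _ in tokens]
    return types == ["ID", "EQ", "COMPTE"]
```

For example, parse_assignment(tokenize("count = 42")) accepts the input
without the parser ever inspecting the characters 'c', 'o', 'u', ...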
> Anyway, I applied the modification but I now get an error:
>
> COMPTE : ('0'..'9')+;
>
> ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
>
> The following token definitions can never be matched because prior
> tokens match the same input: COMPTE,ID
You have rules in your grammar before COMPTE and ID that define a
superset of the character sequences that COMPTE and ID can match.
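A rough way to see that effect, as a Python sketch rather than ANTLR's
real analysis: the function below picks the rule with the longest match
and, on a tie, the rule written first. WORD is a hypothetical earlier
rule standing in for whatever precedes COMPTE and ID in the grammar; the
actual rule is not shown in this thread.

```python
import re

def winning_rule(rules, text):
    """Return the name of the rule matching the longest prefix of text.
    On a tie the earlier rule in the list wins."""
    best_name, best_len = None, 0
    for name, pattern in rules:
        m = re.match(pattern, text)
        if m and m.end() > best_len:  # strict '>' keeps the earlier rule on ties
            best_name, best_len = name, m.end()
    return best_name

SHADOWED = [
    ("WORD", r"[a-zA-Z0-9_]+"),            # hypothetical superset rule
    ("COMPTE", r"[0-9]+"),
    ("ID", r"[a-zA-Z_][a-zA-Z0-9_]*"),
]

for sample in ["123", "foo_1"]:
    print(sample, "->", winning_rule(SHADOWED, sample))
```

Both samples are claimed by WORD, so COMPTE and ID never get a chance.
Dropping the first entry (SHADOWED[1:]) lets them match again.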
> Well, I have several entities in my grammar that have different
> encoding forms, so how can I specify them one after the other?
If all of those forms should end up as one type of token, every
alternative pattern has to go into a single lexer rule.
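As an illustration only, here is a Python sketch of one token type with
two encodings folded into a single pattern. The hexadecimal form is an
assumed example of a "different encoding"; the real forms in Mark's
grammar are not given in this thread. In the grammar this would
correspond to one rule whose alternatives are separated by '|'.

```python
import re

# One token type, several encodings: all alternatives live in one pattern.
# The 0x... form is a hypothetical second encoding for COMPTE.
COMPTE = re.compile(r"0[xX][0-9a-fA-F]+|[0-9]+")

def lex_compte(text):
    """Return a ("COMPTE", text) token if the whole text is one of the forms."""
    m = COMPTE.fullmatch(text)
    return ("COMPTE", m.group()) if m else None
```

Both lex_compte("42") and lex_compte("0x2A") produce a COMPTE token,
while lex_compte("abc") produces nothing.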
> Thanks,
> Mark
>
As a rule of thumb, write the most specific lexer rules first and then
follow them with the less specific rules. The lexer gives the rules
that are written first a higher precedence.
So put your keywords first (which are fixed strings). Then follow them
with something like operators (also fixed strings). Rules that can
match many different strings, like ID and COMPTE, go last.
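The ordering can be sketched in Python as follows. This is a simplified
model (longest match, ties broken by rule order), not a description of
ANTLR's internals, and the keywords and operators 'if', 'else', '==' and
'+' are assumed examples since the thread does not list the real ones.

```python
import re

# Most specific first: keywords, then operators, then the general rules.
ORDERED_RULES = [
    ("IF", r"if"),          # hypothetical keyword
    ("ELSE", r"else"),      # hypothetical keyword
    ("EQEQ", r"=="),        # hypothetical operator
    ("PLUS", r"\+"),        # hypothetical operator
    ("COMPTE", r"[0-9]+"),
    ("ID", r"[a-zA-Z_][a-zA-Z0-9_]*"),
]

def classify(text, rules=ORDERED_RULES):
    """Longest match wins; on equal length the earlier rule wins."""
    best = None
    for name, pattern in rules:
        m = re.match(pattern, text)
        if m and (best is None or m.end() > best[1]):
            best = (name, m.end())
    return best[0] if best else None
```

With this order classify("if") yields IF while classify("iffy") still
yields ID. If ID is moved to the front of the list, classify("if")
yields ID and the IF rule can never win.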
See what ANTLRWorks tells you about multiple matches and which rules are
involved.
I don't know if this helps, but the rule that matches both COMPTE and ID
would be the most interesting one to look at.
Best regards,
Stefan
PS: Please also reply to the list.