[antlr-interest] Invalid parser generation

Stefan Mätje Stefan.Maetje at esd-electronics.com
Tue Sep 4 06:38:55 PDT 2012


On 04.09.2012 14:35, mark4 at voila.fr wrote:
 > Hi Stefan,
 >
 > Thanks for your reply. I didn't understand the difference between
 > lexer rules and parser rules because,
 > in fine, a parser rule will always resolve in a series of lexer
 > rules...

Please don't mix the lexer and the parser phase in your mind. The lexer 
deals with single characters and groups them into tokens.

The parser doesn't know anything about single characters and deals only 
with tokens.
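
A minimal sketch of that split, assuming ANTLR 3 syntax. The grammar name
Demo, the parser rule assignment and the WS rule are made up for
illustration; only COMPTE and ID come from your mail:

```
grammar Demo;

// Parser rule (lower-case name): sees only tokens, never characters.
assignment : ID '=' COMPTE ';' ;

// Lexer rules (upper-case names): group characters into tokens.
COMPTE : ('0'..'9')+ ;
ID     : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;

// Hypothetical whitespace rule: keeps blanks away from the parser.
WS     : (' '|'\t'|'\r'|'\n')+ { $channel = HIDDEN; } ;
```

For the input "x = 42;" the lexer would hand the parser the tokens ID,
'=', COMPTE and ';'. The parser never sees the characters 'x', '4' or '2'.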

 > Anyway, I applied the modification but I now get an error:
 >
 > COMPTE : ('0'..'9')+;
 >
 > ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
 >
 > The following token definitions can never be matched because prior
 > tokens match the same input: COMPTE,ID

You have rules in your grammar before COMPTE and ID that define a 
superset of the character sequences that COMPTE and ID can match.
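
I can't see the rest of your grammar, so the following is only a guess at
the kind of rule that causes this. The rule WORD and the grammar name
Shadowed are hypothetical:

```
lexer grammar Shadowed;

// Hypothetical earlier rule: it accepts every digit string and every
// identifier, so nothing is left over for the two rules below.
WORD   : ('a'..'z'|'A'..'Z'|'0'..'9'|'_')+ ;

COMPTE : ('0'..'9')+ ;
ID     : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
```

With a rule like WORD in front, I would expect ANTLR to report the
message you quoted for COMPTE and ID.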

 > Well, I have several entities in my grammar that have different
 > encoding forms, so how can I specify them one after the other?

If the different encodings should all end up as one type of token, then
all of the needed alternatives have to go into one lexer rule.
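
As a sketch in ANTLR 3 syntax: suppose COMPTE could be written either as
plain digits or with a leading '#'. The '#' form, the helper DIGIT and the
grammar name OneRule are made up here, your real encodings will differ:

```
lexer grammar OneRule;

// Both forms produce the same token type COMPTE.
COMPTE : DIGIT+          // plain form, e.g. 1234
       | '#' DIGIT+      // hypothetical second form, e.g. #1234
       ;

// A fragment rule is only a helper for other lexer rules and never
// becomes a token of its own.
fragment DIGIT : '0'..'9' ;
```

The parser then only has to refer to COMPTE and does not need to know
which form was used in the input.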

 > Thanks,
 > Mark
 >

As a rule of thumb, write the most specific lexer rules first and then
follow them with the less specific rules. When several rules match the
same input, the lexer gives precedence to the rule written first.

So put your keywords first (which are fixed strings). Then follow them
with things like operators (also fixed strings). The general rules that
can match many different strings, like ID and COMPTE, come last.
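
A sketch of such an ordering in ANTLR 3 syntax. The keyword and operator
rules (IF, WHILE, PLUS, ASSIGN) and the grammar name Ordered are
hypothetical, I don't know which ones your language has:

```
lexer grammar Ordered;

// 1. Keywords: fixed strings.
IF     : 'if' ;
WHILE  : 'while' ;

// 2. Operators: also fixed strings.
PLUS   : '+' ;
ASSIGN : '=' ;

// 3. General rules that match many different strings.
COMPTE : ('0'..'9')+ ;
ID     : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
```

With this order the input "if" should come out as IF rather than ID,
because IF is written first, while a longer word such as "iffy" should
still be an ID.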

See what ANTLRWorks tells you about multiple matches and which rules are
involved.

I don't know if this helps, but it would be most interesting to see the
rule that matches both COMPTE and ID.

Best regards,
	Stefan

PS: Please also reply to the list.


