[antlr-interest] Looking for reference to how ANTLR performs ... special example will not work???

Jim Idle jimi at temporal-wave.com
Fri Sep 11 07:38:15 PDT 2009


On 09/11/2009 07:20 AM, Sylvain, Gregory [USA] wrote:
> Great replies, thank you. I had assumed the longest-match-wins rule 
> applied, but I wasn't sure - thanks.
> Here is an example of the sort of problems I am trying to figure out.
> r            : 'BEGIN/' f1=(number 'T') f2=field EOT EOL
> number : INT | FLOAT ;
> field      : ALPHANUM_CHAR+;
> ALPHANUM_CHAR : ( ALPHA_CHAR  | DIGIT | SPECIAL_CHAR | ' ')+;
> INT : DIGIT+ ;
> FLOAT : DIGIT+ '.' DIGIT+;
> fragment DIGIT : '0' .. '9' ;
> fragment ALPHA_CHAR : 'A' .. 'Z' ;
> SPECIAL_CHAR: ( ',' | '(' | ')' | '\\' );  // more special chars can be added here....
> EOT : '//'
> EOL : '\n';
>
Here, the three rules ALPHANUM_CHAR, INT and FLOAT can all match a 
DIGIT. This is completely ambiguous, and I imagine you are getting 
warnings about it when you build/debug the grammar? The simplest change 
would be to make ALPHANUM_CHAR a parser rule (make it lower case), then 
left factor FLOAT like this:

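fragment INT : ; // Never matched itself; only there to provide a token type
// Match the digits once; FLOAT if '.' DIGIT follows, otherwise retype as INT
FLOAT : DIGIT+ ( ('.' DIGIT)=> '.' DIGIT+  | {$type = INT;} ) ;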


Next, remove all the 'LITERALS' from your parser and code them in the 
lexer. As a beginner you will get thrown off if the literals happen to 
clash with any of your lexer rules. With more experience this confusion 
won't happen and you can use 'literals' if you prefer them (but I don't 
generally - see past posts about this topic).
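
As a rough sketch of that last step, assuming ANTLR 3 syntax: the token 
names BEGIN and T below are made up for illustration (they are not in the 
original grammar), and the f1/f2 labels are left out for brevity. It only 
shows the shape of the change, not a complete working grammar.

```antlr
// Parser rule: refers to named tokens instead of quoted 'literals'
r      : BEGIN number T field EOT EOL ;
number : INT | FLOAT ;

// Lexer rules: each former literal is now spelled out explicitly
BEGIN  : 'BEGIN/' ;   // hypothetical name for the 'BEGIN/' literal
T      : 'T' ;        // hypothetical name; overlaps with any rule matching 'A'..'Z'
EOT    : '//' ;
EOL    : '\n' ;
```

With the literals written out like this, an overlap such as T against a 
rule matching 'A'..'Z' sits in plain view in the lexer rather than being 
hidden inside a parser rule.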

Jim