[antlr-interest] Looking for reference to how ANTLR performs ... special example will not work???
Jim Idle
jimi at temporal-wave.com
Fri Sep 11 07:38:15 PDT 2009
On 09/11/2009 07:20 AM, Sylvain, Gregory [USA] wrote:
> Great replies, thank you. I had assumed the longest-match-wins rule
> applied, but I wasn't sure - thanks.
> Here is an example of the sort of problems I am trying to figure out.
> r : 'BEGIN/' f1=(number 'T') f2=field EOT EOL
> number : INT | FLOAT ;
> field : ALPHANUM_CHAR+;
> ALPHANUM_CHAR : ( ALPHA_CHAR | DIGIT | SPECIAL_CHAR | ' ')+;
> INT : DIGIT+ ;
> FLOAT : DIGIT+ '.' DIGIT+;
> fragment DIGIT : '0' .. '9' ;
> fragment ALPHA_CHAR : 'A' .. 'Z' ;
> SPECIAL_CHAR: ( ',' | '(' | ')' | '\\' ); // more special chars can be added here....
> EOT : '//'
> EOL : '\n';
>
Here, the three rules ALPHANUM_CHAR, INT and FLOAT can all match a
DIGIT. This is completely ambiguous, and I imagine you are getting
warnings about it when you build/debug the grammar? The simplest change
would be to make ALPHANUM_CHAR a parser rule (make it lower case), then
left factor FLOAT like this:
fragment INT : ; // To provide a token type
FLOAT : DIGIT+ ( ('.' DIGIT)=> '.' DIGIT+ | {$type = INT;} ) ;
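To make the intent of that rule concrete, here is a rough hand-written sketch in Python of the decision it encodes. This is not what ANTLR generates; the function name scan_number and the returned tuple shape are made up for illustration. It consumes DIGIT+, then peeks ahead for '.' followed by a digit (the job of the ('.' DIGIT)=> predicate) and only then commits to the fraction; otherwise it stops and reports the token as INT.

```python
def is_digit(ch):
    # Mirrors the fragment DIGIT : '0' .. '9' (ASCII digits only).
    return "0" <= ch <= "9"


def scan_number(text, pos=0):
    """Hypothetical helper: return (token_type, lexeme, next_pos)."""
    start = pos

    # DIGIT+
    while pos < len(text) and is_digit(text[pos]):
        pos += 1
    if pos == start:
        raise ValueError("expected a digit at position %d" % start)

    # ('.' DIGIT)=>  look ahead two characters without consuming anything.
    if pos + 1 < len(text) and text[pos] == "." and is_digit(text[pos + 1]):
        pos += 1  # consume '.'
        while pos < len(text) and is_digit(text[pos]):  # DIGIT+
            pos += 1
        return ("FLOAT", text[start:pos], pos)

    # Empty alternative: the {$type = INT;} action relabels the token.
    return ("INT", text[start:pos], pos)


if __name__ == "__main__":
    for sample in ("123T", "1.5T", "12.T"):
        print(sample, scan_number(sample))
```

Note the last sample: for "12.T" the lookahead fails, so the '.' is left in the input and the token comes back as INT with the text "12".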
Next, remove all the 'LITERALS' from your parser and code them in the
lexer. As a beginner you will get thrown off if the literals happen to
clash with any of your lexer rules. With more experience this confusion
won't happen and you can use 'literals' if you prefer them (but I don't
generally - see past posts about this topic).
Jim