[antlr-interest] Lexical error recovery by manual symbol (character) insertion/deletion?
    Darach Ennis 
    darach at gmail.com
       
    Fri Feb 15 17:36:59 PST 2008
    
    
  
Hi Gavin,
I think you've analysed this a lot more deeply than I have. Your
responses have been really helpful in improving my understanding,
so thank you! :)
> I agree.  I find it a bit irritating that I can't say "I'm
> creating this rule just for convenience; it doesn't need a token
> type id".  Although I'd probably be happier with something like
> this:
>
> tokens { FLOAT; }          // imaginary: type id generated but no warning
> fragment DIGIT: '0'..'9';  // fragment: no type id generated
> fragment NUMBER: DIGIT+;   // again, no type id
> INT                        // type id generated
>    : NUMBER
>     ( (DOT DIGIT) => DOT NUMBER { $type = FLOAT; } )?
>   ;
>
Yes, agreed. I tried a similar syntax early on, as the use of
tokens { ... } for lexer rules seems fairly natural. However, I used
the same syntax a little differently:
tokens { FLOAT; INT; }
fragment DIGIT: '0'..'9';
fragment NUMBER
  : DIGIT+ => INT // making this explicit is good documentation
    ( (DOT DIGIT) => DOT NUMBER { $type = FLOAT; } )?
  ;
This is a little more self-documenting (to my eyes) at the expense of
being a little more verbose. Using the rule name's type id in the
default case is clever, but there are cases where it would not work so
well:
fragment DIRECTIVE:
  '-' (
        'define' => DEFINE
      | 'include' => INCLUDE
      | 'if' => IF
      | ... etc
    );
In this case the '-' alone has no real meaning. So using the parent
rule's type id could be seen as a clever optimization from a certain
perspective.
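To make the intent of the two rules above concrete, here is a minimal
hand-written sketch in Python of the behaviour I have in mind. It is
not ANTLR output and not how ANTLR implements anything; the function
name lex, the DIRECTIVES table and the string token types are all
made up for illustration. The point is only that every token leaves
the lexer with an explicit type, and that neither NUMBER nor DIRECTIVE
ever shows up as a type of its own.

```python
from typing import List, Tuple

DIGITS = "0123456789"  # mirrors: fragment DIGIT: '0'..'9';

# Hypothetical table standing in for the 'define' => DEFINE alternatives.
DIRECTIVES = {"define": "DEFINE", "include": "INCLUDE", "if": "IF"}


def lex(text: str) -> List[Tuple[str, str]]:
    """Return (type, lexeme) pairs for numbers and '-' directives."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in DIGITS:
            start = i
            while i < n and text[i] in DIGITS:
                i += 1
            tok_type = "INT"  # explicit default, like DIGIT+ => INT
            # Stand-in for the (DOT DIGIT) => predicate: only consume
            # the '.' when a digit really follows it.
            if i + 1 < n and text[i] == "." and text[i + 1] in DIGITS:
                i += 1
                while i < n and text[i] in DIGITS:
                    i += 1
                tok_type = "FLOAT"  # like { $type = FLOAT; }
            tokens.append((tok_type, text[start:i]))
        elif ch == "-":
            start = i
            i += 1
            while i < n and text[i].isalpha():
                i += 1
            word = text[start + 1:i]
            if word not in DIRECTIVES:
                # A bare '-' (or unknown word) has no token type at all.
                raise ValueError("unknown directive: %r" % text[start:i])
            tokens.append((DIRECTIVES[word], text[start:i]))
        else:
            raise ValueError("unexpected character: %r" % ch)
    return tokens
```

For example, lex("12 3.5 -define") yields an INT, a FLOAT and a DEFINE
token, while lex("-") raises, since the '-' on its own maps to nothing.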
Regards,
Darach.