[antlr-interest] Tokens and literals: how to avoid conflicts?

Gioele Barabucci barabucc at cs.unibo.it
Tue Jul 15 08:09:05 PDT 2008


Jim Idle wrote:
> I really, once again, cannot stress too much the fact that new users
> should not use the inline 'quote' rules in the parser. They really send
> you down the wrong streets until you are completely familiar with the
> parser/lexer process. I look at your grammar and see the obvious
> problems, but I just don't see how new users would.
Could you please point me to guides or tutorials about the ANTLR lexer and
the correct "style" I should use to write token rules? I could not find
anything on the net.

> ID    : 'ID' ;
> IDENT : ('a'..'z' | 'A'..'Z')+ ;
> HASH  : '#'   // Many things prefix with HASH, differentiate them here
>                (  (FIX)=>FIX  { $type = FIX; }
>                   | (IMP)=>IMP {$type = IMP; }
>                   | // Neither keyword, sometimes HASH is just HASH and
>                     // not pounds
>                )
>             ;
> 
> Now, in the parser use the token names:
> 
> stmt: ID S idName S (IMP|FIX) EOF ;
> idName : HASH IDENT;

Thank you for this solution: I'll use it in many similar cases I have in my
grammar.

Sadly, this solution works only where there is a specific character that
one can use to discriminate. What about this example, where the text of a
keyword can also appear in other rules?

stmt: ID S simple_name S ('#IMP'|'#FIX') EOF;

simple_name: NAME;
ID: 'id';
NAME: ('a'..'z')+ ;
S: (' '|'\n')+ ;

This grammar will recognise 'id ix #FIX' but will fail on 'id id #FIX' with
the usual MismatchedTokenException. The second 'id' is lexed as an ID token,
so it cannot be recognized as a NAME token.
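
To make the failure concrete, here is a rough Python sketch of what I
believe the lexer is doing with the two inputs. It is not ANTLR code and not
what ANTLR generates: the tokenize function, the RULES table and the T_IMP /
T_FIX names for the implicit '#IMP' / '#FIX' tokens are all made up for
illustration. It assumes the lexer picks a token type from the characters
alone (longest match, ties going to the rule listed first) and knows nothing
about where the parser currently is.

```python
import re

# Hypothetical stand-in for the generated lexer. Every rule is tried at the
# current offset; the longest match wins and ties go to the earlier rule.
RULES = [
    ("T_IMP", re.compile(r"#IMP")),   # assumed name for the implicit '#IMP' token
    ("T_FIX", re.compile(r"#FIX")),   # assumed name for the implicit '#FIX' token
    ("ID", re.compile(r"id")),
    ("NAME", re.compile(r"[a-z]+")),
    ("S", re.compile(r"[ \n]+")),
]


def tokenize(text):
    """Return the token types for text, with no knowledge of parser context."""
    types = []
    pos = 0
    while pos < len(text):
        best_type, best_len = None, 0
        for token_type, pattern in RULES:
            match = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches tie in length.
            if match and match.end() - pos > best_len:
                best_type, best_len = token_type, match.end() - pos
        if best_type is None:
            raise ValueError(f"no rule matches at offset {pos}")
        types.append(best_type)
        pos += best_len
    return types


# What the stmt rule expects (before EOF), as a flat sequence of token types.
EXPECTED = ["ID", "S", "NAME", "S", "T_FIX"]

print(tokenize("id ix #FIX") == EXPECTED)  # True
print(tokenize("id id #FIX"))  # ['ID', 'S', 'ID', 'S', 'T_FIX']
```

In the second input the third token comes out as ID where stmt wants NAME,
which is where the parser gives up.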

Is there a way to tell ANTLR "look for the characters 'id' only when in the
ID token, in all the other cases classify it as NAME (or whatever fits
it)"?

This happens quite often in my grammar (obviously this is just a simple
test-case for my problems): I have many keywords that lose their special
meaning once they are not in a certain position.

-- 
Gioele Barabucci <barabucc at cs.unibo.it>


