[antlr-interest] Lexer problem: distinguish between TIC and CHAR_LITERAL

Martin Probst mail at martin-probst.com
Tue Sep 13 06:44:56 PDT 2005


Hi,

> The character literal is defined as two tics with one character in
> between.
> To decide if a tic is a TIC or the beginning of a CHARACTER_LITERAL,
> we check if at LA(3) follows another tic or not.

If this is the definition of your language, then this is (IMHO) just a
case of the language being ambiguous. How should the parser/lexer tell
what the user wants? In your example it's quite obvious (one reading
gives an error, the other one doesn't), but I think that you might find
places where it's not that easy.
> 
> new String'('b' & Second_Char);

If there are fixed rules about where tics may occur and where they are
disallowed, then you can probably write a stateful lexer that works
around the problem. But the much nicer solution would of course be a
sounder definition of the lexical structure, one that avoids these
ambiguities.
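
To make the stateful idea concrete, here is a rough sketch in plain
Python rather than an ANTLR grammar. It assumes one restriction that is
not stated in this thread: a tic directly after an identifier or a
closing parenthesis can never start a character literal. The function
name and all token names except TIC and CHAR_LITERAL are made up for the
illustration.

```python
import re

# Hypothetical identifier pattern; the real language may differ.
IDENT = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def tokenize(text):
    """Return (kind, value) pairs, using the previous token as lexer state."""
    tokens = []
    prev_kind = None  # the only state kept: kind of the last token emitted
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
            continue
        m = IDENT.match(text, i)
        if m:
            kind, value = "IDENT", m.group()
        elif ch == "'":
            # Assumed rule: after a name or ")" a tic is always a plain TIC.
            after_name = prev_kind in ("IDENT", "RPAREN")
            # Otherwise fall back to the LA(3) check from the original post.
            if not after_name and i + 2 < len(text) and text[i + 2] == "'":
                kind, value = "CHAR_LITERAL", text[i:i + 3]
            else:
                kind, value = "TIC", ch
        elif ch == "(":
            kind, value = "LPAREN", ch
        elif ch == ")":
            kind, value = "RPAREN", ch
        else:
            kind, value = "OTHER", ch  # everything else, one character each
        tokens.append((kind, value))
        prev_kind = kind
        i += len(value)
    return tokens
```

With the LA(3) check alone, the tic after String would be lexed together
with the following "(" and tic as a character literal. With the extra
state, the sketch yields a TIC there and a CHAR_LITERAL for 'b'. Whether
the assumed restriction really holds everywhere in your language is
something you would have to check against its definition.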

Martin