[antlr-interest] lexer rule matching problem

Martin Probst mail at martin-probst.com
Fri Jan 6 07:07:38 PST 2006


> tokens { HEX; }
> CONCAT : '&' (( 'h' (HEX_DIGIT)+ (('&')?)! ){ $setType(HEX); })? ;
> protected HEX_DIGIT : '0'..'9' | 'a'..'f' ;

What happens if someone wants to do this:

a = "foo"
h3 = "bar"
b = a&h3

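For the last line you'll end up with a token stream of IDENTIFIER EQUALS
IDENTIFIER HEX, because "&h3" is matched as a hex literal rather than as
a concat followed by the identifier h3.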
The lexer needs to know that it's in a non-operator state (where a
concat cannot occur) as the language is ambiguous otherwise. Maybe you
can also get around it by disambiguating in the parser, e.g. lex the '&'
simply as an AMPERSAND and let the parser figure out what it is.
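
To make the ambiguity concrete, here is a rough sketch in Python rather
than ANTLR. It is only an illustration, not the grammar from this thread:
the regular expressions, the STRING token, and the tokenize() function
with its track_state flag are all made up for the example. With
track_state=False it behaves like the rule quoted above; with
track_state=True it remembers whether the previous token was an operand
and, if so, treats '&' as a plain AMPERSAND.

```python
import re

# Hypothetical token definitions. Only HEX, and the idea of lexing '&' as a
# plain AMPERSAND, come from the thread; the other names are assumptions.
TOKEN_RE = re.compile(r"""
    (?P<WS>\s+)
  | (?P<STRING>"[^"]*")
  | (?P<HEX>&h[0-9a-f]+&?)
  | (?P<AMPERSAND>&)
  | (?P<EQUALS>=)
  | (?P<IDENTIFIER>[A-Za-z_][A-Za-z0-9_]*)
""", re.VERBOSE)

# Token types after which an operator (such as concat) may follow.
OPERAND_TYPES = {"IDENTIFIER", "STRING", "HEX"}


def tokenize(text, track_state=True):
    """Return a list of (type, lexeme) pairs for text.

    track_state=False always prefers HEX when the input reads '&h<digits>'.
    track_state=True emits AMPERSAND instead when the previous
    non-whitespace token was an operand.
    """
    tokens = []
    pos = 0
    prev = None  # type of the previous non-whitespace token
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("unexpected character at offset %d" % pos)
        kind, lexeme = m.lastgroup, m.group()
        if kind == "HEX" and track_state and prev in OPERAND_TYPES:
            # In operator position '&' can only be concat: consume just
            # the '&' and let the rest be lexed as an identifier.
            kind, lexeme = "AMPERSAND", "&"
        pos += len(lexeme)
        if kind != "WS":
            tokens.append((kind, lexeme))
            prev = kind
    return tokens


def kinds(text, track_state=True):
    """Convenience helper: only the token types, in order."""
    return [kind for kind, _ in tokenize(text, track_state)]


print(kinds("b = a&h3", track_state=False))
# ['IDENTIFIER', 'EQUALS', 'IDENTIFIER', 'HEX']
print(kinds("b = a&h3", track_state=True))
# ['IDENTIFIER', 'EQUALS', 'IDENTIFIER', 'AMPERSAND', 'IDENTIFIER']
print(kinds("x = &hff", track_state=True))
# ['IDENTIFIER', 'EQUALS', 'HEX']
```

The same decision could just as well be moved into the parser by always
emitting AMPERSAND and letting a parser rule decide between concat and a
hex literal. The sketch only shows the lexer-state variant.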

Martin


