[antlr-interest] token precedence (and an ANTLRworks question)
Davyd Madeley
davyd at fugro-fsi.com.au
Mon Nov 17 00:54:11 PST 2008
I am having what looks like a problem with rule precedence.
I have lexer rules that look like this:
IDENTIFIER
    : CHAR (CHAR | DIGIT)*
    ;

TOKEN
    : ~(NEWLINE|','|'>')+
    ;

NEWLINE
    : '\n'      // Line feed
    | '\r'      // Carriage return
    | '\u2028'  // Line separator
    | '\u2029'  // Paragraph separator
    ;

fragment
CHAR
    : 'A' .. 'Z'
    | 'a' .. 'z'
    ;

fragment
DIGIT
    : '0' .. '9'
    ;
I'm trying to use these rules to parse the following text ('**' and '/'
appear in the parser rules):
LINE,1500,4,60,60
**INPUT/NOSICHECK
Into a token stream:
|LINE|,|1500|,|4|,|60|,|60|
|**|INPUT|/|NOSICHECK|
But instead what I'm ending up with is:
|LINE|,|1500|,|4|,|60|,|60|
|**INPUT/NOSICHECK|
This suggests to me that I'm wrong to assume that the rule listed first
(IDENTIFIER) takes precedence over TOKEN. I can't find much discussion of
lexer rule precedence in the ANTLR book.
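One way to picture what might be happening is a longest-match lexer: the rule that consumes the most input wins, and rule order only breaks ties. The following is a minimal Python sketch under that assumption. It is not ANTLR's actual prediction algorithm, and the regexes, the `tokenize` helper, and the extra COMMA rule are hypothetical stand-ins for the rules above (in my grammar ',' comes from the parser rules).

```python
import re

# Hypothetical regex stand-ins for the lexer rules in this post.
# COMMA is added only so the sample input can be split on ','.
RULES = [
    ("IDENTIFIER", re.compile(r"[A-Za-z][A-Za-z0-9]*")),
    ("TOKEN", re.compile(r"[^\n\r\u2028\u2029,>]+")),
    ("NEWLINE", re.compile(r"[\n\r\u2028\u2029]")),
    ("COMMA", re.compile(r",")),
]


def tokenize(text):
    """Assumed model: longest match wins; on a tie, the earlier rule wins."""
    pos, out = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            # Strictly longer only, so an earlier rule keeps a tie.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        out.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return out


if __name__ == "__main__":
    for name, value in tokenize("LINE,1500,4,60,60\n**INPUT/NOSICHECK"):
        print(name, repr(value))
```

In this model, LINE comes out as an IDENTIFIER because both rules match the same four characters and the earlier rule wins the tie, while `**INPUT/NOSICHECK` comes out as a single TOKEN because nothing in it is excluded by the TOKEN rule. That matches the second token stream above, which is why I suspect rule order alone does not decide the result.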
Also, the ANTLRworks debugger can show you the token stream with little
red boxes around each token, but I can't work out how to find the token
type of a given token. Is there something I'm missing here?
Thanks in advance,
--davyd
--
Davyd Madeley Software Engineer
Fugro Seismic Imaging, Perth Australia