[antlr-interest] lexer precedence
David Weiser
davidann at gmail.com
Thu Dec 9 11:41:54 PST 2010
Howdy,
I have a lexer which has the following rules (I'm modding the XML
Parser from http://www.antlr.org/wiki/display/ANTLR3/1.+Lexer ):
WS : {tagMode}?=> (' '|'\r'|'\t'|'\u000C'|'\n') {$channel=HIDDEN;}
;
PCDATA : { !tagMode }?=> (~'<')+
;
The problem I have is that the lexer ends up tokenizing sequences like
"\n\n\n\n" as PCDATA instead of WS.
It's apparent that there is a nondeterminism between WS and PCDATA,
since '\n' matches both the '\n' alternative in WS and ~'<' in PCDATA.
How can I get around this?
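
To make the overlap concrete, here is a rough Python model of the two
rules. It is only a sketch, not ANTLR itself: the names tokenize and
tag_mode are hypothetical, and it assumes tag_mode stays fixed for the
whole input, whereas the real grammar toggles tagMode as tags open and
close. It mimics the gated predicates by enabling exactly one of the two
rules at a time.

```python
import re

# Character sets mirroring the two lexer rules quoted above.
WS_CHARS = " \r\t\u000c\n"          # (' '|'\r'|'\t'|'\u000C'|'\n')
PCDATA_RE = re.compile(r"[^<]+")    # (~'<')+


def tokenize(text, tag_mode):
    """Return (token_type, lexeme) pairs for a fixed tag_mode.

    Simplification (assumption): tag_mode never changes while scanning.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        if tag_mode:
            # Only WS is enabled: it matches exactly one whitespace character.
            if text[pos] not in WS_CHARS:
                raise ValueError(f"no viable rule at offset {pos}")
            tokens.append(("WS", text[pos]))
            pos += 1
        else:
            # Only PCDATA is enabled: it greedily takes everything up to '<',
            # including any newlines.
            match = PCDATA_RE.match(text, pos)
            if match is None:
                raise ValueError(f"no viable rule at offset {pos}")
            tokens.append(("PCDATA", match.group()))
            pos = match.end()
    return tokens
```

In this model, a run of newlines scanned with tag_mode false comes back
as a single PCDATA token, which matches what I'm seeing; the same input
with tag_mode true comes back as one WS token per character.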
--
Thanks,
dw