[antlr-interest] TokenStreamRewriteEngine question
Scott Amort
jsamort at rogers.com
Sun Mar 12 08:51:26 PST 2006
Hi All,
I am using a TokenStreamRewriteEngine to discard unwanted whitespace and
comments, while still retaining the original file contents for debug and
error messages. However, I have noticed that my lexer also 'prediscards'
a few other characters, such as double-quotes and backslashes. These
characters are necessary to define certain tokens, but I don't want them
passed on to the parser, so I have lexer rules like:
    TAG
        : '\\'! IDENT
        ;
where IDENT is an alphanumeric identifier. What I have noticed,
however, is that the backslash character never makes it to the rewrite
engine, and so it is missing from the output of toOriginalString().
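The effect can be reproduced without ANTLR. What follows is only a
minimal sketch in plain Java, assuming that the rewrite engine rebuilds
the original text from the text of the tokens it has buffered, so that
any character dropped with '!' is already gone by then. DiscardSketch,
Tok, lex and rebuild are hypothetical names used for illustration; none
of them is part of the ANTLR API.

```java
import java.util.ArrayList;
import java.util.List;

class DiscardSketch {
    // Hypothetical stand-in for a lexer token: only its text survives lexing.
    static final class Tok {
        final String type;
        final String text;
        Tok(String type, String text) { this.type = type; this.text = text; }
    }

    // Toy lexer. With dropBackslash == true it mimics a rule such as
    // TAG : '\\'! IDENT ; by leaving the backslash out of the token text.
    static List<Tok> lex(String input, boolean dropBackslash) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            int start = i;
            char c = input.charAt(i);
            if (c == '\\') {
                i++; // consume the backslash, then the identifier after it
                while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) i++;
                int textStart = dropBackslash ? start + 1 : start;
                out.add(new Tok("TAG", input.substring(textStart, i)));
            } else if (Character.isLetterOrDigit(c)) {
                while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) i++;
                out.add(new Tok("IDENT", input.substring(start, i)));
            } else {
                i++; // whitespace and anything else: one token per character
                out.add(new Tok("OTHER", input.substring(start, i)));
            }
        }
        return out;
    }

    // Assumed behaviour of the rewrite engine's "original text" output:
    // concatenate the text of every buffered token, in order.
    static String rebuild(List<Tok> tokens) {
        StringBuilder sb = new StringBuilder();
        for (Tok t : tokens) sb.append(t.text);
        return sb.toString();
    }

    public static void main(String[] args) {
        String src = "\\tag value";
        System.out.println(rebuild(lex(src, true)));  // prints: tag value
        System.out.println(rebuild(lex(src, false))); // prints: \tag value
    }
}
```

In this sketch the backslash comes back only when the token text still
contains it, which matches what I am seeing with the real lexer.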
A possible solution is to have my lexer do less 'parsing' and
recognize only more basic token types, but once I do that I get a wide
variety of non-determinism errors. There are actually only three
characters that I discard in the lexer: the equals sign, the
double-quote and the backslash. Is there an easier way to have these
included in the rewrite engine's output, or do I have to redesign my
lexer? Thanks!
Best,
Scott