[antlr-interest] HowTo manipulate returned Token value?
micheal_jor
open.zone at virgin.net
Fri May 24 14:08:31 PDT 2002
Hi All,
I suspect it should be possible to manipulate the string value
associated with a Token before the lexer returns it, and perhaps to
insert additional tokens too. ;-)
I am trying to deal with the C preprocessor and I want my
CLangPreprocessorLexer to return tokens for preprocessor
directives.
Given the following definition,
PRE_DEFINE
: (PRE_WS)* '#' (PRE_WS)* "define" (PRE_WS)+ PRE_IDENT (PRE_WS)+
(PRE_DEFINE_PARAMS)? (PRE_WS)+ PRE_DEFINE_TOKENSTRING NEWLINE
;
ANTLR returns the whole line - including the NEWLINE char - as the
value associated with token PRE_DEFINE. Can I manipulate the textual
value associated with the tokens in the Lexer before they are
returned?
For example, so that I can return:
PRE_DEFINE<""> then
PRE_DEFINE_IDENT<ident-val> then
PRE_DEFINE_PARAMS<param-string> then
PRE_DEFINE_TOKENSTRING<token-string> then
PRE_NEWLINE
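To make the desired output concrete, here is a minimal sketch in plain
Java (not ANTLR-generated code) that splits one #define line into the
token sequence listed above. The names DefineSplitter, Tok and split are
hypothetical and exist only for this illustration. The sketch assumes
that a parameter list, if present, follows the macro name directly as in
standard C, and it does not handle backslash line continuations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ANTLR: splits one #define line into
// the sub-tokens PRE_DEFINE, PRE_DEFINE_IDENT, PRE_DEFINE_PARAMS,
// PRE_DEFINE_TOKENSTRING and PRE_NEWLINE.
class DefineSplitter {
    // Minimal token: a type name plus the text carried with it.
    static final class Tok {
        final String type;
        final String text;
        Tok(String type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + "<" + text + ">"; }
    }

    // Groups: 1 = macro name, 2 = optional parameter list,
    // 3 = replacement text, 4 = optional trailing newline.
    private static final Pattern DEFINE = Pattern.compile(
        "[ \\t]*#[ \\t]*define[ \\t]+([A-Za-z_][A-Za-z0-9_]*)"
        + "(\\([^)]*\\))?[ \\t]*(.*?)[ \\t]*(\\r?\\n)?");

    static List<Tok> split(String line) {
        Matcher m = DEFINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a #define line: " + line);
        }
        List<Tok> out = new ArrayList<>();
        out.add(new Tok("PRE_DEFINE", ""));               // marker token, empty text
        out.add(new Tok("PRE_DEFINE_IDENT", m.group(1)));
        if (m.group(2) != null) {                          // only for function-like macros
            out.add(new Tok("PRE_DEFINE_PARAMS", m.group(2)));
        }
        out.add(new Tok("PRE_DEFINE_TOKENSTRING", m.group(3)));
        if (m.group(4) != null) {
            out.add(new Tok("PRE_NEWLINE", m.group(4)));
        }
        return out;
    }
}
```

For the input "#define MAX(a,b) ((a)>(b)?(a):(b))" followed by a newline,
split returns five tokens, the second of which prints as
PRE_DEFINE_IDENT<MAX>.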
ADDITIONALLY...
I am working on two Parsers that would share the Lexer -- one that
cares about preprocessor stuff and one that doesn't. I can't just
ignore all PRE_xxxx tokens in the second Parser, as it would then see
code that the PRE_xxxx tokens would have flagged as conditionally
excluded.
Can multiple Lexers be arranged as a chain of stream "filters"? I might
be able to code a CLangPreprocessorStripperLexer that feeds on the
output of the first. Or do I have no choice but to develop two versions
of the Lexer, or to have both my Parsers be aware of PRE_xxxx tokens?
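The filter arrangement asked about here could look roughly like the
following sketch, again in plain Java rather than ANTLR code. Everything
in it is an assumption made for illustration: TokenSource is a local
stand-in for a pull-style token stream, Stripper plays the role of the
CLangPreprocessorStripperLexer, and the token types PRE_IFDEF and
PRE_ENDIF are hypothetical names that carry the tested macro name as
their text. Only #define and #ifdef/#endif are modelled; #else, #if
expressions and #undef are left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

class FilterChainSketch {
    // Minimal token: a type name plus the text carried with it.
    static final class Tok {
        final String type;
        final String text;
        Tok(String type, String text) { this.type = type; this.text = text; }
    }

    // Pull-style stream: one token per call, null at end of input.
    interface TokenSource { Tok next(); }

    // Stands in for the real preprocessor-aware lexer in this sketch.
    static TokenSource fromList(List<Tok> toks) {
        Iterator<Tok> it = toks.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    // Filter stage: consumes PRE_xxxx tokens from upstream, tracks
    // conditional state, and passes on only ordinary tokens that lie
    // in active regions.
    static final class Stripper implements TokenSource {
        private final TokenSource upstream;
        private final Set<String> defined = new HashSet<>();
        private final Deque<Boolean> active = new ArrayDeque<>();

        Stripper(TokenSource upstream) { this.upstream = upstream; }

        // True when no enclosing #ifdef evaluated to false.
        private boolean emitting() { return !active.contains(Boolean.FALSE); }

        @Override public Tok next() {
            for (Tok t = upstream.next(); t != null; t = upstream.next()) {
                switch (t.type) {
                    case "PRE_DEFINE_IDENT":      // remember macro names from active code
                        if (emitting()) defined.add(t.text);
                        break;
                    case "PRE_IFDEF":             // hypothetical type: text is the macro name
                        active.push(defined.contains(t.text));
                        break;
                    case "PRE_ENDIF":             // hypothetical type
                        if (!active.isEmpty()) active.pop();
                        break;
                    default:
                        // Drop every other PRE_xxxx token and anything excluded.
                        if (!t.type.startsWith("PRE_") && emitting()) return t;
                }
            }
            return null;
        }
    }
}
```

With this arrangement the first Parser would read the lexer directly,
while the second would read new Stripper(lexer), so that only one Lexer
has to be maintained.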
Micheal