[antlr-interest] HowTo manipulate returned Token value?

Fri May 24 14:31:17 PDT 2002

ANTLR doesn't have a notion of returning multiple tokens from one call to
nextToken().  I've gotten by before by creating my own token buffer and
stuffing the extra tokens in there via actions.  Then I subclass my
generated lexer and override nextToken() to check the buffer first; only if
the buffer is empty does it call super.nextToken() to prime it.
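
A minimal sketch of that buffer idea, in plain Java with no ANTLR dependency.
Everything here is a hypothetical stand-in: Tok plays the role of a token,
DefineLexer plays the role of the generated lexer, and onDefine() plays the
role of the lexer action that stuffs extra tokens into the buffer.  It only
handles a simple "#define NAME body" line (no parameter list), so treat it as
an illustration of the control flow rather than a working preprocessor lexer.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a token: a type name plus its text.
final class Tok {
    final String type;
    final String text;
    Tok(String type, String text) { this.type = type; this.text = text; }
    @Override public String toString() { return type + "<" + text + ">"; }
}

// Hypothetical stand-in for the generated lexer: it turns each input line
// into a single PRE_DEFINE token carrying the whole line as its text.
class DefineLexer {
    private final String[] lines;
    private int pos = 0;
    DefineLexer(String... lines) { this.lines = lines; }

    // Returns one token per call, or null at end of input.
    Tok nextToken() {
        if (pos >= lines.length) return null;
        Tok whole = new Tok("PRE_DEFINE", lines[pos++]);
        onDefine(whole);   // plays the role of a lexer action
        return whole;
    }

    // Action hook; does nothing in the base lexer.
    protected void onDefine(Tok whole) { }
}

// The subclass: the action fills a buffer, and nextToken() drains the
// buffer before asking the underlying lexer for more input.
class BufferedDefineLexer extends DefineLexer {
    private final Deque<Tok> pending = new ArrayDeque<>();

    BufferedDefineLexer(String... lines) { super(lines); }

    @Override protected void onDefine(Tok whole) {
        // Split "#define NAME body" into at most three pieces.
        String[] parts = whole.text.trim().split("\\s+", 3);
        pending.add(new Tok("PRE_DEFINE", ""));
        if (parts.length > 1) pending.add(new Tok("PRE_DEFINE_IDENT", parts[1]));
        if (parts.length > 2) pending.add(new Tok("PRE_DEFINE_TOKENSTRING", parts[2]));
        pending.add(new Tok("PRE_NEWLINE", "\n"));
    }

    @Override Tok nextToken() {
        if (pending.isEmpty()) {
            // Prime the buffer; the action above fills it as a side effect.
            if (super.nextToken() == null) return null;
        }
        return pending.poll();
    }
}
```

Each call to nextToken() on a BufferedDefineLexer hands back one of the
smaller tokens, and the underlying lexer is only consulted once the buffer
has run dry.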

It's pretty easy to define TokenFilters; that would be the way to go, so you
can reuse the same lexer for both parsers.
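
A rough sketch of the filter arrangement, again in plain Java with
hypothetical stand-in types (SimpleToken, SimpleTokenStream, ListTokenStream,
PreprocessorStripper) rather than ANTLR's own classes.  The filter wraps any
token stream and is itself a token stream, so the parser that doesn't care
about preprocessor tokens reads from the filter while the other parser reads
from the lexer directly.  The assumption that every preprocessor token type
starts with "PRE_" comes from the naming in the question.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a token: a type name plus its text.
final class SimpleToken {
    final String type;
    final String text;
    SimpleToken(String type, String text) { this.type = type; this.text = text; }
}

// Hypothetical stand-in for a token stream: null signals end of input.
interface SimpleTokenStream {
    SimpleToken nextToken();
}

// Stands in for the shared lexer; it just replays a fixed list of tokens.
class ListTokenStream implements SimpleTokenStream {
    private final Iterator<SimpleToken> it;
    ListTokenStream(List<SimpleToken> tokens) { this.it = tokens.iterator(); }
    public SimpleToken nextToken() { return it.hasNext() ? it.next() : null; }
}

// The filter: pulls from the wrapped stream and skips preprocessor tokens.
class PreprocessorStripper implements SimpleTokenStream {
    private final SimpleTokenStream input;
    PreprocessorStripper(SimpleTokenStream input) { this.input = input; }

    public SimpleToken nextToken() {
        SimpleToken t = input.nextToken();
        // Keep reading until a non-preprocessor token or end of input.
        while (t != null && t.type.startsWith("PRE_")) {
            t = input.nextToken();
        }
        return t;
    }
}
```

This version only drops the PRE_ tokens.  Honoring conditional exclusion
would mean the filter also tracks the conditional directives it sees and
skips ordinary tokens while inside an excluded region; that logic would live
in the same nextToken() loop.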

Monty

> -----Original Message-----
> From: micheal_jor [mailto:open.zone at virgin.net]
> Sent: Friday, May 24, 2002 2:09 PM
> To: antlr-interest at yahoogroups.com
> Subject: [antlr-interest] HowTo manipulate returned Token value?
> 
> 
> Hi All,
> 
> I suspect it should be possible to be able to manipulate the string 
> value associated with a Token before it is returned from a lexer and 
> perhaps insert additional tokens too.  ;-)
> 
> I am trying to deal with the C preprocessor and I wanted my 
> CLangPreprocessorLexer to be able to return tokens for preprocessor 
> directives. 
> 
> Given the following definition,
> 
> PRE_DEFINE
>    : (PRE_WS)* '#' (PRE_WS)* "define" (PRE_WS)+ PRE_IDENT (PRE_WS)+
>      (PRE_DEFINE_PARAMS)? (PRE_WS)+ PRE_DEFINE_TOKENSTRING NEWLINE
>    ;
> 
> ANTLR returns the whole line - including the NEWLINE char - as the 
> value associated with token PRE_DEFINE. Can I manipulate the textual 
> value associated with the tokens in the Lexer before they are 
> returned?
> 
> Perhaps so I can return:
>    PRE_DEFINE<"">                            then
>    PRE_DEFINE_IDENT<ident-val>               then
>    PRE_DEFINE_PARAMS<param-string>           then
>    PRE_DEFINE_TOKENSTRING<token-string>      then
>    PRE_NEWLINE
> 
> ADDITIONALLY...
> 
> I am working on two Parsers that would share the Lexer -- one that 
> cares about preprocessor stuff and one that doesn't. I can't just 
> ignore all PRE_xxxx tags in the second Parser as it might result in 
> the Parser seeing code that the PRE_xxxx tokens would have flagged as 
> conditionally excluded.
> 
> Can multiple Lexers be arranged as streams of "filters"? I might be 
> able to code a CLangPreprocessorStripperLexer that feeds on the first.
> 
> Or do I have no choice but to develop two versions of the Lexers or, 
> have both my Parsers be aware of PRE_xxxx tokens?
> 
> Micheal
> 
> 
> 
>  
> 
> Your use of Yahoo! Groups is subject to 
> http://docs.yahoo.com/info/terms/ 
> 
> 
> 