[antlr-interest] Token stream filter

Thu Jun 3 01:58:27 PDT 2004

> If you write the code in nextToken to do that it will. The ! operator
> controls tree building; it's not the lexer's ! operator. At least I was
> under the impression you wanted to use a parser to do the filtering,
> not a lexer in front of your original lexer.

Bugger :-(

Yes, I did want to use a parser between my original lexer and parser. Or
can I put a lexer there instead? Basically, I don't care whether it's a
lexer or parser, I just want to sit it between my primary lexer and
parser to strip out stuff I don't want and/or modify stuff I do.

Can I lex a token stream as well as a character stream? And if so, will
the second lexer see hidden tokens? (I presume not.)

The trouble is (hint to Ter for the manual :-) that there's a chapter on
lexing, and a chapter on treeparsing, but nothing on parsing. And the
stuff on token streams implies substituting different lexers for
different things. I want to process the data in multiple passes, not
change to a different lexer.

Cheers,
Wol

-----Original Message-----
From: Ric Klaren [mailto:klaren at cs.utwente.nl] 
Sent: 03 June 2004 09:44
To: antlr-interest at yahoogroups.com
Subject: Re: [antlr-interest] Token stream filter

On Thu, Jun 03, 2004 at 09:24:07AM +0100, Anthony Youngman wrote:
> Thanks. Actually, Monty's solution should work ...

It looks a lot simpler ;)

> but seeing as you seem to know these things, taking this line from my
> original post

I'm only theorizing ;)

> 	(id:IDENT {if text != "REM" throw tokenmatchexception}|"*"|"!")
>
> which is the exception I need to throw here?

If you're trying to make the rule work inside a ( )=>( ) construct then
it should be something like RecognitionException (or a class derived
from it).

> So - I can feed the lexer output into my deremer parser - and I can
> then feed the output from that into my main parser?

If you follow Monty's framework you should be ok I guess.

> And if I have a rule like
>
> commentst : (EOL | SEMI) ("*" | "!")! (~(EOL)*)! ;
>
> it will then eat everything between the initial eol/semi and final
> eol, but it will let those two tokens through to the next parser?

If you write the code in nextToken to do that it will. The ! operator
controls tree building; it's not the lexer's ! operator. At least I was
under the impression you wanted to use a parser to do the filtering,
not a lexer in front of your original lexer.
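
For illustration only, here is a rough sketch of what "writing the code
in nextToken" could look like for a filter sitting between the lexer and
the main parser. It does not use the ANTLR classes: Token, TokenStream,
Type, ListStream and CommentFilter are all stand-in names made up for
this example (TokenStream is only modelled on the idea of a stream with
a nextToken method), and treating an IDENT with text "REM" as a comment
start is an assumption taken from the rule quoted earlier. The filter
passes the leading EOL/SEMI and the closing EOL through and drops
everything in between.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class CommentFilterSketch {
    // Hypothetical token kinds; a real grammar would define its own.
    enum Type { EOL, SEMI, STAR, BANG, IDENT, EOF }

    // Stand-in for a lexer token (not the ANTLR Token class).
    static final class Token {
        final Type type;
        final String text;
        Token(Type type, String text) { this.type = type; this.text = text; }
    }

    // Stand-in for a token stream: anything that can hand out tokens.
    interface TokenStream { Token nextToken(); }

    // Plays the role of the primary lexer by replaying a fixed list.
    static final class ListStream implements TokenStream {
        private final Iterator<Token> it;
        ListStream(List<Token> tokens) { this.it = tokens.iterator(); }
        public Token nextToken() {
            return it.hasNext() ? it.next() : new Token(Type.EOF, "");
        }
    }

    // The filter: sits between the lexer and the main parser.
    static final class CommentFilter implements TokenStream {
        private final TokenStream input;
        private Token pending; // one token of lookahead, already read

        CommentFilter(TokenStream input) { this.input = input; }

        private static boolean startsComment(Token t) {
            return t.type == Type.STAR || t.type == Type.BANG
                || (t.type == Type.IDENT && "REM".equals(t.text));
        }

        public Token nextToken() {
            Token t = (pending != null) ? pending : input.nextToken();
            pending = null;
            if (t.type == Type.EOL || t.type == Type.SEMI) {
                // Look at what follows the statement separator.
                Token next = input.nextToken();
                if (startsComment(next)) {
                    // Discard the comment body up to the closing EOL (or EOF).
                    do {
                        next = input.nextToken();
                    } while (next.type != Type.EOL && next.type != Type.EOF);
                }
                pending = next; // handed out (and re-checked) on the next call
            }
            return t;
        }
    }

    // Runs the filter over a small hand-built token list and collects the text.
    static List<String> demo() {
        List<Token> in = Arrays.asList(
            new Token(Type.IDENT, "x"), new Token(Type.EOL, "<EOL>"),
            new Token(Type.STAR, "*"), new Token(Type.IDENT, "a"),
            new Token(Type.IDENT, "comment"), new Token(Type.EOL, "<EOL>"),
            new Token(Type.IDENT, "y"), new Token(Type.SEMI, ";"),
            new Token(Type.IDENT, "REM"), new Token(Type.IDENT, "note"),
            new Token(Type.EOL, "<EOL>"), new Token(Type.IDENT, "z"));
        TokenStream filtered = new CommentFilter(new ListStream(in));
        List<String> out = new ArrayList<>();
        for (Token t = filtered.nextToken(); t.type != Type.EOF; t = filtered.nextToken()) {
            out.add(t.text);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the filter is itself a token stream, the main parser could in
principle read from it exactly as it would read from the lexer, and
further filters could be stacked the same way for additional passes.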

Cheers,

Ric
