[antlr-interest] C++ TokenStreamSelector

John Reid j.reid at mail.cryst.bbk.ac.uk
Thu Feb 15 06:16:23 PST 2007


Ric Klaren wrote:
> Hi,
> 
> On 2/15/07, John Reid <j.reid at mail.cryst.bbk.ac.uk> wrote:
> 
>> I'm attaching my parser to a TokenStreamSelector. In some of my parser
>> rules I call push on the selector to switch between lexers.
> 
> 
> This does not work in the general case (due to lookahead,
> token buffering and the presence of syntactic predicates).
> 
>> What is the recommended way to flush this buffer and force re-lexing of
>> the input stream?
> 
> 
> There is no such mechanism. You might get something to work with very
> creative use of mark, rewind on the buffer and adding code to
> invalidate/reset the state of the lookahead. But this requires a
> *very* *very* good understanding of your parser and how it parses.
> E.g. you have to mark the input at the start of a rule if you suspect
> that a switch might be necessary and rewind and cleanup if it fails.
> Or unregister the mark if it was not needed (e.g. no switch needed)
> (in short: a maintenance nightmare)
The token stream must know what input has been consumed and what is 
pending. I can't see why it could not re-lex the pending input, but I 
have to admit I don't understand the ANTLR internals, so I'll take your 
word for it.
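
To make the buffering issue concrete, here is a toy model in plain C++. It is only a sketch: `Mode`, `Token`, `lexOne`, `LookaheadBuffer` and `run` are made-up names, not ANTLR classes, and it ignores syntactic predicates entirely. It assumes each token records the offset where it started, so that pending tokens can be thrown away and the input re-lexed from that offset.

```cpp
#include <cctype>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Toy model only: none of these types are ANTLR classes.
enum class Mode { DotIsDelimiter, DotIsText };

struct Token {
    std::string text;
    std::size_t start;  // offset of the token's first character in the input
};

// Lex one token starting at `pos`; whitespace separates tokens in both modes.
Token lexOne(const std::string& in, std::size_t& pos, Mode mode) {
    while (pos < in.size() && std::isspace(static_cast<unsigned char>(in[pos]))) ++pos;
    const std::size_t start = pos;
    if (pos >= in.size()) return {"<EOF>", start};
    if (in[pos] == '.' && mode == Mode::DotIsDelimiter) {
        ++pos;
        return {".", start};
    }
    std::string text;
    while (pos < in.size() && !std::isspace(static_cast<unsigned char>(in[pos])) &&
           !(in[pos] == '.' && mode == Mode::DotIsDelimiter)) {
        text += in[pos++];
    }
    return {text, start};
}

class LookaheadBuffer {
public:
    LookaheadBuffer(std::string input, Mode mode) : in_(std::move(input)), mode_(mode) {}

    // Peek at the i-th pending token (1-based), lexing ahead as needed.
    const Token& LA(std::size_t i) {
        while (pending_.size() < i) pending_.push_back(lexOne(in_, pos_, mode_));
        return pending_[i - 1];
    }

    // Remove and return the next token.
    Token consume() {
        Token t = LA(1);
        pending_.pop_front();
        return t;
    }

    // Naive switch: only affects tokens that have not been lexed yet.
    void switchMode(Mode m) { mode_ = m; }

    // Switch and re-lex: rewind to the first pending token and drop the buffer.
    void switchModeAndRelex(Mode m) {
        if (!pending_.empty()) pos_ = pending_.front().start;
        pending_.clear();
        mode_ = m;
    }

private:
    std::string in_;
    std::size_t pos_ = 0;
    Mode mode_;
    std::deque<Token> pending_;
};

// Consume "a" after peeking two tokens ahead, switch modes, then drain the rest.
std::vector<std::string> run(bool relex) {
    LookaheadBuffer buf("a.b c.d", Mode::DotIsDelimiter);
    buf.LA(2);  // a k=2 parser decision buffers "a" and "."
    std::vector<std::string> out{buf.consume().text};
    if (relex) buf.switchModeAndRelex(Mode::DotIsText);
    else buf.switchMode(Mode::DotIsText);
    for (Token t = buf.consume(); t.text != "<EOF>"; t = buf.consume()) out.push_back(t.text);
    return out;
}
```

In this toy, `run(false)` yields `a`, `.`, `b`, `c.d`: the `.` was already buffered under the old mode, so the switch takes effect one token late. `run(true)` yields `a`, `.b`, `c.d`, because the pending token is discarded and its text is lexed again under the new mode. Whether the real token buffer and lexers can be rewound this way is exactly the part I don't know.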

> 
> I would not go that way unless I *really* had no other option. E.g.
> more passes, use ASTs, or maybe token stream rewriting. It depends
> on what you want to accomplish.
> 
My parsing problem is that fields in my text file are sometimes 
delimited by '.', ':', ';', and various other tokens, but in many cases 
these same characters are part of the field values rather than 
delimiters. I can only know which is which at parse time, so switching 
lexers from the parser seemed like the natural solution. 
Obviously I just misinterpreted the documentation!

Does anyone have any advice on how to approach this problem? None of 
the examples in the ANTLR documentation deal with this sort of grammar.
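
One direction that sidesteps lexer switching, sketched here in plain C++ under heavy assumptions: a single context-free lexer always emits each punctuation character as its own token, and the parser decides from context whether a given ':' or '.' is a delimiter or part of a value. The record format (`name: value; name: value`), the `Tok` type and the `lex` and `parseRecord` functions are all hypothetical and are not taken from the actual file format or from ANTLR. Values are recovered by slicing the original text between token offsets, so their internal punctuation and spacing survive.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

struct Tok {
    enum Kind { Word, Punct, End } kind;
    std::string text;
    std::size_t begin;  // offset of the first character
    std::size_t end;    // one past the last character
};

// Context-free lexer: every punctuation character is always its own token.
std::vector<Tok> lex(const std::string& in) {
    std::vector<Tok> toks;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        if (std::isspace(c)) { ++i; continue; }
        const std::size_t b = i;
        if (std::isalnum(c)) {
            while (i < in.size() && std::isalnum(static_cast<unsigned char>(in[i]))) ++i;
            toks.push_back({Tok::Word, in.substr(b, i - b), b, i});
        } else {
            ++i;
            toks.push_back({Tok::Punct, in.substr(b, 1), b, i});
        }
    }
    toks.push_back({Tok::End, "", in.size(), in.size()});
    return toks;
}

// Hypothetical grammar:  record := field (';' field)*   field := Word ':' value
// Inside `value`, ':' and '.' are ordinary text; only ';' ends the value.
std::map<std::string, std::string> parseRecord(const std::string& in) {
    const std::vector<Tok> toks = lex(in);
    std::map<std::string, std::string> fields;
    std::size_t p = 0;
    while (toks[p].kind != Tok::End) {
        if (toks[p].kind != Tok::Word) throw std::runtime_error("expected field name");
        const std::string name = toks[p++].text;
        if (toks[p].text != ":") throw std::runtime_error("expected ':' after name");
        ++p;
        // Collect value tokens, then slice the original text between their offsets.
        const std::size_t first = p;
        while (toks[p].kind != Tok::End && toks[p].text != ";") ++p;
        if (p == first) throw std::runtime_error("empty value");
        fields[name] = in.substr(toks[first].begin, toks[p - 1].end - toks[first].begin);
        if (toks[p].text == ";") ++p;
    }
    return fields;
}
```

For example, `parseRecord("time: 12:30; host: a.b.c")` maps `time` to `12:30` and `host` to `a.b.c`: the first ':' after a name is treated as a delimiter, and later ones are kept as value text. The same idea could presumably be expressed as grammar rules that match a value as a sequence of word and punctuation tokens, at the cost of a noisier grammar.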

Thanks,
John.


