[antlr-interest] Re: lexer "modes" for XML parsing etc...

Martin Probst mail at martin-probst.com
Sun Nov 20 05:25:17 PST 2005


Hi,

> Rule 'a' might be able to tell the lexer to get one of A,B,C rather  
> than any token.  I just don't know how lookahead would work in this  
> environment.

I can see two cases:
     1. Lexing is controlled only by the structure of the text stream,
        i.e. the parser does not call or set anything in the lexer.
        This is a classical stateful lexer.
     2. Lexing is controlled by the parser. In this case the parser
        tells the lexer that the next token must come from a specific
        set. I've done that, and it leads to big problems with
        lookahead: tokens that were already lexed as lookahead may have
        been lexed under the wrong rules. This can probably only be
        fixed by re-lexing the tokens each time, either always or only
        when the parser knows it is switching to different lexing
        rules.

The second case will either cost you a lot of performance or be very
complicated to implement in a general way with ANTLR, I guess.
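
To make the lookahead problem in the second case concrete, here is a
minimal sketch in Python rather than ANTLR. The class name, the two modes
and their token rules are all made up for illustration. The stream throws
away its lookahead buffer whenever the parser switches modes, so the
buffered text gets lexed again under the new rules.

```python
import re

# Hypothetical token sets: the same text lexes differently per mode.
MODES = {
    "tag": re.compile(r"(?P<NAME>[A-Za-z]+)|(?P<EQ>=)|(?P<WS>\s+)"),
    "text": re.compile(r"(?P<TEXT>[^<]+)"),
}


class RelexingTokenStream:
    """Token stream whose lookahead buffer is discarded on a mode change."""

    def __init__(self, source):
        self.source = source
        self.pos = 0        # offset of the next unconsumed character
        self.mode = "tag"
        self.buffer = []    # lookahead tokens: (type, text, end_offset)

    def _lex_one(self, start):
        match = MODES[self.mode].match(self.source, start)
        if match is None:
            # No rule matches: treated as end of input in this sketch.
            return ("EOF", "", start)
        return (match.lastgroup, match.group(), match.end())

    def la(self, k):
        """Return the k-th lookahead token (1-based), lexing on demand."""
        while len(self.buffer) < k:
            start = self.buffer[-1][2] if self.buffer else self.pos
            self.buffer.append(self._lex_one(start))
        return self.buffer[k - 1]

    def consume(self):
        token = self.la(1)
        self.buffer.pop(0)
        self.pos = token[2]
        return token

    def set_mode(self, mode):
        """Called by the parser; buffered tokens belong to the old mode."""
        if mode != self.mode:
            self.mode = mode
            self.buffer.clear()   # forces re-lexing from self.pos
```

With the input "a=b c", looking two tokens ahead in "tag" mode buffers
NAME and EQ. If the parser then consumes the NAME and switches to "text"
mode, the buffered EQ is stale, and the next token has to be re-lexed as
the single TEXT token "=b c". Clearing the buffer on every switch is the
simple variant; it is also where the performance hit comes from.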

I think the first case can easily be solved in ANTLR; see the other
discussion we had. Built-in support for this in ANTLR would be nice, as
it's really a mess to do manually once the state is more than a single
boolean flag.
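
As a rough illustration of the first case, here is a small stateful lexer
sketched in Python, not ANTLR. The mode names, token names and rule table
are assumptions chosen for an XML-like input. The lexer keeps a stack of
modes, and only its own rules push or pop; the parser never touches it.

```python
import re

# Hypothetical rules per mode: (token type, pattern, action).
# An action is None, "pop", or ("push", mode_name).
RULES = {
    "content": [
        ("OPEN", r"<", ("push", "tag")),
        ("TEXT", r"[^<]+", None),
    ],
    "tag": [
        ("CLOSE", r">", "pop"),
        ("NAME", r"[A-Za-z_][\w-]*", None),
        ("EQ", r"=", None),
        ("QUOTE", r'"', ("push", "attr")),
        ("WS", r"\s+", None),
    ],
    "attr": [
        ("QUOTE", r'"', "pop"),
        ("VALUE", r'[^"]+', None),
    ],
}
COMPILED = {
    mode: [(t, re.compile(p), a) for t, p, a in rules]
    for mode, rules in RULES.items()
}


def tokenize(source):
    """Yield (type, text) pairs; only lexer rules change the mode stack."""
    stack = ["content"]
    pos = 0
    while pos < len(source):
        # Try the rules of the current mode in order; first match wins.
        for token_type, pattern, action in COMPILED[stack[-1]]:
            match = pattern.match(source, pos)
            if match:
                break
        else:
            raise ValueError(f"no rule at offset {pos} in mode {stack[-1]}")
        pos = match.end()
        if action == "pop":
            stack.pop()
        elif action is not None:
            stack.append(action[1])
        if token_type != "WS":      # whitespace inside tags is skipped
            yield (token_type, match.group())
```

For the input `<a x="1 < 2">hi`, the `<` inside the attribute value comes
out as part of a VALUE token instead of opening a new tag, because the
lexer is in "attr" mode at that point. Since the mode depends only on text
already consumed, lookahead tokens never need to be re-lexed.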

Martin


