[antlr-interest] Re: lexer "modes" for XML parsing etc...
Martin Probst
mail at martin-probst.com
Sun Nov 20 05:25:17 PST 2005
Hi,
> Rule 'a' might be able to tell the lexer to get one of A,B,C rather
> than any token. I just don't know how lookahead would work in this
> environment.
I can see two cases:
1. Lexing is controlled only by the structure of the text stream,
i.e. the parser does not call or set anything in the lexer:
a classical stateful lexer.
2. Lexing is controlled by the parser. In this case the parser
tells the lexer that the next token must come from a specific set.
I've done that, and it leads to big problems with lookahead,
because tokens that were already lexed ahead were produced under
the old set. This can probably only be fixed by re-lexing the
tokens, either every time or only when the parser knows it's
running into different lexing rules.
The second case will either give you a big performance hit or be very
complicated to implement in a general way with ANTLR, I guess.
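
To make the lookahead problem in case 2 concrete, here is a minimal Python sketch, not ANTLR code. The class ParserDrivenLexer, its peek/consume methods and the two token types are hypothetical names made up for illustration. The "parser" passes the set of token types it will accept; a token that was peeked under one set is thrown away and re-lexed when it is requested under a different set.

```python
import re

# Hypothetical token rules, tried in this order. "if" can be lexed as
# KEYWORD or as NAME, depending on what the parser allows.
PATTERNS = [
    ("KEYWORD", re.compile(r"(?:if|for)\b")),
    ("NAME", re.compile(r"[A-Za-z]+")),
]


class ParserDrivenLexer:
    """Lexer whose caller names the acceptable token types (illustrative only)."""

    def __init__(self, text):
        self.text = text
        self.pos = 0          # position just after the last consumed token
        self.cached = None    # (allowed set, token type, value, end position)
        self.relex_count = 0  # how often a lookahead token had to be discarded

    def _lex(self, allowed):
        # Skip whitespace, then try only the rules the parser allows.
        start = self.pos
        while start < len(self.text) and self.text[start].isspace():
            start += 1
        for token_type, pattern in PATTERNS:
            if token_type in allowed:
                match = pattern.match(self.text, start)
                if match:
                    return token_type, match.group(), match.end()
        raise SyntaxError(f"no allowed token at position {start}")

    def peek(self, allowed):
        allowed = frozenset(allowed)
        # A cached lookahead token is only valid for the set it was lexed with.
        if self.cached is not None and self.cached[0] != allowed:
            self.relex_count += 1
            self.cached = None
        if self.cached is None:
            self.cached = (allowed, *self._lex(allowed))
        return self.cached[1], self.cached[2]

    def consume(self, allowed):
        token = self.peek(allowed)
        self.pos = self.cached[3]
        self.cached = None
        return token


lexer = ParserDrivenLexer("if x")
first_guess = lexer.peek({"KEYWORD", "NAME"})  # lookahead under a wide set
actual = lexer.consume({"NAME"})               # the rule only accepts NAME
```

Here the same two characters come back as a KEYWORD during lookahead and as a NAME once the parser narrows the set, at the cost of one re-lex. With deeper lookahead every buffered token would need the same check, which is where the performance hit or the complexity comes from.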
I think the first case can easily be solved in ANTLR, see the other
discussion we had. Support for this in ANTLR would be nice, as it's
really a mess to do manually once the lexer state is more than just
a single boolean flag.
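
For comparison, case 1 can be sketched as a lexer whose mode changes only in response to the text itself. Again this is plain Python rather than ANTLR, and the mode names, token names and the tokenize function are assumptions chosen for a small XML-like example. Each mode has its own rule table, and a rule may name the mode to switch to after it matches.

```python
import re

# Hypothetical rule tables: mode -> list of (token type, pattern, next mode).
# A next mode of None means "stay in the current mode".
RULES = {
    "CONTENT": [
        ("OPEN", re.compile(r"</?"), "TAG"),
        ("TEXT", re.compile(r"[^<]+"), None),
    ],
    "TAG": [
        ("CLOSE", re.compile(r"/?>"), "CONTENT"),
        ("NAME", re.compile(r"[A-Za-z_][\w.-]*"), None),
        ("EQ", re.compile(r"="), None),
        ("STRING", re.compile(r'"[^"]*"'), None),
        ("WS", re.compile(r"\s+"), None),
    ],
}


def tokenize(text):
    """Lex text with a mode that is switched only by the tokens seen so far."""
    mode = "CONTENT"
    pos = 0
    tokens = []
    while pos < len(text):
        for token_type, pattern, next_mode in RULES[mode]:
            match = pattern.match(text, pos)
            if match:
                if token_type != "WS":  # whitespace inside tags is dropped
                    tokens.append((token_type, match.group()))
                pos = match.end()
                if next_mode is not None:
                    mode = next_mode
                break
        else:
            raise SyntaxError(
                f"unexpected character {text[pos]!r} at {pos} in mode {mode}"
            )
    return tokens


tokens = tokenize('<a x="1">name = 2</a>')
```

Inside the tag, `x`, `=` and `"1"` become separate tokens, while the same kind of characters in the element content end up in a single TEXT token. The parser never talks to the lexer, so lookahead is unaffected.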
Martin