[antlr-interest] Lexer Predicates?
Johannes Luber
jaluber at gmx.de
Sun Aug 3 13:24:46 PDT 2008
Matt Palmer wrote:
> I am having similar issues - I'm having to encode the parser state into
> the lexer. This is because I have character sequences that are subsets
> of one another, etc. that should only match in certain places. The
> high level parser rules determine this very nicely - but I have to
> explicitly push down those rules into the lexer using predicates.
>
> I was wondering myself if it would be possible to automate this process.
>
> Matt.
There is an XQuery project whose lexer only creates tokens on demand.
I don't know the project page, but searching the archives should give
you clues.
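
For illustration, here is a minimal Java sketch of the "tokens on demand"
idea, written against no particular library. It is not the XQuery project's
lexer and it does not use the ANTLR runtime API; the class
OnDemandLexerSketch, the Mode enum, setMode() and the toy parse() loop are
all hypothetical names made up for this example. The point is only that
when the parser pulls one token at a time, it can switch the lexer's state
before the next token is scanned, so a config value can be read up to EOL
with its spaces intact while whitespace is skipped everywhere else.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a pull-style lexer; none of these names come from ANTLR. */
class OnDemandLexerSketch {

    /** Lexer states that the parser is allowed to select (assumed for this example). */
    enum Mode { NORMAL, VALUE }

    /** A minimal token: a type name plus the matched text. */
    static final class Token {
        final String type;
        final String text;
        Token(String type, String text) { this.type = type; this.text = text; }
        @Override public String toString() { return type + "(" + text + ")"; }
    }

    private final String input;
    private int pos = 0;
    private Mode mode = Mode.NORMAL;

    OnDemandLexerSketch(String input) { this.input = input; }

    /** Called by the parser before it asks for the next token. */
    void setMode(Mode mode) { this.mode = mode; }

    /** Scans exactly one token from the current position; nothing is buffered ahead. */
    Token nextToken() {
        if (mode == Mode.VALUE) {
            // VALUE mode: take everything up to (not including) the end of line,
            // spaces included; only leading/trailing whitespace is trimmed.
            int start = pos;
            while (pos < input.length() && input.charAt(pos) != '\n') pos++;
            return new Token("VALUE", input.substring(start, pos).trim());
        }
        // NORMAL mode: whitespace is skipped (a stand-in for a hidden channel).
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
        if (pos >= input.length()) return new Token("EOF", "");
        if (input.charAt(pos) == '=') { pos++; return new Token("EQ", "="); }
        int start = pos;
        while (pos < input.length()
                && !Character.isWhitespace(input.charAt(pos))
                && input.charAt(pos) != '=') pos++;
        return new Token("ID", input.substring(start, pos));
    }

    /** Toy parser for lines of the form: ID '=' VALUE. It drives the lexer mode itself. */
    static List<String> parse(String text) {
        OnDemandLexerSketch lexer = new OnDemandLexerSketch(text);
        List<String> out = new ArrayList<>();
        while (true) {
            Token key = lexer.nextToken();
            if (key.type.equals("EOF")) break;
            Token eq = lexer.nextToken();
            if (!key.type.equals("ID") || !eq.type.equals("EQ")) {
                throw new IllegalStateException("expected ID '=' but got " + key + " " + eq);
            }
            lexer.setMode(Mode.VALUE);   // parser context decides how the next token is scanned
            Token value = lexer.nextToken();
            lexer.setMode(Mode.NORMAL);
            out.add(key.text + " -> " + value.text);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parse("name = John Smith\npath=/usr/local bin\n"));
    }
}
```

With the input above, parse() keeps "John Smith" as a single value, whereas
the same text in NORMAL mode would come back as two ID tokens. This only
works because nothing is tokenized ahead of the parser; with any lookahead
buffering the mode switch would arrive too late.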
Johannes
>
> On Sun, Aug 3, 2008 at 7:34 PM, Foust <javafoust at gmail.com> wrote:
>
> > At 9:40pm, August 02, 2008 Gavin wrote:
> >
> > At 11:06 3/08/2008, Foust wrote:
> > >Do lexer predicates work in v3?
> >
> > That depends on what you mean. You can certainly use both
> > syntactic and semantic predicates within the lexer, but they can
> > only use lexer state.
>
> That would explain why setting a static flag in the Lexer from the Parser
> has no effect -- the Lexer has already run to completion before the parser
> receives the first token.
>
> >
> > Also, while I'm not entirely sure about this, I think predicates
> > in the lexer can only be used to decide between alts within a
> > single lexer rule. I vaguely recall some trouble when trying to
> > use them to decide between multiple lexer rules (at the top
> > level).
>
> I'll keep that in mind. I've had nothing but trouble trying to get the
> Lexer to return tokens based on context (as best determined by the Parser).
>
>
> > Generally speaking, you should keep your lexer fairly
> > straightforward and unambiguous, and defer semantic decisions (and
> > ambiguity resolution) until the parsing phase.
>
> Yes... it started out that way. But to allow spaces to be part of a config
> value (read up to EOL), the Lexer needs to honor state. (Place spaces in
> the HIDDEN channel for all other cases - outside of a special
> config/preprocessor rule).
>
> The parser shouldn't have to go through contortions because of lexer
> design. In fact, it seems as though the lexer itself is fine, if only it
> would get tokens as required, rather than all at once.
>
> Might I suggest the lexer be endowed with a state mechanism that can be
> controlled from the parser?
>
>
>