[antlr-interest] Lucene grammar
Gavin Lambert
antlr at mirality.co.nz
Sun Jan 18 23:08:24 PST 2009
At 09:59 19/01/2009, Dennis Brothers wrote:
>A specific problem I see is that Lucene involves queries of the
>form foo:bar (and there might be whitespace either side of the
>colon).
[...]
>I'd like to distinguish field names from words in the lexer,
>but I don't see a simple way to do it. Can I somehow use a
>syntactic predicate in the lexer? Or a semantic predicate
>that scans ahead for the colon?
Yes, you *could* do that, but it's not really a good idea.
Don't try to do too much work in the lexer -- just get it to
consolidate groups of letters/numbers/etc into generic IDs or
WORDs or whatever and then figure out what they mean in the
parser.
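To make that split concrete, here is a rough sketch of a lexer that only
groups characters and attaches no meaning to them. It is plain Python
rather than an ANTLR grammar, and the names (tokenize, WORD, COLON) are
made up for illustration; nothing here is ANTLR API.

```python
import re

# Hypothetical token pattern: runs of letters/digits become WORD,
# ':' becomes COLON, and whitespace is matched but thrown away.
TOKEN_RE = re.compile(r"(?P<WORD>[A-Za-z0-9]+)|(?P<COLON>:)|(?P<WS>\s+)")

def tokenize(text):
    """Split text into (kind, text) pairs without deciding what a WORD means."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":  # whitespace around the colon is simply skipped
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Because whitespace is dropped at this stage, "foo:bar" and "foo : bar"
produce the same token sequence, and the lexer never has to look ahead
for the colon.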
If you're generating an AST, you can change the type of the token
in the output AST once you know more about the context in which it
is used.
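The retyping idea can be sketched the same way. This is again plain
Python with made-up names (parse_clauses, FIELD), assuming the lexer
hands over (kind, text) pairs of kind WORD or COLON; in ANTLR the same
effect would come from the grammar's AST construction, not from code
like this. The parser sees a WORD followed by a COLON and only then
emits it as a FIELD node.

```python
def parse_clauses(tokens):
    """Parse clauses of the form WORD COLON WORD or a bare WORD.

    tokens is a list of (kind, text) pairs. A WORD that turns out to be
    followed by COLON is retyped to FIELD in the output nodes.
    """
    nodes = []
    i = 0
    while i < len(tokens):
        kind, text = tokens[i]
        if kind != "WORD":
            raise ValueError(f"expected WORD, got {kind}")
        followed_by_colon = i + 1 < len(tokens) and tokens[i + 1][0] == "COLON"
        if followed_by_colon:
            if i + 2 >= len(tokens) or tokens[i + 2][0] != "WORD":
                raise ValueError("expected a term after ':'")
            # Context is now known: this WORD was a field name.
            nodes.append(("FIELD", text, ("WORD", tokens[i + 2][1])))
            i += 3  # consume field name, colon, and term
        else:
            nodes.append(("WORD", text))
            i += 1
    return nodes
```

For example, the tokens for "foo:bar baz" would come out as one FIELD
node for foo (wrapping the term bar) followed by a plain WORD node for
baz.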