[antlr-interest] Lucene grammar
Dennis Brothers
brothers at bros.com
Sun Jan 18 12:59:31 PST 2009
I'm in the process of creating a parser for Lucene query syntax. I've
done a lot of searching, and can't find any useful "prior art". Is
anyone aware of a Lucene parser built with ANTLR?
A specific problem I see is that Lucene supports queries of the form
foo:bar (possibly with whitespace on either side of the colon).
This means "find a record whose foo field contains the word 'bar'".
To complicate things further, the field name and colon are optional -
there's a default field.
I'd like to distinguish field names from words in the lexer, but I
don't see a simple way to do it. Can I somehow use a syntactic
predicate in the lexer? Or a semantic predicate that scans ahead for
the colon? In either case, how do I deal with the optional whitespace
in the lexer? Would the traditional whitespace-skipping constructs
take effect before the predicate was tested?
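To make the scan-ahead idea concrete, here is a minimal hand-written
sketch in Python (not ANTLR) of the classification I have in mind. The
names tokenize, FIELD, COLON and WORD are made up for illustration, and
it assumes a word is just a run of letters, digits, or underscores,
which is much narrower than real Lucene terms.

```python
import re

# Hypothetical token kinds, for illustration only.
FIELD, COLON, WORD = "FIELD", "COLON", "WORD"

# Assumption: a word is a run of letters, digits, or underscores.
_WORD = re.compile(r"\w+")
# Lookahead pattern: optional whitespace followed by a colon.
_COLON_AHEAD = re.compile(r"\s*:")


def tokenize(text):
    """Return (kind, text) pairs; a word is a FIELD if a colon follows it."""
    tokens = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            pos += 1  # skip whitespace between tokens
        elif ch == ":":
            tokens.append((COLON, ch))
            pos += 1
        else:
            m = _WORD.match(text, pos)
            if m is None:
                raise ValueError(f"unexpected character {ch!r} at {pos}")
            # Peek past any whitespace for a colon, without consuming it.
            kind = FIELD if _COLON_AHEAD.match(text, m.end()) else WORD
            tokens.append((kind, m.group()))
            pos = m.end()
    return tokens
```

Here both "foo:bar" and "foo : bar" come out as FIELD, COLON, WORD,
while a bare "bar" is a single WORD. What I can't see is how to express
the same lookahead in an ANTLR lexer rule.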
Thanks for any insight -
- Dennis Brothers