[antlr-interest] Lucene grammar
Terence Parr
parrt at cs.usfca.edu
Sun Jan 18 13:01:42 PST 2009
Hi. You should be able to translate the JavaCC version of their query
language grammar almost directly into ANTLR's format. That said, the
previous query parser was terrible. You'll probably have to either pass
all words to the parser and let it figure out what to do, or assume
everything is just a word except for "ID:" type stuff, which the lexer
returns as a special token.
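
To make the "everything is a word except ID:" idea concrete, here is a
minimal sketch in plain Python rather than ANTLR. It only illustrates the
tokenizing rule and is not Lucene's or ANTLR's actual lexer. The names
Token, tokenize, FIELD and WORD are made up for the example, and it
assumes a field name is any run of non-space, non-colon characters
followed by optional whitespace and a colon.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # "FIELD" or "WORD" (hypothetical token names)
    text: str


# A word followed by optional whitespace and a colon is a field name.
# Any other run of non-space, non-colon characters is a plain word.
_TOKEN_RE = re.compile(
    r"""
    (?P<FIELD>[^\s:]+)\s*:\s*   # field name, then the colon and any spaces
  | (?P<WORD>[^\s:]+)           # ordinary search term
  | \s+                         # whitespace between terms is skipped
    """,
    re.VERBOSE,
)


def tokenize(query: str) -> Iterator[Token]:
    """Yield FIELD and WORD tokens; the colon itself is consumed."""
    pos = 0
    while pos < len(query):
        m = _TOKEN_RE.match(query, pos)
        if m is None:
            # For example, a colon with no field name in front of it.
            raise ValueError(f"unexpected character at {pos}: {query[pos]!r}")
        pos = m.end()
        if m.lastgroup is not None:  # None means only whitespace matched
            yield Token(m.lastgroup, m.group(m.lastgroup))


if __name__ == "__main__":
    for token in tokenize("foo : bar baz"):
        print(token.kind, token.text)
```

Because the colon and the whitespace around it are folded into the FIELD
pattern, the optional whitespace never has to be handled separately, and
the parser can treat a bare WORD as a term in the default field.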
Ter
On Jan 18, 2009, at 12:59 PM, Dennis Brothers wrote:
> I'm in the process of creating a parser for Lucene query syntax. I've
> done a lot of searching, and can't find any useful "prior art". Is
> anyone aware of a Lucene parser built with ANTLR?
>
> A specific problem I see is that Lucene involves queries of the form
> foo:bar (and there might be whitespace either side of the colon).
> This means "find a record whose foo field contains the word 'bar'".
> To complicate things further, the field name and colon are optional -
> there's a default field.
>
> I'd like to distinguish field names from words in the lexer, but I
> don't see a simple way to do it. Can I somehow use a syntactic
> predicate in the lexer? Or a semantic predicate that scans ahead for
> the colon? In either case, how do I deal with the optional whitespace
> in the lexer? Would the traditional whitespace-skipping constructs
> take effect before the predicate was tested?
>
> Thanks for any insight -
> - Dennis Brothers
>
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address