[antlr-interest] Natural language parsing

Andy Tripp antlr at jazillian.com
Mon Jan 7 15:01:53 PST 2008


Peter Bruhn Andersen wrote:
>
> I'll soon be starting a project that needs to do quite a bit of 
> natural language parsing. For that purpose I've tried to find examples 
> of how to use ANTLR but so far I've been out of luck. If any of you 
> know of such a project I would like to get a link to the 
> documentation. A paper with 'dos and don'ts' advice would be equally 
> welcome.
>
The NLP field has its own set of tools and a completely different 
approach to parsing than the programming-language-parsing field. Unless 
you have complete control of the input and you can make it a relatively 
trivial grammar, ANTLR and similar tools are the wrong tools to use. By 
"trivial" here, I mean a couple thousand lines. I think you'll never get 
ANTLR (or similar) to parse real-world natural language in any 
meaningful way - that is, create a real AST with NOUN and VERB and 
PREPOSITIONAL_CLAUSE and so on.

I once saw a poster for an NLP conference, and I noticed that among the 
images on the poster was a newspaper with the headline "Woods Eyes 
Masters". Try parsing that sentence without knowing the context - that 
it's a sports headline :) After chewing on that for a while, you'll see 
why the best NLP programs are really based on statistical analysis of 
word frequencies, rather than top-down "parsing".
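
To make the headline problem concrete, here is a rough Python sketch. The 
lexicon, the tag names and the one-rule "grammar" are all made up for 
illustration (they are not from ANTLR or any NLP toolkit). It only shows 
that once each word can take more than one part of speech, a purely 
syntactic pass still leaves several readings and has no way to pick one.

```python
from itertools import product

# Hypothetical toy lexicon: every tag each word could plausibly take.
LEXICON = {
    "woods": {"NOUN", "PROPER_NOUN"},              # a forest, or a surname
    "eyes": {"NOUN", "VERB"},                      # body part, or "looks at"
    "masters": {"NOUN", "VERB", "PROPER_NOUN"},    # experts, "learns fully", or an event name
}

NOMINAL = {"NOUN", "PROPER_NOUN"}


def tag_sequences(sentence):
    """Return every possible (word, tag) assignment for the sentence."""
    words = sentence.lower().split()
    options = [sorted(LEXICON[w]) for w in words]
    return [tuple(zip(words, tags)) for tags in product(*options)]


def is_simple_clause(tagged):
    """Accept only 'nominal VERB nominal', a deliberately tiny grammar."""
    tags = [tag for _, tag in tagged]
    return (
        len(tags) == 3
        and tags[0] in NOMINAL
        and tags[1] == "VERB"
        and tags[2] in NOMINAL
    )


candidates = tag_sequences("Woods Eyes Masters")
plausible = [c for c in candidates if is_simple_clause(c)]

print(len(candidates), "tag assignments,", len(plausible), "pass the grammar")
for reading in plausible:
    print(reading)
```

Every reading that survives is grammatical, so choosing between them takes 
outside knowledge (that this is a sports page), which is the kind of 
evidence statistical approaches try to capture.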

Andy

