[antlr-interest] Natural language parsing

Mon Jan 7 15:46:55 PST 2008

But an amazingly large subset can be had with a top-down parser.
Successful NLP work always tries to apply grammatical structure
(picking the best-fitting parse rather than a single deterministic
one, as top-down LL parsing does), not just word frequencies. Even
humans have to backtrack when reading sentences; ANTLR could handle
that part too. :)
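
To make the backtracking point concrete, here is a minimal sketch in plain Python (not ANTLR) of a top-down parser that tries every alternative for the "Woods Eyes Masters" headline quoted below. The lexicon, the tag names, and the two-rule grammar (S -> NP VP, VP -> VERB NP) are toy assumptions made up for illustration only; a real grammar would be far larger.

```python
# Hypothetical toy lexicon: each word may carry several part-of-speech tags.
LEXICON = {
    "woods": ["NOUN", "NAME"],
    "eyes": ["NOUN", "VERB"],
    "masters": ["NOUN", "VERB", "NAME"],
}


def parse_np(words, i):
    """NP -> NAME | NOUN. Yield (tree, next_index) for each tag that fits."""
    if i < len(words):
        for tag in LEXICON.get(words[i], []):
            if tag in ("NAME", "NOUN"):
                yield ("NP", tag, words[i]), i + 1


def parse_vp(words, i):
    """VP -> VERB NP. Yield (tree, next_index) for each possible object NP."""
    if i < len(words) and "VERB" in LEXICON.get(words[i], []):
        for np, j in parse_np(words, i + 1):
            yield ("VP", ("VERB", words[i]), np), j


def parse_sentence(text):
    """S -> NP VP. Return every parse that consumes the whole input."""
    words = text.lower().split()
    return [
        ("S", np, vp)
        for np, i in parse_np(words, 0)   # try each subject reading
        for vp, j in parse_vp(words, i)   # fall back to the next one on failure
        if j == len(words)
    ]


if __name__ == "__main__":
    for tree in parse_sentence("Woods Eyes Masters"):
        print(tree)
```

Even this tiny grammar accepts the headline in more than one way (the golfer or some forest, the tournament or some experts), so the parser can list the candidate structures, but choosing among them still needs context or statistics.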

Ter

On Jan 7, 2008, at 3:01 PM, Andy Tripp wrote:

> Peter Bruhn Andersen wrote:
>>
>> I’ll soon be starting a project that needs to do quite a bit of
>> natural language parsing. For that purpose I’ve tried to find  
>> examples of how to use ANTLR but so far I’ve been out of luck. If  
>> any of you know of such a project I would like to get a link to the  
>> documentation. A paper with ‘dos and don’ts’ advice would be
>> equally welcome.
>>
> The NLP field has its own set of tools and a completely different
> approach to parsing from the programming-language-parsing field.
> Unless you have complete control of the input and you can make it a  
> relatively trivial grammar, ANTLR and similar tools are the wrong  
> tools to use. By "trivial" here, I mean a couple thousand lines. I  
> think you'll never get ANTLR (or similar) to parse real-world  
> natural language in any meaningful way - that is, create a real AST  
> with NOUN and VERB and PREPOSITIONAL_CLAUSE and so on.
>
> I once saw a poster for an NLP conference, and I noticed that among
> the images on the poster was a newspaper with the headline "Woods  
> Eyes Masters". Try parsing that sentence without knowing the context  
> - that it's a sports headline :) After chewing on that for a while,  
> you'll see why the best NLP programs are really based on statistical  
> analysis of word frequencies, rather than top-down "parsing".
>
> Andy