[antlr-interest] Grammar natural language

Gerald Rosenberg gerald at certiv.net
Fri Oct 15 09:09:06 PDT 2010


  I agree with Steve that a small structured language is probably best.
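
A minimal sketch of what such a small structured language could look like, using one of Dagi's sample sentences from the quoted thread below. The fixed sentence pattern, the operator table, and the names `Constraint` and `parse_constraint` are all assumptions made up for illustration; a real version would more likely be an Antlr grammar than a regular expression.

```python
import re
from dataclasses import dataclass

# Hypothetical operator phrases for a controlled constraint language.
OPERATORS = {
    "is not greater than": "<=",
    "is not smaller than": ">=",
    "is greater than": ">",
    "is smaller than": "<",
    "is equal to": "==",
}


@dataclass(frozen=True)
class Constraint:
    attribute: str
    entity: str
    op: str
    value: float
    unit: str


# Longest phrases first, a common precaution when alternatives overlap.
_OP_PATTERN = "|".join(
    re.escape(p) for p in sorted(OPERATORS, key=len, reverse=True)
)

# One fixed sentence shape: "The <attr> of a <entity> <op> <number> <unit>"
_SENTENCE = re.compile(
    rf"^the (?P<attr>\w+) of (?:a|an|the) (?P<entity>\w+) "
    rf"(?P<op>{_OP_PATTERN}) (?P<value>\d+(?:\.\d+)?) (?P<unit>\w+)\.?$",
    re.IGNORECASE,
)


def parse_constraint(sentence: str) -> Constraint:
    """Parse one constraint sentence or raise ValueError if it does not fit."""
    m = _SENTENCE.match(sentence.strip())
    if m is None:
        raise ValueError(f"not a recognised constraint: {sentence!r}")
    return Constraint(
        attribute=m["attr"].lower(),
        entity=m["entity"].lower(),
        op=OPERATORS[m["op"].lower()],
        value=float(m["value"]),
        unit=m["unit"].lower(),
    )
```

Anything outside the fixed pattern is rejected outright, which is the trade-off of a structured language: users get a clear error instead of a guess.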

However, if natural language input is a requirement and you can tolerate 
some degree of inexactness, you can use the OpenNLP (sourceforge) 
package to:

1) do sentence detection (unless you can guarantee that every statement 
is bounded by a hard line end).
2) do part-of-speech tagging to annotate each word of the sentence.
3) do word grouping (chunking) to identify groups of related words and 
further augment the contents of the sentence.

You will also need to:
4) develop tools to build a corpus of examples to train the models 
underlying 1-3.
5) develop an Antlr grammar and set of tree walkers to analyze and 
extract usable information from a fully augmented sentence.
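
To make steps 1-3 concrete, here is a rough sketch of the shape of the data that reaches the grammar in step 5. It does not use OpenNLP: the sentence splitter, the `LEXICON` lookup table, and the noun-phrase grouping rule are toy stand-ins (assumptions) for the trained models, and the tag names only loosely follow Penn Treebank conventions.

```python
import re

# Toy lexicon standing in for a trained part-of-speech model (assumption).
LEXICON = {
    "the": "DT", "a": "DT",
    "length": "NN", "runway": "NN", "distance": "NN", "metres": "NNS",
    "of": "IN", "than": "IN",
    "is": "VBZ", "not": "RB",
    "greater": "JJR", "smaller": "JJR",
}

# Tags that this toy chunker treats as part of a noun phrase.
NP_TAGS = {"DT", "JJ", "NN", "NNS", "CD"}


def split_sentences(text):
    """Step 1: naive sentence detection on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tag(sentence):
    """Step 2: attach a part-of-speech tag to every word or number."""
    words = re.findall(r"\d+(?:\.\d+)?|\w+", sentence)
    tagged = []
    for w in words:
        if w[0].isdigit():
            tagged.append((w, "CD"))
        else:
            tagged.append((w, LEXICON.get(w.lower(), "UNK")))
    return tagged


def chunk(tagged):
    """Step 3: mark runs of adjacent noun-phrase tokens with B-NP / I-NP."""
    out = []
    inside = False
    for word, pos in tagged:
        if pos in NP_TAGS:
            out.append((word, pos, "I-NP" if inside else "B-NP"))
            inside = True
        else:
            out.append((word, pos, "O"))
            inside = False
    return out


def augment(text):
    """Run steps 1-3 and return one list of (word, tag, chunk) per sentence."""
    return [chunk(tag(s)) for s in split_sentences(text)]
```

Each sentence comes out as a list of (word, tag, chunk) triples. A token stream built from the tags and chunk labels, rather than from raw words, is what the Antlr grammar and tree walkers would then work on.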

Your initial OpenNLP models will likely be about 70% accurate.  With a 
lot of training and tuning, and depending on the size of the domain, you 
can push that up to about 95-98% accuracy.
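
One simple way to track such a figure while tuning is to compare model output against a hand-checked sample, token by token. The sketch below assumes "accuracy" means per-token tag agreement, which is only one possible definition; the function name `tagging_accuracy` is made up for illustration.

```python
def tagging_accuracy(gold, predicted):
    """Fraction of tokens whose predicted tag equals the hand-checked tag.

    Both arguments are lists of sentences, each sentence a list of tags.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted differ in number of sentences")
    total = 0
    correct = 0
    for gold_sent, pred_sent in zip(gold, predicted):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("a sentence differs in number of tokens")
        for g, p in zip(gold_sent, pred_sent):
            total += 1
            if g == p:
                correct += 1
    # An empty sample has no meaningful accuracy; report 0.0 by convention.
    return correct / total if total else 0.0
```

For example, one wrong tag in a four-token sentence gives 0.75.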

Doing NLP solely in Antlr is a practical impossibility.  With OpenNLP as 
a front end, Antlr is well suited for NLP.  Just don't do it unless NL 
is a requirement.

Best,
Gerald


------ Original Message (Friday, October 15, 2010 1:24:53 PM) From: Stephen Winnall ------
Subject: Re: [antlr-interest] Grammar natural language
> Hi Dagi
>
> Grammars for natural languages are very difficult, and ANTLR is not suited for the general case. Natural languages are a complex structure involving the interaction of phonemics, morphology, syntax and semantics (not to mention general knowledge). Classic illustrations of the sort of problems that can arise are sentences like "flying planes can be dangerous" or "general flies back to front".
>
> However, if you can restrict your corpus to a relatively small, well-defined domain (runways?), you may still be able to create an adequate grammar. But the chances that anyone has already written a grammar for that domain are correspondingly small. And your users are going to have to learn to restrict their language to what the grammar can handle, so you might really be better off writing a simple DSL instead.
>
> Steve
>
> On 15 Oct 2010, at 10:29, <Dagi.Troegner at dlr.de> wrote:
>
>> Hi Armin,
>>
>> I would like to cover just basic sentences in the English language, so that a user can formulate simple constraints for a modelling environment. To begin with, sentences like
>>
>> "The length of a runway is not greater than 5000 metres"
>> Or
>> "If the runway is dependent then the distance is smaller than 1000 metres"
>>
>> Thanks for any advice,
>>
>> Dagi
>>
>> -----Original Message-----
>> From: Armin.Wegner at bka.bund.de [mailto:Armin.Wegner at bka.bund.de]
>> Sent: Friday, 15 October 2010 07:47
>> To: Trögner, Dagi
>> Subject: RE: [antlr-interest] Grammar natural language
>>
>> Hi Dagi,
>>
>> For which language? Most likely you will need a separate grammar for each natural language.
>>
>> Armin
>>
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Dagi.Troegner at dlr.de
>> Sent: Thursday, 14 October 2010 14:39
>> To: antlr-interest at antlr.org
>> Subject: [antlr-interest] Grammar natural language
>>
>> Hi everyone,
>>
>> I am looking for a simple grammar for natural language. For a first version, just short, simple sentences would be sufficient.
>> Has anyone already tried to formulate such a grammar?
>>
>> Thanks a lot,
>>
>> Dagi
>>
>>
>> ********************************************************
>>
>>
>>
>> Dagi Troegner
>>
>> Deutsches Zentrum fuer Luft- und Raumfahrt (DLR)
>>
>> Institut fuer Flugfuehrung
>>
>> Abteilung Lotsenassistenz
>>
>> Lilienthalplatz 7
>>
>> D-38108  Braunschweig
>>
>> Phone: (49) 531 / 295-2189
>>
>> Fax: (49) 531 / 295-2180
>>
>> Email: Dagi.Troegner at dlr.de<mailto:Dagi.Troegner at dlr.de>
>>
>> __/|__
>>
>> /_/_/_/
>>
>>    |/ DLR
>>
>> ********************************************************
>>
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>>
>>
>


-- 

Gerald B. Rosenberg, Esq.
NewTechLaw
260 Sheridan Ave., Suite 208
Palo Alto, CA 94306-2009
650.325.2100 (office) / 650.703.1724 (cell)
650.325.2107 (facsimile)

www.newtechlaw.com


