[antlr-interest] Need help on ANTLR for a domain specific NLP project

raj sisodia raj.sisodia at impetus.co.in
Sat Oct 28 02:08:25 PDT 2006


Hi,

 

I need to create a parser (more specifically, a translator) for business
rules published in the telecom domain. I have worked on the project for some
time and concluded that writing a parser is the right approach. But because
I lack the time and the necessary skills in parser writing and language
processing, I have not been able to achieve the desired result.

 

I have a list of around 7000 such rules (a rule is one sentence written in
English, which I need to translate into code that is a hybrid of Java and
XML). Since many rules are similar, I expect around 700 different patterns
among them. Do you think this can be coded using ANTLR? (I know it can; the
only question is how.) If yes, would someone please guide me on this or
offer their services (what would be the approximate cost and time to do
this)? For better understanding, I have added a few rules and their codings
at the end of this mail.

 

Please have a look and let me know your comments. I would be glad to provide
any other information you may require.

 

Thanks

Raj Singh Sisodia

 

Rule 1: LNUM is required

Coding

context : /Request/lsr_order/lsnp/lsnp_servicedetailscontainer/lsnp_servicedetails[*]/LNUM

assertion : present()
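
As an illustration only, a single pattern such as "FIELD is required" could be handled with a regular expression plus a field-to-XPath lookup. The sketch below is in Python rather than ANTLR; the names FIELD_XPATHS and translate_required are hypothetical, and the lookup table is assumed to be derived from the DTD.

```python
import re

# Hypothetical lookup table; in practice it would be derived from the DTD.
FIELD_XPATHS = {
    "LNUM": "/Request/lsr_order/lsnp/lsnp_servicedetailscontainer/"
            "lsnp_servicedetails[*]/LNUM",
}

# One rule pattern: "<FIELD> is required", with an optional trailing period.
REQUIRED = re.compile(r"^(?P<field>\w+) is required\.?$")


def translate_required(rule: str) -> str:
    """Translate a 'FIELD is required' rule into its context/assertion coding."""
    match = REQUIRED.match(rule.strip())
    if match is None:
        raise ValueError(f"rule does not fit the 'required' pattern: {rule!r}")
    field = match.group("field")
    return f"context : {FIELD_XPATHS[field]}\nassertion : present()"


print(translate_required("LNUM is required"))
```

Each of the expected patterns would need its own entry of this kind, which is why a grammar-based tool looks more attractive than a long list of regular expressions.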

 

The fields in capitals (field names do not have to be in capitals; I write
them that way for ease of understanding) are always present in a DTD. Based
on their position in the DTD we derive their XPaths. The field for which the
rule is coded (generally there is only one) is put in the context. The XPaths
of all other fields are written relative to the XPath in the context.
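
The relative-path convention can be sketched as follows. This is a minimal Python illustration, assuming both XPaths are absolute and made only of simple child steps (no "/" inside predicates), so that they can be treated like POSIX paths; the helper name relative_xpath is hypothetical.

```python
import posixpath


def relative_xpath(context: str, target: str) -> str:
    """Express `target` relative to the XPath of the context field.

    Assumption: both paths are absolute and consist of simple child steps,
    so POSIX path arithmetic gives the same result as XPath navigation.
    """
    return posixpath.relpath(target, start=context)


BASE = "/Request/lsr_order/rs/rs_servicedetailscontainer/rs_servicedetails[*]"

# NPI is the context field; LNA and OTN are siblings under the same parent.
print(relative_xpath(BASE + "/NPI", BASE + "/LNA"))
print(relative_xpath(BASE + "/NPI", BASE + "/OTN"))
```

For sibling fields this yields paths of the form "../LNA", which is the form used in the coding of Rule 3.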

 

Rule 2: The valid format for ECCKT is NN.AAAA.NNNNNN..AA when SC is TX, MO,
KS, OK, or AR.

Coding

context : /Request/lsr_order/lsnp/lsnp_servicedetailscontainer/lsnp_servicedetails[*]/ECCKT

assertion : value().hasFormat("NN.AAAA.NNNNNN..AA")

condition : present() && isSWBT()

 

Here value is a class in the framework of the program that will use the
generated code. Similarly, isSWBT is a "custom function" that compares the SC
value to TX, MO, KS, OK, or AR.

This example also shows one problem area. The value class is stable and no
changes are made to it, but custom functions are added every now and then,
and we need to use them to produce the correct code. So the translator
should also have a mechanism to learn about new custom functions and to know
where to use them.
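
One possible shape for such a mechanism is a registry that maps a field and a set of values to a custom function call, with a generic comparison as the fallback. The Python sketch below is only an assumption about how this could look: CUSTOM_FUNCTIONS, register and condition_for are hypothetical names, and the use of "||" in the fallback is assumed, since the codings shown in this mail only use "&&" and "!".

```python
# Hypothetical registry: (field, set of values) -> custom function call.
CUSTOM_FUNCTIONS = {}


def register(field, values, call):
    """Make a custom function known to the translator."""
    CUSTOM_FUNCTIONS[(field, frozenset(values))] = call


def condition_for(field, values):
    """Return the condition code for 'FIELD is V1, V2, ... or Vn'."""
    key = (field, frozenset(values))
    if key in CUSTOM_FUNCTIONS:
        # A custom function covers exactly this field/value combination.
        return CUSTOM_FUNCTIONS[key]
    # Fallback (assumed form): compare the field against each value in turn.
    tests = [f'value("../{field}").equals("{v}")' for v in values]
    return "(" + " || ".join(tests) + ")"


# New custom functions are added by registering them, not by changing code.
register("SC", ["TX", "MO", "KS", "OK", "AR"], "isSWBT()")

print(condition_for("SC", ["AR", "OK", "KS", "MO", "TX"]))
print(condition_for("LNA", ["N", "C"]))
```

Because the key is a set, the order in which the rule lists the values does not matter when looking up a custom function.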

 

Rule 3: NPI is optional when LNA is N, or when LNA is C and OTN is populated
and REQTYP is E and ACT is C, otherwise prohibited.

Coding

context : /Request/lsr_order/rs/rs_servicedetailscontainer/rs_servicedetails[*]/NPI

assertion : absent()

condition : req1().equals("E") && act().equals("C") &&
( !value("../LNA").equals("N") &&
  ( !(value("../LNA").equals("C") &&
    present("../OTN") )
  )
)

 

Both the assertion and the condition can combine multiple functions, as long
as the combination returns a Boolean value.
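
To make the idea of combining functions concrete, here is a minimal Python sketch of a small expression tree that renders such Boolean combinations. The classes Call, Not and And are hypothetical and are not part of the framework described above; the example rebuilds the condition of Rule 3, without its redundant parentheses.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """A single function call, kept as already-generated code text."""
    text: str

    def render(self):
        return self.text


@dataclass
class Not:
    operand: object

    def render(self):
        return "!" + _wrap(self.operand)


@dataclass
class And:
    operands: tuple

    def render(self):
        return " && ".join(_wrap(o) for o in self.operands)


def _wrap(node):
    # Only a nested And needs parentheses to keep its grouping.
    return f"({node.render()})" if isinstance(node, And) else node.render()


rule3 = And((
    Call('req1().equals("E")'),
    Call('act().equals("C")'),
    And((
        Not(Call('value("../LNA").equals("N")')),
        Not(And((Call('value("../LNA").equals("C")'),
                 Call('present("../OTN")')))),
    )),
))

print("condition : " + rule3.render())
```

A parser for the English rules could build a tree like this first and generate the code text from it as a separate step.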
