[antlr-interest] Real simple grammar - newbie help?!

Gerald Rosenberg gerald at certiv.net
Fri Feb 5 22:22:30 PST 2010


While it may be heresy in the world of context-free grammars, ANTLR 
actually performs quite nicely for many NLP problems.

The approach illustrated in the attachments works well for explicitly 
identifying a few keywords in context.  You just have to watch for the 
lexer functionally being k=1, and remember that the lexer rules apply 
top-down, so the order in which they are listed matters.
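
A minimal sketch of the ordering point, assuming ANTLR 3 syntax with the 
Java target.  This is not the attached NaturalLanguage.g; the grammar name 
and the rule names (KEY_GRAMMAR, KEY_LEXER, WORD, WS, OTHER) are made up 
for illustration.  The keyword rules sit above the catch-all word rule so 
that they take precedence when both could match the same text:

```antlr
lexer grammar KeywordsFirst;

// Keyword rules are listed first. When a keyword rule and WORD both
// match the same text, the rule listed earlier takes precedence.
KEY_GRAMMAR : 'grammar' ;
KEY_LEXER   : 'lexer' ;

// Catch-all for any other run of letters.
WORD : ('a'..'z' | 'A'..'Z')+ ;

// Whitespace goes to the hidden channel so the parser never sees it.
WS : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;

// Punctuation, digits and anything else, one character at a time.
OTHER : . ;
```

With WORD moved above the keyword rules, the keywords would most likely 
never be reported as their own token types.  Note that the literals here 
are case-sensitive, so "Grammar" would come out as a WORD.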

There is also a lexer filter option (filter=true) if all you want to do 
is find keywords.
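
A rough sketch of what that can look like, again assuming ANTLR 3 syntax.  
The grammar name, the KEYWORD rule name and the keyword list are 
placeholders, not taken from the attachments:

```antlr
lexer grammar KeywordFilter;

// In filter mode the lexer skips over input that matches none of the
// rules instead of reporting an error for it.
options { filter = true; }

// Only these literals are emitted as tokens; everything else is dropped.
KEYWORD : 'grammar' | 'lexer' | 'parser' ;
```

As far as I know, filter mode tries the rules at each character position, 
so a keyword embedded in a longer word (for example "grammars") would 
still be reported.  If that matters, the keyword rules need extra context 
or a check in the code that consumes the tokens.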

On 2/5/2010 4:45 PM, James Crowley wrote:
> Hi Michael,
>
> Thanks for the response. Sadly not - the language is English ;-) But just
> hoping to do some basic tokenization of paragraphs of text (essentially just
> extracting keywords) - thought it would be faster/easier to use a tool like
> ANTLR than using regex or attempting to roll my own. Am I being foolish for
> even attempting this?
>
> James
>
> On 5 February 2010 21:29, Michael Matera <mike.matera at xilinx.com> wrote:

-------------- next part --------------
Attachments (scrubbed by the list archive):

Name: NaturalLanguage.g
Url: http://www.antlr.org/pipermail/antlr-interest/attachments/20100205/749d8794/attachment.pl

Name: NaturalLanguage.txt
Url: http://www.antlr.org/pipermail/antlr-interest/attachments/20100205/749d8794/attachment.txt

Name: NLP - Token Stream.png
Type: image/png
Size: 17122 bytes
Url: http://www.antlr.org/pipermail/antlr-interest/attachments/20100205/749d8794/attachment.png

Name: NLP - AST.png
Type: image/png
Size: 26068 bytes
Url: http://www.antlr.org/pipermail/antlr-interest/attachments/20100205/749d8794/attachment-0001.png

