[antlr-interest] Recognizing interesting productions amidst other text
mzukowski at yci.com
Wed Feb 18 15:33:00 PST 2004
Look at the article on antlr.org titled something like "ANTLR meets SED".
Monty
-----Original Message-----
From: semiclueful [mailto:semiclueful at yahoo.com]
Sent: Wednesday, February 11, 2004 4:47 PM
To: antlr-interest at yahoogroups.com
Subject: [antlr-interest] Recognizing interesting productions amidst other text
Hello ANTLR savants,
I have a corpus that consists of natural language text
interspersed with occasional strings representing times (e.g. "12:01
pm", "noon", "1 am") for which I have already built a working grammar
and parser. *I'm not interested in parsing the natural language text*
-- I just want to capture whatever text appears between "time"
instances.
I'd like to figure out a way to get ANTLR to consider the text
("phrase") as this *very* loose pseudo-grammar:
phrase:
    (time (description)?)+;
time:
    numericTime | MIDNIGHT | NOON | other_cases_i'm_leaving_out;
description:
    nontime;
As I said, the "time"-related productions, which I've elided, do work,
and I'm hoping I can just drop them into this grammar as-is. However,
I'm unable to come up with a workable definition of "description" or
"nontime" that works with them and doesn't create a mess of lexical
nondeterminism. The ultimate goal is to apply it to a piece of text
like
"8:00 am Woke up, got out of bed. 9am Ran a comb across my head."
and have it produce time/text pairs like
[8:00, "Woke up, got out of bed. "], [9:00, "Ran a comb across my head."]
But recognizing the non-time text is beyond my laughable Antlr
(in)abilities. I'm guessing that lexical predicates might provide a
way to do this, but I'd appreciate any and all tips you can offer.
Thanks!
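For illustration only, here is a rough sketch of the desired time/text pairing, written in Python with a regular expression rather than as an ANTLR grammar. The TIME pattern is an assumed, much-simplified stand-in for the "time" productions above (it only knows forms like "8:00 am", "9am", "noon" and "midnight"), and split_phrase is a hypothetical helper name. It does not normalize times (so "9am" stays "9am" rather than becoming 9:00), it strips whitespace around each description, and it ignores any text before the first time.

```python
import re

# Assumed, simplified stand-in for the real "time" rule: covers only
# "8:00 am", "9am", "noon" and "midnight" style strings.
TIME = re.compile(
    r"\b(?:\d{1,2}(?::\d{2})?\s*[ap]m|noon|midnight)\b",
    re.IGNORECASE,
)


def split_phrase(text):
    """Return (time, description) pairs.

    Each description is whatever text lies between one time match and
    the next (or the end of the input), with surrounding whitespace
    stripped. Text before the first time is ignored.
    """
    matches = list(TIME.finditer(text))
    pairs = []
    for i, m in enumerate(matches):
        # The description runs up to the next time, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        pairs.append((m.group(0), text[m.end():end].strip()))
    return pairs


if __name__ == "__main__":
    sample = "8:00 am Woke up, got out of bed. 9am Ran a comb across my head."
    for time, description in split_phrase(sample):
        print(time, "|", description)
```

The point of the sketch is that only the "time" strings need a precise definition; the description is simply the leftover text between consecutive matches, so no separate "nontime" rule is required.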
Yahoo! Groups Links
<*> To visit your group on the web, go to:
http://groups.yahoo.com/group/antlr-interest/
<*> To unsubscribe from this group, send an email to:
antlr-interest-unsubscribe at yahoogroups.com
<*> Your use of Yahoo! Groups is subject to:
http://docs.yahoo.com/info/terms/