[antlr-interest] natural-language parsing problem: How to distinguish between special words and regular words
Sven Prevrhal
sven.prevrhal at ucsf.edu
Tue Jan 27 17:11:29 PST 2009
I want to parse recipes. How can I distinguish (for instance) between a
measuring unit such as "cups" and other general words?
If I do
WORD:
LETTER+;
UNIT:
"cups";
the lexer emits WORD for "cups" as well; at least that's what I see
happening. I tried
WORD:
u=UNIT {
$u.setType(UNIT);
emit($u);
} | LETTER+;
but that causes an error saying that UNIT can never be matched.
If I place the burden on the parser instead, say as
unit:
w=WORD
{
if ($w == "cups") return $w;
}
;
and the WORD token is actually not a unit, I have lost the token to the
parser. Should I, and how can I, put that non-matching token back into the
token stream? Or what is the solution to that?
Thanks a lot - Sven
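
One common way around the WORD/UNIT ambiguity described above is to keep a single general word rule and decide the token type afterwards by table lookup, so the special words never compete with the general rule. The following is only a minimal, language-neutral sketch of that idea in Python, not ANTLR code. The names tokenize and UNITS are hypothetical, only "cups" comes from the post, and the case-insensitive lookup is an assumption. The analogous move in a lexer grammar would presumably be to reassign the token type inside the WORD rule's action.

```python
import re
from typing import List, Tuple

# Hypothetical unit vocabulary; only "cups" appears in the original post.
UNITS = {"cups"}


def tokenize(text: str) -> List[Tuple[str, str]]:
    """Match runs of letters with one general rule, then pick the
    token type (UNIT or WORD) by looking the text up in a table."""
    tokens = []
    for match in re.finditer(r"[A-Za-z]+", text):
        word = match.group(0)
        # Assumption: units are recognized case-insensitively.
        kind = "UNIT" if word.lower() in UNITS else "WORD"
        tokens.append((kind, word))
    return tokens
```

Because the whole run of letters is matched before the lookup, a word that merely starts with a unit name (for example "cupsize") stays a WORD, and no token ever has to be pushed back into the stream.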