[antlr-interest] Breaking out of a parser loop based on the current item
Richard Clark
rd_clark at sbcglobal.net
Mon Jul 19 17:09:18 PDT 2004
I'm working on a Javadoc-like compiler for Javascript and ActionScript
(and yes, I'll share when done, including an ECMA-262 grammar), but
I've encountered an interesting problem on the Javadoc side.
I need to identify a sentence. At the moment, I have WORD defined in
the lexer as "a contiguous run of non-whitespace characters", so that:
This is a test.
would come back as <WORD><SPACE><WORD><SPACE>...
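To make the token stream concrete, here is a minimal sketch of that lexer behaviour in plain Python rather than ANTLR. The names `TOKEN_RE` and `tokenize` are hypothetical and only the WORD and SPACE token kinds come from the description above; the real lexer may well differ in detail.

```python
import re

# Hypothetical stand-in for the lexer described above:
# WORD  = a contiguous run of non-whitespace characters
# SPACE = a contiguous run of whitespace characters
TOKEN_RE = re.compile(r"(?P<WORD>\S+)|(?P<SPACE>\s+)")


def tokenize(text):
    """Return a list of (kind, text) pairs covering the whole input."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


if __name__ == "__main__":
    for kind, value in tokenize("This is a test."):
        print(kind, repr(value))
```

Note that the final token is the WORD `test.`, with the period attached, which is why the end of a sentence is not visible from the token type alone.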
The problem is that I need to be able to find the end of one sentence,
and I have two options:
1) Define a word to exclude any ending punctuation (which can cause
problems with Foo.bar ...), or
2) Set up a loop that says:
( WORD
  ( if that word ends a sentence, or the next item is an @tag ) => { break }
| SPACE* ) *
I would prefer option 2, but I haven't been able to figure out how to
end the loop based on the item that was just matched.
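The behaviour wanted from option 2 can be sketched outside ANTLR as an ordinary loop. This is only an illustration in plain Python, not a grammar rule: `is_end_of_sentence` and `first_sentence` are hypothetical helpers, and the rule that a sentence ends at a word whose last character is `.`, `!` or `?` is an assumption made for the example.

```python
import re


def is_end_of_sentence(word):
    # Assumption: a word ends a sentence if its last character is '.', '!' or '?'.
    # "Foo.bar" does not qualify, because the period is in the middle of the word.
    return word.endswith((".", "!", "?"))


def first_sentence(text):
    """Collect tokens until a word ends the sentence or an @tag is next."""
    pieces = []
    # Same token shapes as in the post: runs of non-whitespace (WORD)
    # and runs of whitespace (SPACE).
    for token in re.findall(r"\S+|\s+", text):
        if token.isspace():
            pieces.append(token)
            continue
        if token.startswith("@"):
            break  # the next item is an @tag, so stop before consuming it
        pieces.append(token)
        if is_end_of_sentence(token):
            break  # the word just consumed ends the sentence
    return "".join(pieces).strip()


if __name__ == "__main__":
    print(first_sentence("Calls Foo.bar first. Then returns."))
    print(first_sentence("Draws the shape @param x the offset"))
```

The two exits are different in kind: the @tag check looks at the upcoming token before consuming it, while the end-of-sentence check inspects the text of the token that was just consumed, which is the part that is awkward to express as a plain lookahead decision.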
I guess a third option could be:
sentence :
( isEndOfSentence(word) ) => word
| word SPACE sentence
;
What do you all think is best?
...Richard