[antlr-interest] Breaking out of a parser loop based on the current item

Richard Clark rd_clark at sbcglobal.net
Mon Jul 19 17:09:18 PDT 2004


I'm working on a Javadoc-like compiler for JavaScript and ActionScript 
(and yes, I'll share it when done, including an ECMA-262 grammar), but 
I've encountered an interesting problem on the Javadoc side.

I need to identify a sentence. At the moment, I have WORD defined in 
the lexer as "a contiguous run of non-whitespace characters", so that:
This is a test.
would come back as <WORD><SPACE><WORD><SPACE>...

The problem is that I need to be able to find the end of one sentence, 
and I have two options:
1) Define a word to exclude any ending punctuation (which can cause 
problems with Foo.bar ...), or
2) Set up a loop that says:

  ( WORD
    ( if that word ends a sentence, or the next item is an @tag ) => { break }
    | SPACE*) *

I would prefer option 2, but I haven't been able to figure out how to 
break out of the loop once the WORD has already been matched.
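
To make the intended control flow of option 2 concrete, here is a minimal sketch in plain Python rather than ANTLR grammar syntax. It only illustrates the loop-and-break logic. The helper names (lex, ends_sentence, first_sentence) and the rule that a sentence ends with ".", "!" or "?" are assumptions made for the example, not part of any ANTLR API.

```python
import re

SENTENCE_END = (".", "!", "?")  # assumed set of sentence-ending punctuation


def lex(text):
    """Split text into (kind, text) pairs: WORD is a run of non-whitespace."""
    return [("SPACE" if chunk.isspace() else "WORD", chunk)
            for chunk in re.findall(r"\S+|\s+", text)]


def ends_sentence(word):
    """Only the LAST character counts, so 'Foo.bar' does not end a sentence."""
    return word.endswith(SENTENCE_END)


def first_sentence(tokens):
    """Consume tokens until a word ends the sentence or an @tag is next."""
    out = []
    for kind, text in tokens:
        if kind == "WORD":
            if text.startswith("@"):
                break               # the next item is an @tag: stop before it
            out.append(text)
            if ends_sentence(text):
                break               # this word closed the sentence
        else:
            out.append(text)        # keep SPACE tokens between words
    return "".join(out).rstrip()


print(first_sentence(lex("Calls Foo.bar twice. More text.")))
# prints "Calls Foo.bar twice."
```

The point of the sketch is that the end-of-sentence test runs after the WORD has been consumed, while the @tag test looks at the item before consuming it.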

I guess a third option could be:

sentence :
	( isEndOfSentence(word) ) => word
   |	word SPACE sentence
   ;
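
The recursive shape of this third option can be sketched the same way in plain Python (again, not ANTLR syntax). The function is_end_of_sentence mirrors the isEndOfSentence predicate named above; its body is an assumption, and SPACE tokens are left out to keep the example short.

```python
def is_end_of_sentence(word):
    """Assumed rule: a word ends a sentence if it ends with ., ! or ?."""
    return word.endswith((".", "!", "?"))


def sentence(words):
    """sentence : end-of-sentence word | word sentence"""
    if not words:
        return []                     # input ran out before a sentence end
    head, rest = words[0], words[1:]
    if is_end_of_sentence(head):
        return [head]                 # first alternative: the final word
    return [head] + sentence(rest)    # second alternative: word, then recurse


print(sentence(["This", "is", "a", "test.", "Next", "one."]))
# prints ['This', 'is', 'a', 'test.']
```

Note that this form recurses once per word, so a long sentence produces a deeply nested structure, whereas the loop form stays flat.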

What do you all think is best?

  ...Richard



 