[antlr-interest] Breaking out of a parser loop based on the current item

Richard Clark rd_clark at sbcglobal.net
Mon Jul 19 20:11:42 PDT 2004


On Jul 19, 2004, at 17:09, Richard Clark wrote:
> 1) Define a word to exclude any ending punctuation (which can cause
> problems with Foo.bar ...), or
> 2) Set up a loop...

Actually, I wound up implementing option 1:

/* in the lexer */
// Four cases:
// non-spaces, non-period, followed by whitespace or the end of the file
// a sentence-ending word followed by a space (or the end of the file,
//   or a tag)
// non-spaces, dot, non-spaces
// non-spaces, open curly bracket w/o a following tag
WORD		:
			WORD_PART
		| 	( WORD_PART PERIOD WORD ) =>  WORD_PART PERIOD WORD
		| 	( WORD_PART RPAREN )	  => WORD_PART RPAREN (WORD)?
		|	( WORD_PART LCURLY ~('@')) => WORD_PART LCURLY
		;

// any run of characters ending in whitespace, newline, period, or
// end-of-file
// (also the right parenthesis, which is fixed by the WORD rule above)
protected
WORD_PART	:	({ LA(0) != EOF_CHAR}? ~(' ' | '\t' | '\f' | '\r' | '\n'
				| '.' | ')' | '{' | '@'))+
			;
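
For anyone who wants to poke at the character class outside ANTLR, here is a
rough plain-Java approximation of what WORD_PART consumes. It is only a sketch:
the class name WordPartSketch and the method scanWordPart are made up for this
example, and it ignores the EOF predicate and the alternatives in WORD.

```java
public class WordPartSketch {
    // Characters that end a run, mirroring the excluded set in WORD_PART.
    private static final String STOP_CHARS = " \t\f\r\n.){@";

    /**
     * Returns the index just past the run of non-stop characters starting
     * at 'start'. Unlike the (...)+ in the grammar, this returns 'start'
     * unchanged when there is no such character instead of failing.
     */
    public static int scanWordPart(String input, int start) {
        int i = start;
        while (i < input.length() && STOP_CHARS.indexOf(input.charAt(i)) < 0) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        String input = "Foo.bar baz";
        int end = scanWordPart(input, 0);
        System.out.println(input.substring(0, end)); // prints "Foo"
    }
}
```

The run stops at the period, which is why WORD then needs the
WORD_PART PERIOD WORD alternative to keep Foo.bar together as one token.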


/* in the parser, including some trickery so the individual tokens are
   melded into one string */
sentence
	{ StringBuffer buf = new StringBuffer(); }
	// make sure there's a word to start the sentence
	:	(WORD) => sentenceFragment[buf]
		{ #sentence = #[TEXT, buf.toString()]; }
	|	/* nothing */
	;

protected
sentenceFragment[StringBuffer buf]
	:	w:WORD {buf.append(w.getText());}
		(
			whitespace[buf] sentenceFragment[buf]
		|	sentenceEnd[buf]
		)?
	;

protected
sentenceEnd[StringBuffer buf]
	:	PERIOD 	   {buf.append('.'); } (lp: RPAREN {buf.append(')'); })?
	;
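
The buffer-passing trick in the parser rules above can also be shown as a small
hand-written recursive-descent parser in plain Java. This is a sketch and not
what ANTLR generates: the names SentenceMeldSketch, Kind, Token, at and match
are invented for the example, and since the whitespace rule is not shown in the
grammar above, the sketch assumes it simply appends the token's text.

```java
import java.util.Arrays;
import java.util.List;

public class SentenceMeldSketch {
    // Hypothetical token kinds; in the grammar these come from the lexer.
    enum Kind { WORD, WS, PERIOD, RPAREN }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    private final List<Token> tokens;
    private int pos = 0;

    SentenceMeldSketch(List<Token> tokens) { this.tokens = tokens; }

    private boolean at(Kind k) {
        return pos < tokens.size() && tokens.get(pos).kind == k;
    }

    private Token match(Kind k) {
        if (!at(k)) throw new IllegalStateException("expected " + k + " at token " + pos);
        return tokens.get(pos++);
    }

    // sentence : (WORD) => sentenceFragment | /* nothing */
    String sentence() {
        StringBuffer buf = new StringBuffer();
        if (at(Kind.WORD)) sentenceFragment(buf);
        return buf.toString();
    }

    // sentenceFragment : WORD ( whitespace sentenceFragment | sentenceEnd )?
    private void sentenceFragment(StringBuffer buf) {
        buf.append(match(Kind.WORD).text);
        if (at(Kind.WS)) {
            buf.append(match(Kind.WS).text); // assumed behaviour of whitespace[buf]
            sentenceFragment(buf);
        } else if (at(Kind.PERIOD)) {
            sentenceEnd(buf);
        }
    }

    // sentenceEnd : PERIOD (RPAREN)?
    private void sentenceEnd(StringBuffer buf) {
        match(Kind.PERIOD);
        buf.append('.');
        if (at(Kind.RPAREN)) {
            match(Kind.RPAREN);
            buf.append(')');
        }
    }

    public static void main(String[] args) {
        List<Token> toks = Arrays.asList(
            new Token(Kind.WORD, "Hello"), new Token(Kind.WS, " "),
            new Token(Kind.WORD, "world"), new Token(Kind.PERIOD, "."));
        System.out.println(new SentenceMeldSketch(toks).sentence()); // prints "Hello world."
    }
}
```

Every rule appends to the same buffer, so the top-level rule ends up holding
the whole sentence as one string, which is what gets turned into the single
TEXT node in the grammar.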


/* that's all... */

  ...Richard



 