[antlr-interest] lisp-like issues

Tue Dec 20 14:51:52 PST 2005

Thanks for the response. Here is what I ended up doing for the
lisp-style language. I kept all the keywords that immediately follow an
opening paren as tokens in the lexer, but changed my IDENTIFIER rule in
the lexer to look like this:

IDENTIFIER
	options {testLiterals=false;} // a built-in function name should still parse okay
	:
		(ALPHA|'&') a:(options {greedy=true;}: '_'|ALPHA|DIGIT)*
		{(a == null)? true: (a.getText().length() < 256)}? // 256 max chars
		{if(lastToken == LPAREN) _ttype = testLiteralsTable(_ttype);
		lastToken = _ttype;} ;

I declared lastToken in the Lexer header and set it at the end of every
non-protected lexer rule, so it always holds the type of the most
recently matched token.
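
For anyone who wants to see the lastToken idea outside ANTLR, here is a
minimal sketch in plain Java. It is not the generated lexer: the Type
enum, the LITERALS set, and the tokenize method are all hypothetical
stand-ins, and the only point is that the literals table is consulted
only when the previous token was an opening paren.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class LastTokenSketch {
    // Hypothetical token types, standing in for the generated ANTLR constants.
    enum Type { LPAREN, RPAREN, KEYWORD, IDENTIFIER }

    // Hypothetical literals table; ANTLR builds the real one from the grammar.
    static final Set<String> LITERALS = Set.of("direction", "port");

    static List<Type> tokenize(String input) {
        List<Type> out = new ArrayList<>();
        Type lastToken = null; // plays the role of the lastToken lexer member
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            Type t;
            if (c == '(') {
                t = Type.LPAREN;
                i++;
            } else if (c == ')') {
                t = Type.RPAREN;
                i++;
            } else {
                // Consume a run of characters up to whitespace or a paren.
                int start = i;
                while (i < input.length()
                        && !Character.isWhitespace(input.charAt(i))
                        && input.charAt(i) != '(' && input.charAt(i) != ')') {
                    i++;
                }
                String text = input.substring(start, i);
                // Only consult the literals table directly after an opening paren.
                t = (lastToken == Type.LPAREN && LITERALS.contains(text))
                        ? Type.KEYWORD : Type.IDENTIFIER;
            }
            lastToken = t; // every token kind updates lastToken
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [LPAREN, KEYWORD, IDENTIFIER, RPAREN]
        System.out.println(tokenize("(direction direction)"));
    }
}
```

The second "direction" stays an identifier because the token before it
is the keyword, not the paren.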

That worked beautifully, but then I had an issue with keywords that do
not immediately follow a paren. I changed those rules to look like this:

//in the parser header:
	// must stay sorted, as binarySearch requires
	private final String[] directions = { "INOUT", "INPUT", "OUTPUT" };

//in the parser rules:
direction :
	LP! DIRECTION^ i:IDENTIFIER
		 // binarySearch returns a negative value (not always -1) on a miss
		 { Arrays.binarySearch(directions, i.getText().toUpperCase()) >= 0 }?
	RP!;
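
A side note on the membership test in the direction rule: for a key
that is absent, Arrays.binarySearch returns -(insertion point) - 1, so
the result is -1 only when the key would sort before every element.
Comparing against -1 therefore accepts some misses, and testing for
>= 0 is the safer form. The small standalone sketch below shows the
values; the class name and the isDirection helper are made up for the
illustration, and the use of Locale.ROOT is my own addition.

```java
import java.util.Arrays;
import java.util.Locale;

class BinarySearchCheck {
    // Must stay sorted for binarySearch to give meaningful results.
    static final String[] DIRECTIONS = { "INOUT", "INPUT", "OUTPUT" };

    // Hypothetical helper: true only when the text is one of the directions.
    static boolean isDirection(String text) {
        return Arrays.binarySearch(DIRECTIONS, text.toUpperCase(Locale.ROOT)) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.binarySearch(DIRECTIONS, "INPUT"));    // prints 1
        System.out.println(Arrays.binarySearch(DIRECTIONS, "FOO"));      // prints -1
        System.out.println(Arrays.binarySearch(DIRECTIONS, "SIDEWAYS")); // prints -4
        System.out.println(isDirection("sideways"));                     // prints false
    }
}
```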

The predicate approach worked for the rest of the built-in tokens. Is
there some reason that the exception thrown when the predicate in the
direction rule fails does not include line and column numbers by
default? I think that would be useful. I also think it should include
the getText() of every token used in the predicate; if a match fails, I
definitely want to know what text was being compared.
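
To make the request concrete, here is a rough sketch in plain Java of
the kind of message I have in mind. Nothing here is ANTLR API: the
DirectionException class and the checkDirection helper are hypothetical,
and in a real grammar action the three arguments would come from
i.getText(), i.getLine(), and i.getColumn() on the matched token.

```java
import java.util.Arrays;
import java.util.Locale;

class PredicateMessageSketch {
    static final String[] DIRECTIONS = { "INOUT", "INPUT", "OUTPUT" };

    // Hypothetical stand-in for the exception a failed predicate raises.
    static class DirectionException extends Exception {
        DirectionException(String msg) { super(msg); }
    }

    // Throws with position and offending text when the check fails.
    static void checkDirection(String text, int line, int column)
            throws DirectionException {
        if (Arrays.binarySearch(DIRECTIONS, text.toUpperCase(Locale.ROOT)) < 0) {
            throw new DirectionException("line " + line + ":" + column
                    + ": expected one of " + Arrays.toString(DIRECTIONS)
                    + " but found '" + text + "'");
        }
    }

    public static void main(String[] args) {
        try {
            checkDirection("sideways", 12, 7);
        } catch (DirectionException e) {
            // prints: line 12:7: expected one of [INOUT, INPUT, OUTPUT] but found 'sideways'
            System.out.println(e.getMessage());
        }
    }
}
```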