[antlr-interest] Order of token matching

Wed Sep 3 09:00:53 PDT 2008

Hello guys,

I think I do not understand well enough how my lexer works. I thought
that rules specified first are matched first, but in my grammar this
does not seem to be the case.
What I am trying to do is first skip all comments in my source files,
and then skip everything between curly braces:

MLCOM	:	'/*'
	;
SLCOM	:	'//'
	;
RCOM	:	'*/'
	;
NL	:	'\r'			{skip();}
	|	'\n'			{skip();}
	;
WS	:	' '			{$channel=HIDDEN;}
	|	'\t'			{skip();}
	;

COMMENT	:	SLCOM (options{greedy=false;}: .)* NL		{skip();}
	|	MLCOM (options{greedy=false;}: .)* RCOM		{skip();}
	;
IMPL	:	'{' (IMPL|~('{'|'}'))* '}'	{skip();}
	;

Rule IMPL matches everything between curly braces, and keeps nested braces
balanced by calling itself recursively.
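
To make the intent of IMPL concrete, here is a rough Python sketch of the
same idea outside ANTLR. It is only an illustration of the recursion, not
generated lexer code, and the name skip_block is made up for this example.
Like IMPL, it knows nothing about comments.

```python
def skip_block(text, pos=0):
    """Return the index just past the '}' that closes the '{' at text[pos].

    Nested blocks are handled by recursion, mirroring how IMPL calls itself.
    Comments are not recognized, so a brace inside a comment still counts.
    """
    if pos >= len(text) or text[pos] != '{':
        raise ValueError("expected '{' at position %d" % pos)
    pos += 1
    while pos < len(text):
        ch = text[pos]
        if ch == '{':
            pos = skip_block(text, pos)  # nested block: recurse, resume after it
        elif ch == '}':
            return pos + 1               # found the matching closing brace
        else:
            pos += 1                     # any other character is simply consumed
    raise ValueError("reached end of input, expecting '}'")
```

For well-balanced input this returns the position right after the block;
an unmatched opening brace makes it run to the end of the input and fail.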
The problem appears when there are braces inside comments:

someFunction = function(a,b) {
   // this is one brace too much: {
}

My lexer counts the opening brace in the comment and then searches for its
closing brace until the end of the file, which results in:
mismatched character '<EOF>' expecting '}'

What I want my lexer to do is first discard all comments, and only then
discard everything between curly braces. Are there any predicates that
could achieve this?
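
To spell out the behaviour I am after, here is a rough Python sketch. It is
not ANTLR and not a proposed fix for the grammar, only an illustration of
"comments first, braces second". The function names are made up, and I
assume that // comments end at a newline and /* comments end at */.

```python
def skip_comment(text, pos):
    """Return the index just past a comment starting at pos, or pos if none starts there."""
    if text.startswith('//', pos):
        end = text.find('\n', pos)
        return len(text) if end == -1 else end + 1
    if text.startswith('/*', pos):
        end = text.find('*/', pos + 2)
        if end == -1:
            raise ValueError("unterminated /* comment")
        return end + 2
    return pos


def skip_block_ignoring_comments(text, pos=0):
    """Return the index just past the '}' closing the '{' at text[pos].

    Comments are consumed before any brace is looked at, so braces
    inside comments never take part in the matching.
    """
    if pos >= len(text) or text[pos] != '{':
        raise ValueError("expected '{' at position %d" % pos)
    pos += 1
    while pos < len(text):
        after = skip_comment(text, pos)
        if after != pos:
            pos = after                  # a comment was skipped as a whole
            continue
        ch = text[pos]
        if ch == '{':
            pos = skip_block_ignoring_comments(text, pos)  # nested block
        elif ch == '}':
            return pos + 1               # matching closing brace
        else:
            pos += 1
    raise ValueError("reached end of input, expecting '}'")


# The input from this mail: the '{' inside the comment must not be counted.
sample = "someFunction = function(a,b) {\n   // this is one brace too much: {\n}\n"
end = skip_block_ignoring_comments(sample, sample.index('{'))
```

With the sample above, end points just past the closing brace on the last
line, which is the result I would like to get from IMPL.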

Thanks!