[antlr-interest] MismatchedTokenException and how to find errors in ANTLRWorks

Jim Idle jimi at temporal-wave.com
Tue Feb 12 10:35:55 PST 2008


I strongly believe that using literal strings in the grammar is fraught 
with difficulties for the beginner and advise against it. Take the 
string literals out of the grammar and create your own tokens, then you 
will see the ambiguities (or otherwise) between your tokens and so on. 
You are probably better off with something like this (guessing at the 
card sequence etc):

grammar cards;

dealingRiver
	: STARSTAR RIVER STARSTAR COLON LBRACKET c=ID RBRACKET NL
		// Only keep the NL unhidden if the grammar is ambiguous otherwise
		{
			String card = $c.text;
			boolean ok = card.length() == 2; // Check the length before calling charAt()

			if (ok)
			{
				char cardinality = card.charAt(0);
				char suit = card.charAt(1);
				ok = suit == 'd' || suit == 'h' || suit == 'c' || suit == 's'; // And so on
			}

			if (!ok)
			{
				System.out.println("Invalid card '" + card + "'");
			}
		}
	;

// Leadins and keywords
//
RIVER	: 'Dealing River' ;

// Punctuation
//
LBRACKET	:	'['	;
RBRACKET	:	']'	;
COLON		:	':'	;
STARSTAR	:	'**';

// ID
//
ID	: ('a'..'z'|'A'..'Z'|'_'|'0'..'9')+ 
	;

// NEWLINE
//
NL	: '\r'? '\n'	
	;

// Whitespace
//
WS	: (' ' | '\t')+
	{
		$channel = HIDDEN;
	}
	;

ANY	: .
	{
		$channel = HIDDEN;
		System.out.println("Unknown char '" + $text + "' on line " + $line);
	}
	;

Note that you should match anything that is vaguely OK in the lexer, 
then check it semantically later. This lets you report better errors 
than "Mismatched char...". You really don't want your lexer issuing any 
ANTLR-generated errors if you can help it.
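
As a rough sketch of what "check it semantically later" could look like 
outside the grammar, the card test can live in a small Java helper that 
the rule action calls with $c.text. The class name CardCheck, the 
validate method and the rank letters are all assumptions made up for 
this illustration; only the four suit letters come from the grammar 
above.

```java
// Hypothetical helper, not part of ANTLR: checks the text of an ID token
// that the lexer accepted loosely and the parser treats as a card.
final class CardCheck {

    // Assumed rank letters; adjust to whatever the real input uses.
    private static final String RANKS = "23456789TJQKA";
    // Suit letters taken from the grammar action above.
    private static final String SUITS = "dhcs";

    /** Returns null when the card is valid, otherwise a readable message. */
    static String validate(String card) {
        // Test the length first so that charAt() cannot throw.
        if (card == null || card.length() != 2) {
            return "Invalid card '" + card + "': expected a rank and a suit";
        }
        char rank = card.charAt(0);
        char suit = card.charAt(1);
        if (RANKS.indexOf(rank) < 0) {
            return "Invalid card '" + card + "': unknown rank '" + rank + "'";
        }
        if (SUITS.indexOf(suit) < 0) {
            return "Invalid card '" + card + "': unknown suit '" + suit + "'";
        }
        return null;
    }

    public static void main(String[] args) {
        // Print a verdict for a few sample inputs.
        for (String card : new String[] {"3d", "3x", "Zd", "10h"}) {
            String error = validate(card);
            System.out.println(card + " -> " + (error == null ? "ok" : error));
        }
    }
}
```

The action in dealingRiver would then shrink to a call to 
CardCheck.validate($c.text) and a println of any non-null result, so 
the message names the bad card rather than a token number.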

Jim

> -----Original Message-----
> From: Micke Hovmöller [mailto:micke.hovmoller at gmail.com]
> Sent: Tuesday, February 12, 2008 10:06 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] MismatchedTokenException and how to find
> errors in ANTLRWorks
> 
> I have a bit of a problem finding the errors in my grammar, and would
> appreciate some hints on how to go about finding them.
> 
> I have this rule in my grammar:
> dealingriver
> 	:	'** Dealing River ** :  [ ' CARD ']' NEWLINE ;
> 
> and this input:
> ** Dealing River ** :  [ 3d ]
> 
> (All this part of larger grammar and input.)
> 
> ANTLRWorks finds the dealingriver rule, splits it into '** Dealing
> River ** :  [ ' and 3d but gives this message as the third leaf in the
> node:
>   MismatchedTokenException(5!=31)
> 
> This leads to my questions:
> 1. Is there a more extensive list of error messages than the fairly
> short one in the ANTLR reference book?
> 
> 2. What is the easiest way to find which tokens are referred to by, in
> this case, 5 and 31? If I generate the grammar, I can look in the
> .token file, but that seems tedious. Can't I find that inside
> ANTLRWorks somewhere? (FWIW, this is what I found in the .token file:
> '** Dealing Turn ** :  ['=31
> ID=5
> 
> ID is defined as:
> ID: ('a'..'z'|'A'..'Z')('a'..'z'|'A'..'Z'|'_'|'0'..'9')+ ; )
> 
> 
> 3. I get this error a lot, it seems. How should I think/what should I
> look for in debugging?
> 
> (4. Is it obvious what is wrong here? I expect to have found the issue
> shortly, but I'm still interested in the general questions above.)
> 
> /Micke



