[antlr-interest] bad matching in grammar

Alex Shneyderman a.shneyderman at gmail.com
Sun Aug 5 12:25:25 PDT 2007


The problem is in the lexer (I find it the hardest part of ANTLR to
comprehend). A quick way to see this is to run your input through the
lexer alone and print every token it produces:

package org.chama.builder.model.antlr;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.RecognitionException;
import org.antlr.runtime.Token;

public class Tester {

	// Maps token type numbers to readable names. The numbers must match
	// the token type constants in the generated ModelLexer.
	private static Map<Integer, String> tokens = new HashMap<Integer, String>();
	
	static {
		tokens.put(8, "TYPE");
		tokens.put(6, "QIDStar");
		tokens.put(11, "INT");
		tokens.put(9, "ARG");
		tokens.put(4, "WS");
		tokens.put(10, "QID");
		tokens.put(5, "NEWLINE");
		tokens.put(7, "ID");
		tokens.put(18, ",");
		tokens.put(17, "(");
		tokens.put(19, ")");
		tokens.put(12, "package");
		tokens.put(13, "imports");
		tokens.put(16, "model");
		tokens.put(15, "}");
		tokens.put(14, "{");
	}
	
	public static void main(String[] args) throws RecognitionException, IOException {
		ModelLexer lexer = new ModelLexer (new ANTLRStringStream (
"package org.chama.test.models\n" +
"\n" +
"model Band {\n" +
"\n" +
"}\n" +
""
		));
		
		// Pull tokens straight from the lexer until EOF, bypassing the parser.
		Token token = lexer.nextToken();
		while(token.getType() != Token.EOF) {
			System.out.println("Token:'" + token.getText() + "' : "
					+ tokens.get(new Integer(token.getType())));
			token = lexer.nextToken();
		}
	}
	
}

You will see output something like this:
Token:'package org' : ARG
Token:'chama.test.models' : QID
Token:'
' : NEWLINE
Token:'
' : NEWLINE
Token:'model Band' : ARG
Token:'{' : {
Token:'
' : NEWLINE
Token:'
' : NEWLINE
Token:'}' : }
Token:'
' : NEWLINE

As you can see, the ARG token is messing things up: it swallows
'package org' and 'model Band' whole, so the parser never sees the
'package' and 'model' keywords it expects. Why don't you set the lexer
to ignore WS and NEWLINE (unless they are significant in your grammar)?
The grammar will be cleaner. Also, QID and QIDStar are in conflict:
QIDStar can match exactly what QID would, so you have to deal with this
somehow.
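
To make the QID/QIDStar overlap concrete, here is a small stand-alone
sketch. It does not use ANTLR at all, and since the grammar is not
shown in this message, the two regular expressions are only my guess
at what QID and QIDStar look like (a dotted name, and a dotted name
with an optional trailing ".*"). The class name OverlapDemo and the
method matchingRules are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class OverlapDemo {

    // Hypothetical stand-ins for the two lexer rules; the real grammar may differ.
    static final Map<String, Pattern> RULES = new LinkedHashMap<String, Pattern>();

    static {
        // Assumed QID: identifiers separated by dots.
        RULES.put("QID", Pattern.compile("[A-Za-z_]\\w*(\\.[A-Za-z_]\\w*)*"));
        // Assumed QIDStar: the same thing with an optional ".*" suffix.
        RULES.put("QIDStar", Pattern.compile("[A-Za-z_]\\w*(\\.[A-Za-z_]\\w*)*(\\.\\*)?"));
    }

    // Returns the names of all rules that match the entire input.
    static List<String> matchingRules(String input) {
        List<String> names = new ArrayList<String>();
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            if (rule.getValue().matcher(input).matches()) {
                names.add(rule.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // A plain qualified name is accepted by both rules: the conflict.
        System.out.println(matchingRules("org.chama.test.models")); // [QID, QIDStar]
        // Only the starred form is unique to QIDStar.
        System.out.println(matchingRules("org.chama.test.*"));      // [QIDStar]
    }
}
```

If your rules look anything like these, one possible way out is to make
the ".*" suffix mandatory in QIDStar, so that the two rules no longer
accept the same text. Whether that fits depends on the rest of your
grammar.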

HTH,
Alex.

On 8/5/07, Warner Onstine <warnero at gmail.com> wrote:
> Ok, I've done some reworking and am running into some issues with my
> grammar and I can't seem to figure out what I'm doing wrong. Here are
> the current errors I'm getting with the attached grammar and test
> file.
>
> line 1:11 no viable alternative at character '.'
> line 3:10 no viable alternative at character ' '
> line 1:0 mismatched input 'package org' expecting 'package'
> line 3:0 mismatched input 'model Band' expecting 'model'
>
> Thanks for all the help in understanding what in the world I'm doing wrong ;-).
>
> -warner

