[antlr-interest] Inconsistent Parse Results

Wed Jun 3 07:13:52 PDT 2009

When parsing the following data:
"2 CORP The Church of Jesus Christ of Latter-day Saints"

The parser chokes on each "Ch" (along with the space before it) and strips it out.

line 1:12 mismatched character 'h' expecting 'O'
line 1:28 mismatched character 'h' expecting 'O'
Data: Theurch of Jesusrist of Latter-day Saints

I am new to ANTLR. Is my grammar wrong, or is this a bug?

Grammar -

grammar Test1 ;

test1 : NUMBER ' CORP ' data {System.out.println("Data: " + $data.text);} ;

data : ~('\r' | '\n')* ;

NUMBER : '0'..'9'+ ;

OTHERCHAR : 
	'~' | 
	'!' | 
	'@' | 
	'#' | 
	'$' | 
	'%' | 
	'^' | 
	'&' | 
	'*' | 
	'(' | 
	')' | 
	'-' | 
	'_' | 
	'+' | 
	'=' | 
	'{' | 
	'}' | 
	'[' | 
	']' | 
	':' | 
	';' | 
	'<' | 
	'>' | 
	'?' | 
	',' | 
	'.' | 
	'/' | 
	' ' ;

WORD : ('a'..'z' | 'A'..'Z')+ ;

Test App -

import java.io.IOException;
import org.antlr.runtime.ANTLRFileStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.RecognitionException;

public class TestApp
{
	public static void main(String[] inArgList)
	{
		try
		{
			ANTLRFileStream theFileStream = new ANTLRFileStream("/home/glenmiller/tmp1/output/TestData2");
			Test1Lexer theLexer = new Test1Lexer(theFileStream);
			CommonTokenStream theTokenStream = new CommonTokenStream(theLexer);
			Test1Parser theParser = new Test1Parser(theTokenStream);
			theParser.test1();

		}
		catch (IOException inException)
		{
			inException.printStackTrace();
		}
		catch (RecognitionException inException)
		{
			inException.printStackTrace();
		}
	}
}
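
One possible reading of the two errors, offered as an assumption rather than a confirmed diagnosis: the quoted literal ' CORP ' in the test1 rule may become an implicit lexer token, so whenever the lexer sees a space followed by 'C' it commits to that token, fails on the next character, and drops what it has consumed so far. The toy Java lexer below is not ANTLR's generated code. The class LiteralCommitSketch and its methods are hypothetical names made up for illustration, and the sketch models only that commit-then-drop behavior on the sample line.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT ANTLR's generated lexer. It assumes (hypothetically) that
// a space followed by 'C' makes the lexer commit to the literal " CORP ".
class LiteralCommitSketch {
    static final String LITERAL = " CORP ";

    static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Splits the input into token texts, recording an error message for each
    // failed literal match and discarding the characters involved in it.
    static List<String> tokenize(String input, List<String> errors) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            boolean looksLikeLiteral = c == ' ' && i + 1 < input.length()
                    && input.charAt(i + 1) == LITERAL.charAt(1);
            if (looksLikeLiteral) {
                // Committed to " CORP " after two characters of lookahead.
                int matched = 0;
                while (matched < LITERAL.length() && i + matched < input.length()
                        && input.charAt(i + matched) == LITERAL.charAt(matched)) {
                    matched++;
                }
                if (matched == LITERAL.length()) {
                    tokens.add(LITERAL);
                    i += matched;
                } else if (i + matched >= input.length()) {
                    errors.add("line 1:" + (i + matched) + " unexpected end of input");
                    break;
                } else {
                    int bad = i + matched;
                    errors.add("line 1:" + bad + " mismatched character '"
                            + input.charAt(bad) + "' expecting '"
                            + LITERAL.charAt(matched) + "'");
                    i = bad + 1; // drop the partial match and the offending character
                }
            } else if (c >= '0' && c <= '9') {
                int start = i;
                while (i < input.length() && input.charAt(i) >= '0' && input.charAt(i) <= '9') i++;
                tokens.add(input.substring(start, i)); // NUMBER
            } else if (isAsciiLetter(c)) {
                int start = i;
                while (i < input.length() && isAsciiLetter(input.charAt(i))) i++;
                tokens.add(input.substring(start, i)); // WORD
            } else {
                tokens.add(String.valueOf(c)); // stands in for OTHERCHAR
                i++;
            }
        }
        return tokens;
    }

    // Joins every token after the leading NUMBER and " CORP " tokens,
    // mimicking what $data.text would contain in the test1 rule.
    static String dataAfterCorp(String input, List<String> errors) {
        List<String> tokens = tokenize(input, errors);
        StringBuilder data = new StringBuilder();
        for (int k = 2; k < tokens.size(); k++) {
            data.append(tokens.get(k));
        }
        return data.toString();
    }

    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();
        String data = dataAfterCorp(
                "2 CORP The Church of Jesus Christ of Latter-day Saints", errors);
        for (String error : errors) {
            System.out.println(error);
        }
        System.out.println("Data: " + data);
    }
}
```

Run on the sample line, this sketch reports mismatches at columns 12 and 28 and produces the same Data string shown in the output above, which is at least consistent with the literal-token reading.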