[antlr-interest] Inconsistent Parse Results

Wed Jun 3 08:13:58 PDT 2009

That is expected behavior. On seeing ' C', the lexer commits to the ' CORP '
literal token instead of matching OTHERCHAR (the space) followed by WORD, so
it fails when the next character is 'h' rather than 'O'. You need to
left-factor the lexer rules, or restructure the grammar to avoid the
conflict. Here is a suggested correction:

grammar Test ;

test1 : NUMBER CORP data {System.out.println("Data: " + $data.text);} ;

data : ~('\r' | '\n')* ;

NUMBER : '0'..'9'+ ;

CORP:	'CORP' ;

WORD : ('a'..'z' | 'A'..'Z')+ ;

WS	:	(' ' | '\t') {$channel=HIDDEN;}
	;

OTHERCHAR
	:	.
	;
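
To illustrate the behavior described above, here is a toy Java simulation of
the original grammar's lexer. It is only a sketch and not ANTLR's actual
algorithm: the class CommitSketch, the method tokenize, and the recovery step
(dropping the matched prefix plus the offending character) are assumptions
chosen to mirror the reported output. It commits to the ' CORP ' literal
whenever it sees ' C'.

```java
import java.util.ArrayList;
import java.util.List;

public class CommitSketch {
    // The literal from the original parser rule; it becomes a lexer token.
    static final String LITERAL = " CORP ";

    static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Hypothetical helper: returns token texts and records error messages.
    static List<String> tokenize(String input, List<String> errors) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (input.startsWith(" C", i)) {
                // Committed to ' CORP ' after seeing ' C'; match it char by char.
                int k = 0;
                while (k < LITERAL.length() && i + k < input.length()
                        && input.charAt(i + k) == LITERAL.charAt(k)) {
                    k++;
                }
                if (k == LITERAL.length()) {
                    tokens.add(LITERAL);
                    i += k;
                } else if (i + k < input.length()) {
                    errors.add("line 1:" + (i + k) + " mismatched character '"
                            + input.charAt(i + k) + "' expecting '"
                            + LITERAL.charAt(k) + "'");
                    // Assumed recovery: drop the matched prefix and the bad char.
                    i += k + 1;
                } else {
                    errors.add("line 1:" + (i + k) + " unexpected end of input");
                    i = input.length();
                }
            } else if (isDigit(c)) {
                int start = i;                       // NUMBER
                while (i < input.length() && isDigit(input.charAt(i))) i++;
                tokens.add(input.substring(start, i));
            } else if (isAsciiLetter(c)) {
                int start = i;                       // WORD
                while (i < input.length() && isAsciiLetter(input.charAt(i))) i++;
                tokens.add(input.substring(start, i));
            } else {
                tokens.add(String.valueOf(c));       // OTHERCHAR
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();
        List<String> tokens = tokenize(
                "2 CORP The Church of Jesus Christ of Latter-day Saints", errors);
        errors.forEach(System.out::println);
        // 'data' is everything after the NUMBER and ' CORP ' tokens.
        System.out.println("Data: " + String.join("", tokens.subList(2, tokens.size())));
    }
}
```

Run on the sample input, the sketch reports the same two mismatches at 1:12
and 1:28 and prints the same truncated Data line as the original report. With
CORP as its own token and whitespace handled by WS, no token starts with ' C',
so the premature commitment does not arise.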

Cheers, Indhu 

-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Glen Miller
Sent: Wednesday, June 03, 2009 7:44 PM
To: antlr-interest at antlr.org
Subject: [antlr-interest] Inconsistent Parse Results

When parsing the following data 
"2 CORP The Church of Jesus Christ of Latter-day Saints"

The parser is choking on " Ch" and stripping it out.

line 1:12 mismatched character 'h' expecting 'O'
line 1:28 mismatched character 'h' expecting 'O'
Data: Theurch of Jesusrist of Latter-day Saints

I am new to ANTLR; is my grammar wrong, or is it a bug?

Grammar -

grammar Test1 ;

test1 : NUMBER ' CORP ' data {System.out.println("Data: " + $data.text);} ;

data : ~('\r' | '\n')* ;

NUMBER : '0'..'9'+ ;

OTHERCHAR : 
	'~' | 
	'!' | 
	'@' | 
	'#' | 
	'$' | 
	'%' | 
	'^' | 
	'&' | 
	'*' | 
	'(' | 
	')' | 
	'-' | 
	'_' | 
	'+' | 
	'=' | 
	'{' | 
	'}' | 
	'[' | 
	']' | 
	':' | 
	';' | 
	'<' | 
	'>' | 
	'?' | 
	',' | 
	'.' | 
	'/' | 
	' ' ;

WORD : ('a'..'z' | 'A'..'Z')+ ;

Test App -

import java.io.IOException;
import org.antlr.runtime.ANTLRFileStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.RecognitionException;

public class TestApp
{
	public static void main(String[] inArgList)
	{
		try
		{
			ANTLRFileStream theFileStream = new ANTLRFileStream("/home/glenmiller/tmp1/output/TestData2");
			Test1Lexer theLexer = new Test1Lexer(theFileStream);
			CommonTokenStream theTokenStream = new CommonTokenStream(theLexer);
			Test1Parser theParser = new Test1Parser(theTokenStream);
			theParser.test1();

		}
		catch (IOException inException)
		{
			inException.printStackTrace();
		}
		catch (RecognitionException inException)
		{
			inException.printStackTrace();
		}
	}
}
