[antlr-interest] Having problems with whitespace...

Cameron Esfahani dirty at apple.com
Mon Jun 25 21:06:04 PDT 2007


I'm trying to implement a grammar based on JSON, with a few additions
that JSON lacks.  I started this before Richard Clark added his JSON
grammar to the wiki.  Since running into the problems described below,
I have integrated some of his ideas into my grammar.

One difference in my grammar is that whitespace is significant.  Not
necessarily the amount, but where it can be placed.  For example,
with most of the grammars I've seen in the book or on the wiki, an
input of "100 1" and "1001" would be the same thing, since whitespace
is usually shuttled off to channel HIDDEN.  I want "100 1" to be
reported as an error rather than accepted like "1001".  Another
difference is that I added support for C-style block comments,
/* ... */.

So, in my thinking, I would NOT push whitespace to the HIDDEN  
channel.  I would indicate, in my grammar, where whitespace would be  
appropriate.
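
As a rough illustration of what hiding whitespace does to the parser's
view, here is a small Python toy.  It is not ANTLR and not part of my
grammar; NUM, TOKEN_RULES and visible_tokens are made-up names used only
for this sketch.

```python
import re

# Toy token rules for illustration only; NUM is a hypothetical token type
# that does not appear in the grammar in this post.
TOKEN_RULES = [("NUM", r"[0-9]+"), ("WS", r"[ \t\r\n\f]+")]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_RULES))


def visible_tokens(text, hide_ws):
    """Return the (type, text) pairs a parser would get to see.

    hide_ws=True imitates sending WS to a hidden channel: the token is
    still lexed, but it is filtered out before the parser looks at it.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        pos = m.end()
        if m.lastgroup == "WS" and hide_ws:
            continue  # off-channel: the parser never sees this token
        tokens.append((m.lastgroup, m.group()))
    return tokens


print(visible_tokens("100 1", hide_ws=True))
print(visible_tokens("100 1", hide_ws=False))
```

With hide_ws=True the space leaves no trace in the list, so no parser
rule can object to it.  With hide_ws=False a WS token sits between the
two numbers, and every rule has to say where WS is allowed.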

But now I'm having some trouble because, for some input, I'm getting
"no viable alternative" errors, and for the life of me I can't figure
out why.  I'm hoping someone here might have some ideas.

The following input to my grammar works within the ANTLRWorks debugger:

tree = {
	"fletch2" : "not so good"
}

But, this one doesn't:

tree = {
	"fletch2" : "not so good"
/* comment */
}

Here is a stripped down version of my grammar:

grammar testT;

options {
	output = AST;
}

tokens {
	T_ASSIGN;
	T_STR;
	T_OBJ;
	T_DEF;
	T_SYMREF;
}

WS
	: ( ' ' | '\n' | '\r' | '\t' | '\u000C' )+
	;

COMMENT
	:   '/*' ( options { greedy = false; } : . )* '*/' { $channel = HIDDEN; }
	;

fragment LETTER
	:	'a'..'z'
	|	'A'..'Z'
	;

STRING
	:	'"' ( EscapeSequence | ~( '\u0000'..'\u001f' | '\\' | '\"' ) )* '"'
	;

fragment EscapeSequence
	:	'\\' ( 'b' | 't' | 'n' | 'f' | 'r' | '\"' | '\'' | '\\' )
	;

IDENTIFIER
	: LETTER ( LETTER | '-' | '_' | '0'..'9' )*
	;

r
	:	statement* EOF
	;

statement
	:	assignment -> assignment
	|	WS ->
	;

assignment
	:	IDENTIFIER WS? '=' WS? value -> ^( T_ASSIGN IDENTIFIER value )
	;

object
	:	'{' WS? members WS? '}' -> members
	;

members
	:	( pair WS? ',' WS? ) => pair ( WS? ',' WS? pair )+
	|	pair
	;

pair
	:	STRING WS? ':' WS? value -> ^( T_DEF STRING value )
	;

value
	:	STRING -> ^( T_STR STRING )
	|	object -> ^( T_OBJ object )
	;
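
One way to check what the parser is handed for the two inputs is to
list the token types that stay on the default channel.  The following
is a rough Python imitation of the lexer rules above, written by hand
with regular expressions.  It is not code generated by ANTLR, and
PUNCT, HIDDEN_TYPES and default_channel_types are made-up names, so
treat it as a sketch under those assumptions.

```python
import re

# Regex approximations of the lexer rules in the grammar above.
RULES = [
    ("COMMENT", r"/\*.*?\*/"),  # non-greedy, like greedy = false
    ("WS", r"[ \n\r\t\f]+"),
    ("STRING", r'"(?:\\[btnfr"\'\\]|[^\x00-\x1f\\"])*"'),
    ("IDENTIFIER", r"[A-Za-z][A-Za-z0-9_-]*"),
    ("PUNCT", r"[={}:,]"),  # stands in for the literal tokens '=', '{', ...
]
LEXER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in RULES), re.DOTALL)

# Only COMMENT sets $channel = HIDDEN in the grammar; WS does not.
HIDDEN_TYPES = {"COMMENT"}


def default_channel_types(text):
    """Return the token types (or literal text for punctuation) that are
    left on the default channel, in input order."""
    seen = []
    pos = 0
    while pos < len(text):
        m = LEXER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        pos = m.end()
        if m.lastgroup in HIDDEN_TYPES:
            continue  # filtered out before the parser sees it
        seen.append(m.group() if m.lastgroup == "PUNCT" else m.lastgroup)
    return seen


GOOD = 'tree = {\n\t"fletch2" : "not so good"\n}'
BAD = 'tree = {\n\t"fletch2" : "not so good"\n/* comment */\n}'

print(default_channel_types(GOOD))
print(default_channel_types(BAD))
```

In this imitation the second input ends with STRING, WS, WS, '}': the
hidden comment splits the surrounding whitespace into two separate WS
tokens, and a single WS? can only consume one of them.  Whether the
generated ANTLR lexer produces the same sequence is worth confirming
against the token stream shown in ANTLRWorks.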


Cameron Esfahani
dirty at apple.com

"Even paranoids have enemies."

Henry Kissinger


