[antlr-interest] Tokenizing question

Andrew Lentvorski bsder at allcaps.org
Tue Jul 24 19:03:06 PDT 2007


I'm trying to get the following input:
( a1 = ( a2 = v2 ) )

To tokenize like this:

'(' MWS TT="a1" MWS '=' MWS '(' MWS TT="a2" MWS '=' MWS TT="v2" MWS ')' ')'

So, when I use this grammar:

grammar test;
tuple	:	'(' MWS? TT MWS? '=' MWS? (tuple | TT) MWS? ')';

MWS	:	WS+;
WS	:	(' '|'\t'|'\n'|'\r');
TT	:	~('(' | '=' | ')')+;

why does it tokenize like this instead:

'(' TT=" a1 " '=' MWS '(' TT=" a2 " '=' TT=" v2 " ')' ')'

Just to forestall the question: no, skipping or disabling whitespace is
*NOT* an option.
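
A possible explanation, sketched under assumptions: TT is defined as
"anything except '(', '=' and ')'", so it also matches spaces, tabs and
newlines. Wherever whitespace is followed by ordinary text, TT can consume
more input than MWS. The Python sketch below is a simplified longest-match
model with ties going to the rule listed first; it is not ANTLR's actual
lexer algorithm, and the names RULES, STRICT_RULES and tokenize are
hypothetical. It reproduces the TT tokens that swallow their surrounding
whitespace. Unlike the output reported above, it also emits an MWS between
the two closing parentheses, so it is an illustration only.

```python
import re

# Hypothetical stand-ins for the grammar's token rules, in grammar order.
# The literals '(' '=' ')' come from the parser rule `tuple`.
RULES = [
    ("'('", re.compile(r"\(")),
    ("'='", re.compile(r"=")),
    ("')'", re.compile(r"\)")),
    ("MWS", re.compile(r"[ \t\n\r]+")),
    ("TT", re.compile(r"[^(=)]+")),  # like the grammar: whitespace is allowed
]

# Assumption, not from the original grammar: TT also excludes whitespace.
STRICT_RULES = RULES[:4] + [("TT", re.compile(r"[^(=) \t\n\r]+"))]


def tokenize(text, rules=RULES):
    """Longest match wins; on a tie, the rule listed first wins."""
    pos, tokens = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            match = pattern.match(text, pos)
            if match and match.end() - pos > best_len:
                best_name, best_len = name, match.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


if __name__ == "__main__":
    for label, rules in (("grammar as written", RULES),
                         ("whitespace excluded from TT", STRICT_RULES)):
        print(label)
        for name, lexeme in tokenize("( a1 = ( a2 = v2 ) )", rules):
            print(f"  {name:4} {lexeme!r}")
```

In this model, the only places MWS survives with the original TT are where
whitespace sits directly between two punctuation characters. With
whitespace excluded from TT (an assumption about one possible adjustment),
every run of whitespace comes out as its own MWS token.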

Sorry for the stupid newbie questions, but I'm trying to use ANTLR for
data format translation and the idioms for language translation don't
seem to work so well.

-a

