[antlr-interest] Tokenizing question

Jonathan Thomas jonathan.thomas at ca.com
Tue Jul 24 19:17:14 PDT 2007


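Your TT rule also matches whitespace: it accepts any character other
than '(', '=' and ')', which includes spaces, tabs and newlines, so the
whitespace around a1, a2 and v2 is absorbed into the TT tokens.
A quick fix: make WS a fragment rule and add | WS to the negated set in
your TT rule, so that TT no longer matches whitespace.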
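
A minimal, untested sketch of that fix, assuming ANTLR 3 syntax. It
spells out the whitespace characters inside the negated set instead of
referencing WS there; that is my assumption about the safer form, not
something I have checked against your ANTLR version.

```antlr
grammar test;

// Parser rule unchanged from the original post.
tuple   :   '(' MWS? TT MWS? '=' MWS? (tuple | TT) MWS? ')';

// One or more whitespace characters form a single MWS token.
MWS     :   WS+;

// fragment: WS is only a helper for MWS and never becomes a token itself.
fragment
WS      :   (' '|'\t'|'\n'|'\r');

// TT now excludes whitespace as well as '(', '=' and ')'.
TT      :   ~('(' | '=' | ')' | ' ' | '\t' | '\n' | '\r')+;
```

With TT unable to consume whitespace, the spaces in ( a1 = ( a2 = v2 ) )
should come out as separate MWS tokens, and a1, a2 and v2 as TT tokens
without the surrounding padding.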


Andrew Lentvorski said the following on 25/07/2007 12:03 PM:
> I'm trying to get the following input:
> ( a1 = ( a2 = v2 ) )
>
> To tokenize like this:
>
> '(' MWS TT="a1" MWS '=' MWS '(' MWS TT="a2" MWS '=' MWS TT="v2" MWS 
> ')' ')'
>
> So, when I use this grammar:
>
> grammar test;
> tuple    :    '(' MWS? TT MWS? '=' MWS? (tuple | TT) MWS? ')';
>
> MWS    :    WS+;
> WS    :    (' '|'\t'|'\n'|'\r');
> TT    :    ~('(' | '=' | ')')+;
>
> Why is it tokenizing like this:
>
> '(' TT=" a1 " '=' MWS '(' TT=" a2 " '=' TT=" v2 " ')' ')'
>
> Just to forestall the question, no, you may *NOT* skip or disable
> whitespace.
>
> Sorry for the stupid newbie questions, but I'm trying to use ANTLR for
> data format translation and the idioms for language translation don't
> seem to work so well.
>
> -a
>

