[antlr-interest] Rookie problem

Gavin Lambert antlr at mirality.co.nz
Fri Apr 4 00:58:11 PDT 2008


At 04:00 4/04/2008, Marko Simovic wrote:
>The grammar at the end of this e-mail works fine for variable 
>names without spaces. If I change the 'name' definition to the 
>following:
>
>name: String (' ' String)*;
>
>then the 'if' statement can no longer be recognized. What am I 
>doing wrong?

If that was the complete grammar, then it's missing a whitespace 
rule.  If a character (such as a space) isn't matched by any 
lexer rule then ANTLR will by default report an error, drop the 
character and move on, which is why your 'if' rule appeared to be 
"working" originally.

As soon as you add the ' ' literal to the 'name' rule, space 
becomes a valid input character and the lexer starts generating 
space tokens.  For the input "if foo then bar" you'll now get 
'if',' ','foo',' ','then',' ','bar' instead of 
'if',(error),'foo',(error),'then',(error),'bar'.  And since your 
'selection' rule doesn't allow for those space tokens, it can no 
longer match.
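
To make that concrete, here is roughly what the grammar would have 
to look like if the spaces stayed visible to the parser.  This is 
only an illustration, not a recommendation: the SPACE name is made 
up for the example (ANTLR picks its own name for the implicit 
token), and I haven't run this fragment.

```
// Hypothetical explicit form of the token that the ' ' literal creates.
SPACE : ' ';

name : String (SPACE String)*;

// Every other rule would now have to spell out each space as well:
selection : 'if' SPACE condition SPACE 'then' SPACE condition;
```

Having to thread the space token through every rule like this is 
the reason whitespace is usually kept away from the parser 
altogether.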

The normal solution is to add a WS rule and put it on the hidden 
channel; however, if you do that then no WS tokens will be visible 
to the parser, so your 'name' rule won't be able to match spaces 
anyway.  But you shouldn't need to specify the space explicitly, 
assuming that any amount of whitespace is permitted between the 
words in your multi-word names: the simple fact that separate 
String tokens were generated shows that something broke the words 
up.

This is just off the cuff (I haven't tested it), but something 
like this ought to work:

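grammar test;

// A single word: one or more letters.
String : ('a'..'z' | 'A'..'Z')+;

ConditionOperator : '<' | '>';

// Whitespace (tabs included) goes to the hidden channel, so the
// parser never sees it.
WS : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; };

// A multi-word name is one or more consecutive String tokens.
name : String+;

condition : name (ConditionOperator name)*;

selection : 'if' condition 'then' condition;

statement : selection;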


