[antlr-interest] White space

John B. Brodie jbb at acm.org
Wed Sep 1 13:19:32 PDT 2010


Greetings!

On Wed, 2010-09-01 at 11:30 -0700, Alex Rodriguez wrote:
> Greetings,
> 
> Given a very simple grammar for a language that only has an 'if'
> statement, I would like to be able to parse white space within literal
> values. So far, this works (case 1):
> 
> if(value=='white space'){doThis('arg')}
> 
> But this doesn't work (case 2):
> 
> if (value == 'white space') { doThis('arg') }

Because you have permitted blanks in an ID, the lexer matches the string
"if " above (the keyword plus its trailing blank) as an ID under your
rules and *NOT* as the keyword 'if' followed by a blank.
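
To make that concrete, here is a minimal sketch in plain Java. It is not
ANTLR and not generated code: BlankInIdDemo and its tokenize helper are
hypothetical names, and the toy simply takes the longest match at each
position (the first rule listed wins a tie). ANTLR's generated lexer
decides differently internally, so treat this only as an illustration of
the effect described here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy longest-match tokenizer (NOT ANTLR). It shows why "if " turns into
// an ID once the ID rule is allowed to contain blanks.
class BlankInIdDemo {

    // Hypothetical helper: tokenize `input`; `blankInId` toggles whether
    // the ID character set includes ' ' as in the original grammar.
    static List<String> tokenize(String input, boolean blankInId) {
        String idChars = blankInId ? "a-zA-Z0-9@:_ +" : "a-zA-Z0-9@:_+";
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("IF", Pattern.compile("if"));
        rules.put("ID", Pattern.compile("[" + idChars + "]+"));
        rules.put("LPAREN", Pattern.compile("\\("));
        rules.put("WS", Pattern.compile("[ \\t\\r\\n]+"));

        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Strict '>' keeps the earlier rule when two matches tie.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no rule matches at " + pos);
            }
            if (!bestName.equals("WS")) { // whitespace is dropped, like a hidden channel
                out.add(bestName + ":'" + input.substring(pos, bestEnd) + "'");
            }
            pos = bestEnd;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("if(", true));   // [IF:'if', LPAREN:'(']
        System.out.println(tokenize("if (", true));  // [ID:'if ', LPAREN:'(']
        System.out.println(tokenize("if (", false)); // [IF:'if', LPAREN:'(']
    }
}
```

With the blank allowed in ID, the second call yields an ID where the
parser expects the keyword, which is the kind of mismatch that surfaces
as a MismatchedTokenException.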

> 
> Note that case 2 is spaced for readability.
> 
> Debugging case 2 in ANTLRWorks produces a MismatchedTokenException.
> 
> What is the best way to handle both cases? Here is the grammar:

Move your literalValue rule into the lexer and take the blank out of
ID.

See the attached grammar for the way I would change yours to solve this
issue.
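
As a rough sketch of what that change buys you, the plain-Java toy below
(again not ANTLR; StringTokenDemo and tokenTypes are hypothetical names)
uses regular expressions that approximate the lexer rules of the attached
grammar: the quoted literal is one STRING token, ID has no blank, and WS
is discarded. Under those assumptions both inputs from the question
produce the same sequence of token types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy longest-match tokenizer (NOT ANTLR) approximating the revised lexer rules.
class StringTokenDemo {

    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("IF_KW", Pattern.compile("if"));
        RULES.put("ID", Pattern.compile("[a-zA-Z][a-zA-Z0-9@:_+]*")); // no blank
        RULES.put("LPAREN", Pattern.compile("\\("));
        RULES.put("RPAREN", Pattern.compile("\\)"));
        RULES.put("LBRAK", Pattern.compile("\\{"));
        RULES.put("RBRAK", Pattern.compile("\\}"));
        RULES.put("EQ", Pattern.compile("=="));
        RULES.put("STRING", Pattern.compile("'[^']*'")); // blanks allowed only here
        RULES.put("WS", Pattern.compile("[ \\t\\r\\n]+"));
    }

    // Hypothetical helper: return the token type names for `input`, skipping WS.
    static List<String> tokenTypes(String input) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Longest match wins; on a tie the earlier rule is kept.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no rule matches at " + pos);
            }
            if (!bestName.equals("WS")) {
                out.add(bestName);
            }
            pos = bestEnd;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tight = tokenTypes("if(value=='white space'){doThis('arg')}");
        List<String> spaced = tokenTypes("if (value == 'white space') { doThis('arg') }");
        System.out.println(tight);
        System.out.println(spaced);
        System.out.println("same token types: " + tight.equals(spaced));
    }
}
```

Because the blank now lives only inside STRING, spacing between tokens no
longer changes what the parser sees.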

hope this helps...
   -jbb


> 
> grammar Lang;
> 
> statement
> 	:
> 		'if' LPAREN ID EQ literalValue RPAREN '{' action '}'
> 	;
> 
> literalValue
> 	:
> 		'\'' ID '\''
> 	;
> 	
> action
> 	:
> 		ID LPAREN literalValue RPAREN
> 	;
> 		
> ID
> 	:
> 		('a'..'z' | 'A'..'Z' | '0'..'9' | '@' | ':' | '_' | ' ' | '+')+
> 	;
> 
> LPAREN
> 	:
> 		'('
> 	;
> 	
> RPAREN
> 	:
> 		')'
> 	;
> 	
> EQ
> 	:
> 		'=='
> 	;
> 	
> WS
> 	:
> 		(' ' |'\t' |'\r' |'\n' )+ { $channel=HIDDEN; }
> 	;

-------------- next part --------------
grammar Test;

options {
   output = AST;
   ASTLabelType = CommonTree;
}

@members {
   private static final String [] x = new String[] {
      "if(value=='white space'){doThis('arg')}",
      "if (value == 'white space') { doThis('arg') }"
   };

   public static void main(String [] args) {
      for( int i = 0; i < x.length; ++i ) {
         try {
            System.out.println("about to parse:`"+x[i]+"`");
            TestLexer lexer = new TestLexer(new ANTLRStringStream(x[i]));
            CommonTokenStream tokens = new CommonTokenStream(lexer);
            System.out.println("tokens:"+tokens.toString());

            TestParser parser = new TestParser(tokens);
            TestParser.start_return p_result = parser.start();

            CommonTree ast = p_result.tree;
            if( ast == null ) {
               System.out.println("resultant tree: is NULL");
            } else {
               System.out.println("resultant tree: " + ast.toStringTree());
            }
            System.out.println();
         } catch(Exception e) {
            e.printStackTrace();
         }
      }
   }
}

start : statement EOF!;

statement : IF_KW LPAREN ID EQ literalValue RPAREN LBRAK action RBRAK ;

literalValue : STRING ;

action : ID LPAREN literalValue RPAREN ;
                
IF_KW : 'if'  ; 

fragment LETTER : 'a'..'z' | 'A'..'Z' ;
fragment DIGIT : '0'..'9' ;
ID : LETTER (LETTER | DIGIT | '@' | ':' | '_' | '+')* ;

LPAREN : '(' ;
RPAREN : ')' ;

LBRAK : '{' ;
RBRAK : '}' ;
        
EQ : '==' ;
        
STRING : '\'' ( ~'\'' )* '\'' ;
        
WS : (' ' |'\t' |'\r' |'\n' )+ { $channel=HIDDEN; } ;

