[antlr-interest] Trouble parsing a language where '{' has too many meanings
    Richard Clark 
    rdclark at gmail.com
       
    Fri Jul  6 15:42:38 PDT 2007
    
    
  
Try changing the definition of ML_TEXT so that the closing delimiter is
a single string literal:
ML_TEXT
   :    '{'
       ( options {greedy=false;} : . )*
       '}.'
   ;
The lexer doesn't backtrack, so under the old definition it
would see {...} and match it before seeing the ".". Automatic error
recovery would then throw away the dot as unrecognized (and report an
error).
Pulling the closing brace and dot together into one literal, '}.', means
they'll be recognized as a unit.
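As a rough illustration of what the revised rule accepts, here is a small Python sketch that uses a regular expression as an analogue of ML_TEXT. The names ML_TEXT_RE and match_ml_text are made up for this example. Python's regex engine does backtrack, so this only shows which text the rule matches, not how ANTLR's lexer makes its decision.

```python
import re

# Hypothetical regex analogue of the revised ML_TEXT rule:
# '{', then as few characters as possible, then the two-character
# terminator '}.' treated as a single unit.
ML_TEXT_RE = re.compile(r"\{.*?\}\.", re.DOTALL)

def match_ml_text(text):
    """Return the ML_TEXT-like prefix of text, or None if there is none."""
    m = ML_TEXT_RE.match(text)
    return m.group(0) if m else None

if __name__ == "__main__":
    # A lone '}' inside the text does not end the match; only '}.' does.
    print(match_ml_text("{ a comment with a } brace inside }. trailing"))
    # A plain {...} with no trailing dot is not ML_TEXT at all.
    print(match_ml_text("{abc}"))
```

Because the terminator is the two-character string '}.', a bare '}' in the middle of the text is just another character consumed by the non-greedy loop.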
Run the following in ANTLRWorks' debugger to see it working:
grammar multiBlock;
top	: (block | comment)* ;
comment	: ML_TEXT;
block	: BLOCK ;
ML_TEXT
   :    '{'
       ( options {greedy=false;} : . )*
       '}.'
   ;
BLOCK	: '{' ('A'..'Z'|'a'..'z'|' ')* '}' ;
 ...Richard
P.S. Remember that when more than one lexer rule can match the same
input, the rule listed first wins.
    
    