[antlr-interest] Trouble parsing a language where '{' has too many meanings
Richard Clark
rdclark at gmail.com
Fri Jul 6 15:42:38 PDT 2007
Try changing the definition for ML_TEXT to put the closing element in
a single string.
ML_TEXT
: '{'
( options {greedy=false;} : . )*
'}.'
;
The lexer doesn't backtrack, so under the old definition it would
see {...} and match it before ever looking at the ".". Automatic
error recovery would then throw away the dot as unrecognized (and
report an error).
Pulling the closing bracket and dot together '}.' means they'll be
recognized as a unit.
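
For anyone who wants to see the difference without firing up ANTLR, here is a rough Python analogy. It is only a sketch: Python's re module is not the ANTLR lexer (it backtracks, for one thing), and the names ML_TEXT_SPLIT and first_token are made up for this illustration. It just shows what a non-greedy loop consumes when the closing delimiter is '}' compared with '}.'.

```python
import re

# Analogue of the ML_TEXT rule in this post: the non-greedy loop ends
# only when the two-character closer "}." is seen.
ML_TEXT = re.compile(r"\{.*?\}\.", re.DOTALL)

# Hypothetical "split" variant for comparison: the loop ends at the
# first "}", so a following "." is not part of the token.
ML_TEXT_SPLIT = re.compile(r"\{.*?\}", re.DOTALL)


def first_token(pattern, text):
    """Return (token, rest) for a match at the start of text, else (None, text)."""
    m = pattern.match(text)
    if m is None:
        return None, text
    return m.group(0), text[m.end():]


sample = "{some comment}."

# Closer is one unit: the whole comment, dot included, is one token.
print(first_token(ML_TEXT, sample))

# Closer is split: the token stops at "}", leaving a stray "." behind.
print(first_token(ML_TEXT_SPLIT, sample))

# A plain block without the trailing dot is not matched as ML_TEXT here.
print(first_token(ML_TEXT, "{Block}"))
```

The stray "." left over in the second case is the character that error recovery ends up discarding.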
Run the following in ANTLRWorks' debugger to see it working:
grammar multiBlock;
top : (block | comment)* ;
comment : ML_TEXT;
block : BLOCK ;
ML_TEXT
: '{'
( options {greedy=false;} : . )*
'}.'
;
BLOCK : '{' ('A'..'Z'|'a'..'z'|' ')* '}' ;
...Richard
P.S. Remember that when more than one lexer rule can match, the one
listed first in the grammar wins.