[antlr-interest] Lexer escape conversions

Curtis Clauson NOSPAM at TheSnakePitDev.com
Fri Feb 23 20:02:55 PST 2007


Why is it I only come up with an answer after I send the question? <chuckle>

ANTLR reports no errors, yet the generated lexer does not produce
return values for any lexer rule. Outside of lexer instance variables,
there seems to be no way to pass information from a sub-rule to a
parent rule.

This means I either have to fold the Escape fragment into the Word
rule, which is clumsy and ugly, or split the escape sequences into
individual fragments that the Word rule can pair with text-appending
actions.

Both work, and the latter seems less clumsy, but I would really love to
know whether this ANTLR behavior is a bug, since it keeps me from
writing cleaner and clearer source.

This is the version that works:
----------
/*
  * Parser Rules
  */

content
@init {System.out.println("Words:");}
     :   (
             Word  {System.out.println("  " + $Word.text);}
         )*
     ;


/*
  * Lexer Rules
  */

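// Gather sequences of letters and escapes between whitespace
// into words. Each escape appends its replacement character to the
// token text instead of the characters matched in the input.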
Word
@init {text = "";}
     :   (
             Letters  {text += $Letters.text;}
         |   Escape1  {text += '@';}
         |   Escape2  {text += '#';}
         |   Escape3  {text += '$';}
         )+
     ;

Whitespace
     :   (' ' | '\t' | '\u000B' | '\f' | '\r' | '\n')+
         {$channel = HIDDEN;}
     ;

fragment Escape1    : EscapeFlag 'a';
fragment Escape2    : EscapeFlag 'b';
fragment Escape3    : EscapeFlag 'c';
fragment EscapeFlag : '@';
fragment Letters    : Letter+;
fragment Letter     : 'a'..'z' | 'A'..'Z';
----------
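
For anyone who wants to check the expected words without running ANTLR, the conversion the Word rule performs can be sketched in plain Java. This is only an illustration of the intended behavior, not generated lexer code: the class EscapeWords and its methods are hypothetical names, and throwing an exception on unrecognized input is an assumption about error handling.

```java
import java.util.ArrayList;
import java.util.List;

public class EscapeWords {

    // Hypothetical helper: splits the input on whitespace and converts
    // escapes, mirroring the Word, Whitespace, and Escape rules.
    static List<String> words(String input) {
        List<String> result = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (isWhitespace(c)) {
                // Whitespace ends the current word, if there is one.
                if (text.length() > 0) {
                    result.add(text.toString());
                    text.setLength(0);
                }
                i++;
            } else if (isLetter(c)) {
                text.append(c);
                i++;
            } else if (c == '@' && i + 1 < input.length()) {
                // EscapeFlag followed by a code letter.
                text.append(unescape(input.charAt(i + 1)));
                i += 2;
            } else {
                // Assumption: anything else is rejected outright.
                throw new IllegalArgumentException(
                        "Unexpected character at index " + i);
            }
        }
        if (text.length() > 0) {
            result.add(text.toString());
        }
        return result;
    }

    // Same mapping as the Escape1..Escape3 actions in the Word rule.
    static char unescape(char code) {
        switch (code) {
            case 'a': return '@';
            case 'b': return '#';
            case 'c': return '$';
            default:
                throw new IllegalArgumentException("Unknown escape: @" + code);
        }
    }

    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    static boolean isWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\u000B'
                || c == '\f' || c == '\r' || c == '\n';
    }

    public static void main(String[] args) {
        System.out.println("Words:");
        for (String word : words("ab@acd @b@c x")) {
            System.out.println("  " + word);
        }
    }
}
```

For the input "ab@acd @b@c x" this prints the words ab@cd, #$ and x, which is what the content rule should print for the same input.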

-- 
--------------------------------------------------------
"Any sufficiently over-complicated magic is indistinguishable from 
technology." -- Llelan D.


