[antlr-interest] lexer rule attributes
Terence Parr
parrt at cs.usfca.edu
Wed Nov 1 14:59:41 PST 2006
Hi,
Lexer rules always have an implicit return value of type Token that
is sent back to the parser. However, lexer rules that refer to other
lexer rules may access the portions of the overall token matched by
those rules, which are returned as implicit tokens. The following rule
illustrates a composite lexer rule that reuses another token definition.
PREPROC_CMD
    : '#' ID {System.out.println("cmd="+$ID.text);}
    ;

ID  : ('a'..'z'|'A'..'Z')+
    ;
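To make the idea concrete, here is a rough hand-written Java sketch of what the two rules above amount to. It is not the code ANTLR generates; the class CompositeRuleSketch, the Tok type, the matchXXX method names and the log buffer are all made up for illustration. The point is only that the nested rule hands back its own token, so the action in the outer rule can read that token's text.

```java
class CompositeRuleSketch {
    /** Hypothetical stand-in for a token: the matched text and its start index. */
    static final class Tok {
        final String text;
        final int start;
        Tok(String text, int start) { this.text = text; this.start = start; }
    }

    final String input;
    int p = 0;                                      // index of the next char to consume
    final StringBuilder log = new StringBuilder();  // collects what the action would print

    CompositeRuleSketch(String input) { this.input = input; }

    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    /** ID : ('a'..'z'|'A'..'Z')+ ;  returns the matched portion as its own token. */
    Tok matchID() {
        int start = p;
        while (p < input.length() && isLetter(input.charAt(p))) p++;
        if (p == start) throw new IllegalStateException("expected a letter at index " + start);
        return new Tok(input.substring(start, p), start);
    }

    /** PREPROC_CMD : '#' ID {action} ; */
    Tok matchPREPROC_CMD() {
        int start = p;
        if (p >= input.length() || input.charAt(p) != '#') {
            throw new IllegalStateException("expected '#' at index " + p);
        }
        p++;
        Tok id = matchID();                  // implicit token from the nested rule
        log.append("cmd=").append(id.text);  // plays the role of $ID.text in the action
        return new Tok(input.substring(start, p), start);
    }

    public static void main(String[] args) {
        CompositeRuleSketch lexer = new CompositeRuleSketch("#include");
        Tok t = lexer.matchPREPROC_CMD();
        System.out.println(lexer.log);  // cmd=include
        System.out.println(t.text);     // #include
    }
}
```

The outer rule still returns a single token covering '#' plus the identifier; the ID token is only visible inside the rule's action.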
Lexer (non-fragment) rules may also contain actions that access
attributes of the surrounding rule itself. Code generated for rules
begins with a preamble that sets the predefined attributes:
ruleNestingLevel++;
int type = <standin>ruleTokenType</standin>;
int start = getCharIndex();
int line = getLine();
int charPosition = getCharPositionInLine();
int channel = Token.DEFAULT_CHANNEL;
But do we want to say $text, $line, etc. for consistency? It
means adding a bunch more templates to handle these predefined
attributes. $line is translated to line, and so on; $text, however,
needs to be getText(). Hmm... should lexer rules be treated differently?
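For what it's worth, the translation itself is small. Below is a minimal Java sketch of the mapping, assuming a plain regex rewrite rather than the template machinery ANTLR actually uses. Only $line -> line and $text -> getText() come from the discussion above; the names $type, $start and $channel are assumptions that simply mirror the preamble locals, and the class LexerAttrSketch and its translate method are hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LexerAttrSketch {
    // Attribute name -> Java expression inside the generated rule method.
    // Only "line" and "text" are taken from the discussion; the rest are assumed.
    static final Map<String, String> PREDEFINED = Map.of(
        "line", "line",
        "text", "getText()",
        "type", "type",
        "start", "start",
        "channel", "channel");

    // A '$' followed by an identifier.
    static final Pattern ATTR = Pattern.compile("\\$([A-Za-z_]\\w*)");

    /** Rewrites known $attr references; anything else (e.g. $ID.text) is left as-is. */
    static String translate(String action) {
        Matcher m = ATTR.matcher(action);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String repl = PREDEFINED.getOrDefault(m.group(1), m.group());
            // quoteReplacement keeps a literal '$' from being read as a group reference
            m.appendReplacement(out, Matcher.quoteReplacement(repl));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("System.out.println($line + \":\" + $text);"));
        // System.out.println(line + ":" + getText());
    }
}
```

A reference such as $ID.text falls through untouched because ID is not in the table, so references to other rules' tokens could still be handled by whatever handles them today.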
Ter