[antlr-interest] referencing subrule result in embedded action (C# port)

Scobie Smith (Insight Global) v-scobis at microsoft.com
Sat Jun 30 22:39:21 PDT 2012


I have a question about referencing an element in an action embedded in a rule. I am using the C# port of ANTLR 3.



In short: can an element reference refer to the result of a subrule, or must it refer only to a token?



This rule works:



string
@init { StringBuilder sb = new StringBuilder(); }
    : ( (tok=String) { sb.Append($tok.Text); } )+ -> { new CommonTree(new CommonToken(STRING, sb.ToString())) }
    ;

String: OtherCharacter+ ;
fragment OtherCharacter: ~('~'|'\t'|'\r') ;



It collects the text of multiple String tokens into one new token.

** Note that $tok refers to a token (String). **



But this rule does not:



string
@init { StringBuilder sb = new StringBuilder(); }
    : ( tok=(String | Integer) { sb.Append($tok.Text); } )+ -> { new CommonTree(new CommonToken(STRING, sb.ToString())) }
    ;

Integer: Digit+ ;
String: OtherCharacter+ ;
fragment OtherCharacter: ~(SEP|'\t'|'\r'|EOL|Digit) ;
fragment Digit: ('0' .. '9' ) ;



It attempts to collect the text of multiple String OR Integer tokens into one new token.

For instance, input "foo123bar" will tokenize as three tokens, "foo", "123", "bar", and the rule will create one "foo123bar" token.

** Note that $tok refers to the result of a subrule (String | Integer). **

At run time, the tok variable in the generated parser is always null. In fact, the generated parser contains no statement that assigns a value to tok.



So I am wondering whether element references must be to tokens only, which would make a reference to a subrule's result incorrect (even though the grammar is accepted).

[The example/description in The Definitive ANTLR Reference (TDAR), p. 137 (an action embedded in a rule, inside * cardinality) references tokens, not subrules.]

Or, more generally, I am wondering what I am doing wrong in the second rule.
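
For reference, one possible workaround is sketched below: label each token in its own alternative instead of labeling the subrule, so that every reference is a plain token label as in the first rule. The label names s and i are illustrative, and the sketch assumes that per-alternative token labels are populated the same way tok is in the first rule; it has not been verified against the C# port.

```antlr
string
@init { StringBuilder sb = new StringBuilder(); }
    : ( s=String  { sb.Append($s.Text); }   // s is a hypothetical label on a single token
      | i=Integer { sb.Append($i.Text); }   // i is a hypothetical label on a single token
      )+
      -> { new CommonTree(new CommonToken(STRING, sb.ToString())) }
    ;
```

If this behaves as assumed, input "foo123bar" would still be collected into one STRING token, but it would not answer whether a label on the subrule itself is supposed to work.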



Thanks in advance for any tips or insight about this.



Scobie









