[antlr-interest] Explicitly setting the text for a lexer fragment

Sam Harwell sharwell at pixelminegames.com
Tue Oct 14 11:53:51 PDT 2008


I'm converting an antlr2 grammar (with Java target) to antlr3 (with C#
target), and I'm having some trouble with one of the antlr2 features.

 

Here is the fragment rule that's causing the problem. The commented part
is the antlr2 version, and the uncommented part is the antlr3 version.

 

//protected
//ESC_CHAR returns [char uc='\u0000']
//      :       "\\n"! {uc = '\n';}
//      |       "\\r"! {uc = '\r';}
//      |       "\\t"! {uc = '\t';}
//      |       "\\ "! {uc = ' ';}
//      |       "\\u"! a:HEX! b:HEX! c:HEX! d:HEX!
//              {uc = (char)Integer.parseInt(a.getText()+b.getText()+c.getText()+d.getText(), 16);}
//      ;

fragment
ESC_CHAR
        :       '\\'
                (       'n' {$text = "\n";}
                |       'r' {$text = "\r";}
                |       't' {$text = "\t";}
                |       ' ' {$text = " ";}
                |       'u' a=HEX b=HEX c=HEX d=HEX
                        {
                                $text = new string(
                                        (char)int.Parse($a.text+$b.text+$c.text+$d.text,
                                                System.Globalization.NumberStyles.AllowHexSpecifier),
                                        1);
                        }
                )
        ;

 

At first, I tried simply using a return value and setting it in the
rule:

 

fragment
ESC_CHAR returns [char uc='\0']
        :       '\\'
                (       'n' {$uc = '\n';}
                |       ...
                )
        ;

 

But I found out that lexer rules can't return values.

 

The problem I'm hitting now is that setting $text in a lexer fragment
rule overwrites the text of the entire token (replacing even the text of
the rule that called the fragment). Any suggestions? Here's a
cropped version of the rule that's calling the fragment:

 

ACTION
@init
{
        // C# target: System.Text.StringBuilder stands in for Java's StringBuffer
        System.Text.StringBuilder buf = null;
}
        :       ('<\\') =>
                // Match escapes not in a string like <\n\ufea5>
                {
                        buf = new System.Text.StringBuilder();
                }
                '<' (ESC_CHAR {buf.Append($ESC_CHAR.text);} )+ '>'
                {
                        $text = buf.ToString();
                        $type = LITERAL;
                }
        |       ...
        ;
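
For illustration only, here is a rough sketch of the escape decoding that
ESC_CHAR is meant to perform, written as a plain helper outside the grammar
so that no fragment rule has to assign $text. It is in Java (the target of
the original antlr2 grammar) rather than C#, and the names EscapeDecoder and
decode are hypothetical, not part of ANTLR. It assumes the lexer has already
matched the raw text, so the four characters after a backslash-u are hex
digits.

```java
// Hypothetical helper, not part of ANTLR: decodes the escapes that the
// ESC_CHAR fragment recognizes, working on the raw matched text.
final class EscapeDecoder {
    private EscapeDecoder() {}

    // Handles backslash followed by n, r, t, a space, or u plus four hex digits.
    // Any other escape is rejected; a lone trailing backslash is copied as-is.
    static String decode(String raw) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            if (c != '\\' || i + 1 >= raw.length()) {
                out.append(c);          // ordinary character
                i++;
                continue;
            }
            char kind = raw.charAt(i + 1);
            switch (kind) {
                case 'n': out.append('\n'); i += 2; break;
                case 'r': out.append('\r'); i += 2; break;
                case 't': out.append('\t'); i += 2; break;
                case ' ': out.append(' ');  i += 2; break;
                case 'u':
                    if (i + 6 > raw.length()) {
                        throw new IllegalArgumentException(
                            "Truncated unicode escape at index " + i);
                    }
                    // Same conversion as the antlr2 action: four hex digits to one char.
                    out.append((char) Integer.parseInt(raw.substring(i + 2, i + 6), 16));
                    i += 6;
                    break;
                default:
                    throw new IllegalArgumentException("Unknown escape: \\" + kind);
            }
        }
        return out.toString();
    }
}
```

A rule such as ACTION could then, in principle, pass the raw text between
'<' and '>' to a helper like this once, instead of collecting
$ESC_CHAR.text piece by piece; whether that fits this grammar is untested.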

 
