[antlr-interest] Unquoting strings

Johannes Luber jaluber at gmx.de
Wed May 14 13:28:48 PDT 2008


Daniel Danciu wrote:
> Browsing through some tutorials, I was left with the impression that 
> appending an exclamation mark to a character would remove that character 
> from the parsed token, e.g.:
> 
> fragment
> SingleQuotedString
>   :
>   '\''! // or single quoted string
>   ( '\\'! '\''
>   | ~('\''|'\n'|'\r')
>   )*
>    '\''!;
> 
> 
> 
> would cause the enclosing quotes to be removed. This does not seem to 
> happen in the generated Java code, so I had to resort to the 
> following ugly hack, which removes the quotes manually:
> 
> STRING
> : (DoubleQuotedString | SingleQuotedString)
>   {
>           // Strip the surrounding quotes
>           String txt = getText();
>           setText(txt.substring(1, txt.length() -1));
>   };
> 
> This works, but it's not nice. Does anybody know what I might be doing 
> wrong in the SingleQuotedString rule?

At the moment the documentation and ANTLR's actual capabilities disagree. 
The exclamation mark was supposed to work as the tutorials describe, but 
it currently does not. My suggestion for an implementation would be to 
extend the CommonToken class so that a token can either take its text 
from the input file or switch to an internally stored string. That seems 
to be the easiest approach.
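
To illustrate the idea, here is a minimal sketch in plain Java. It is not 
ANTLR's actual CommonToken: the class name OverridableToken, its fields, 
and its constructor are hypothetical and exist only to show a token that 
reads its text from the input by default and switches to an internal 
string once one is set.

```java
// Hypothetical sketch only; this is NOT ANTLR's CommonToken.
class OverridableToken {
    private final String input;   // the complete input text being lexed
    private final int start;      // index of the token's first character
    private final int stop;       // index of the token's last character (inclusive)
    private String overrideText;  // null means "take the text from the input"

    OverridableToken(String input, int start, int stop) {
        this.input = input;
        this.start = start;
        this.stop = stop;
    }

    // Return the internal string if one was set, else the slice of the input.
    String getText() {
        if (overrideText != null) {
            return overrideText;
        }
        return input.substring(start, stop + 1);
    }

    // Switch the token over to an internal string.
    void setText(String text) {
        this.overrideText = text;
    }

    public static void main(String[] args) {
        String input = "x = 'abc'";
        OverridableToken token = new OverridableToken(input, 4, 8);
        System.out.println(token.getText()); // text still comes from the input

        // Strip the surrounding quotes, as in the STRING rule's action.
        String raw = token.getText();
        token.setText(raw.substring(1, raw.length() - 1));
        System.out.println(token.getText()); // text now comes from the override
    }
}
```

With something along these lines, the lexer could collect the characters 
that are not marked with an exclamation mark and store them through 
setText, while the start and stop positions keep pointing at the original 
input.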

Johannes

