[antlr-interest] How to set imaginary token text?
Randall R Schulz
rschulz at sonic.net
Mon Jul 16 12:20:56 PDT 2007
On Monday 16 July 2007 12:04, Vaclav Barta wrote:
> On Sunday 15 July 2007 20:00, Vaclav Barta wrote:
> > On Sunday 15 July 2007 19:07, Randall R Schulz wrote:
> > > ...
> > > You might want to consider consolidating these characters, if
> > > that would work for your purposes:
>
> Experimenting some more, maybe I'd like to parse (some of) these
> characters individually but consolidate them into one AST node -
> something like
Let me clarify: it is at the lexical level that a token-per-character
approach incurs potentially excessive overhead. For example, a
whitespace rule that matches single whitespace characters, versus one
that collects a whole run of them into one token, could make a large
difference in the number of Tokens constructed for a given input text.
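To make the difference concrete, here is a minimal stand-alone Java
sketch. It is not ANTLR-generated code; the class WhitespaceTokenCount
and its tokenize method are hypothetical names made up for this
illustration. It simply splits a string into tokens under the two
whitespace policies so the token counts can be compared.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this does not use or imitate the ANTLR runtime.
class WhitespaceTokenCount {

    // Splits input into tokens. Runs of non-whitespace always form one token.
    // Whitespace forms one token per run if consolidateWhitespace is true,
    // otherwise one token per character.
    static List<String> tokenize(String input, boolean consolidateWhitespace) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            int start = i;
            boolean ws = Character.isWhitespace(input.charAt(i));
            i++;
            // Extend the token over following characters of the same kind,
            // except when emitting one token per whitespace character.
            if (!ws || consolidateWhitespace) {
                while (i < input.length()
                        && Character.isWhitespace(input.charAt(i)) == ws) {
                    i++;
                }
            }
            tokens.add(input.substring(start, i));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "foo        bar\t\t\t\tbaz";
        System.out.println("per character: " + tokenize(input, false).size());
        System.out.println("consolidated:  " + tokenize(input, true).size());
    }
}
```

The gap between the two counts grows with the length of each whitespace
run, which is why consolidation matters most for heavily indented or
padded input.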
> quotedString returns [ String value ]
> @init { StringBuffer sb = new StringBuffer(); }
>
> : DQUOTE (
> EscapeSequence { sb.append($EscapeSequence.getText()); }
> | BareString { sb.append($BareString.getText()); }
> )* DQUOTE { $value = sb.toString(); }
> ;
>
> string
> : s = quotedString -> LITERAL
> | BareString -> LITERAL
>
> ;
>
> where LITERAL is an imaginary token - but as written, it obviously
> loses the string value. How can I set LITERAL token text to the value
> returned from quotedString, or $BareString.getText() ?
Do you have TDAR (The Definitive ANTLR Reference)? If so, on page 176
(paper) or page 188 (PDF), the notation for incorporating token
references and / or token text into imaginary nodes is specified.
I have not used this mechanism, so I'm reluctant to try to either
paraphrase or rewrite your grammar using these notations. Perhaps
someone who knows better will supply the appropriate rules.
> Bye
> Vasek
Randall Schulz