[antlr-interest] ANTLR 3 & hidden token management?

Terence Parr parrt at cs.usfca.edu
Wed Aug 9 12:57:15 PDT 2006


On Aug 8, 2006, at 6:47 AM, David Holroyd wrote:

> On Mon, Aug 07, 2006 at 02:54:28PM -0700, Terence Parr wrote:
>> On Aug 6, 2006, at 1:17 PM, David Holroyd wrote:
>>> I can see how that would give access to {channel=99} tokens produced
>>> by lexing and parsing.  However, I want to be able to insert
>>> 'synthetic' nodes into the AST / token stream, as I'm building an API
>>> for programmatic code generation.
>>
>> Ah!  okay, hmm....  I guess there is no input token number associated
>> with these nodes.
>
> If I understand ANTLR3's Trees correctly, I think I would need to add
> tokens to the stream as well as nodes into the tree, as not all the
> nodes are imaginary (the real nodes need to appear when the tree is
> eventually pretty-printed)...

Using the -> rewrite notation makes it easy to add these, but if
you're doing it manually, I believe you can just create an imaginary
token to hold your text/type.  See ClassicToken, I think.
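
As a rough illustration of building such a node by hand, here is a
minimal sketch.  SketchToken, SketchNode and SyntheticNodes are
hypothetical stand-ins written for this example, not ANTLR runtime
classes; with the real runtime one would presumably use a Token
implementation such as ClassicToken or CommonToken instead.  The token
type numbers are also made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a runtime token class; NOT an ANTLR class.
class SketchToken {
    final int type;
    final String text;
    int tokenIndex = -1; // -1: this token never appeared in the input stream

    SketchToken(int type, String text) {
        this.type = type;
        this.text = text;
    }
}

// Hypothetical stand-in for a tree node that wraps a single token.
class SketchNode {
    final SketchToken token;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(SketchToken token) {
        this.token = token;
    }

    void addChild(SketchNode child) {
        children.add(child);
    }

    // A node is synthetic when its token has no position in the input.
    boolean isSynthetic() {
        return token.tokenIndex < 0;
    }
}

class SyntheticNodes {
    // Create a token that only carries a type and text, wrap it in a
    // node, and link that node under the given parent.
    static SketchNode addSynthetic(SketchNode parent, int type, String text) {
        SketchNode node = new SketchNode(new SketchToken(type, text));
        parent.addChild(node);
        return node;
    }

    public static void main(String[] args) {
        SketchNode block = new SketchNode(new SketchToken(1, "BLOCK"));
        SketchNode stmt = addSynthetic(block, 2, "return");
        System.out.println(stmt.token.text + " synthetic=" + stmt.isSynthetic());
    }
}
```

The token here holds nothing but type and text, so the node can live in
the tree without any matching entry in the token stream.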

>> If it is the root node, then ANTLR will
>> automatically set the start/stop indexes.  If the imaginary node is
>> in the middle of the tree somewhere then yes you would have to update
>> those indexes yourself.  Are you asking about inserting nodes after
>> the fact of tree construction? If so, a simple, possibly recursive
>> function will handle the token update per my article on the website
>> about tracking token indices with version 2.
>
> I have a sort-of Document Object Model for my target language, where
> each DOM node holds a reference to the relevant node within the AST.
>
> So, a user of the DOM can enumerate the method-defs within a class-def
> (which was maybe parsed from a file), but then they can use that
> method-def object to append additional statements to the method body,
> e.g.:
>
>   meth.addStmt("return foo;");
>
> So the implementation of addStmt() calls in to the Parser's 'statement'
> production, and links the resulting AST subtree back into the
> compilation unit.
>
> In other cases, I don't bother using the Parser to construct the AST
> subtree, and just instantiate nodes directly.  For instance, this code
> implements the addition of a default: label to a switch-statement in
> the target language:
>
>   public StatementContainer newDefault() {
>     AST defaultStmt = ASTUtils.newAST(AS3TokenTypes.LITERAL_default,
>                                       "default");
>     ast.addChild(defaultStmt);
>     return new ASTStatementList(defaultStmt);
>   }

I see.  Well, that makes it harder to use the token stream, as you
say.  Perhaps you are better off replicating the hidden-token linked
lists of 2.7.x?
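
To make the linked-list idea concrete, here is a minimal sketch in the
spirit of the 2.7.x hidden-token handling, where whitespace and comments
hang off the real token they follow.  HiddenLinkToken and
HiddenLinkPrinter are hypothetical names invented for this example, and
the sketch only chains hidden tokens after a real token (the 2.7.x
classes also track hidden tokens before it).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical token that carries its own trailing hidden tokens.
class HiddenLinkToken {
    final String text;
    HiddenLinkToken hiddenAfter; // next hidden token after this one, or null

    HiddenLinkToken(String text) {
        this.text = text;
    }

    // Append a hidden token (whitespace, comment) to the end of the
    // chain that trails this token.
    void appendHidden(String hiddenText) {
        HiddenLinkToken t = this;
        while (t.hiddenAfter != null) {
            t = t.hiddenAfter;
        }
        t.hiddenAfter = new HiddenLinkToken(hiddenText);
    }
}

class HiddenLinkPrinter {
    // Emit each real token followed by its chain of hidden tokens.
    // No token stream index is consulted, so synthetic tokens print
    // the same way as parsed ones.
    static String render(List<HiddenLinkToken> realTokens) {
        StringBuilder out = new StringBuilder();
        for (HiddenLinkToken t : realTokens) {
            out.append(t.text);
            for (HiddenLinkToken h = t.hiddenAfter; h != null; h = h.hiddenAfter) {
                out.append(h.text);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<HiddenLinkToken> tokens = new ArrayList<>();
        HiddenLinkToken ret = new HiddenLinkToken("return");
        ret.appendHidden(" ");
        HiddenLinkToken semi = new HiddenLinkToken(";");
        semi.appendHidden("\n");
        tokens.add(ret);
        tokens.add(new HiddenLinkToken("foo"));
        tokens.add(semi);
        System.out.print(render(tokens));
    }
}
```

Because the whitespace travels with the token rather than living at a
stream position, a node created by hand can be given its own hidden
tokens at construction time.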

> My planned typical usage of this DOM API will actually construct the
> entire compilation unit using lots of invocations of that type, so I'll
> need to walk the entire AST on virtually every method call.
>
> Well, maybe that's fine... better not optimise too early, eh? ;)

:)
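
For reference, the recursive index update mentioned earlier in the
thread might look roughly like the following.  This is only a sketch:
SpanNode and TokenSpanUpdater are hypothetical classes made up for the
example, and it assumes imaginary or synthetic nodes are marked with a
token index of -1.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node that records the token range its subtree covers.
class SpanNode {
    final String text;
    final int tokenIndex;  // index in the token stream, or -1 if imaginary
    int startIndex = -1;   // first real token covered by this subtree
    int stopIndex = -1;    // last real token covered by this subtree
    final List<SpanNode> children = new ArrayList<>();

    SpanNode(String text, int tokenIndex) {
        this.text = text;
        this.tokenIndex = tokenIndex;
    }

    SpanNode add(SpanNode child) {
        children.add(child);
        return this;
    }
}

class TokenSpanUpdater {
    // Post-order walk: each node's range is the union of its own token
    // index (if it has one) and the ranges of its children.
    static void update(SpanNode node) {
        int start = node.tokenIndex;
        int stop = node.tokenIndex;
        for (SpanNode child : node.children) {
            update(child);
            if (child.startIndex < 0) {
                continue; // subtree contains no real tokens
            }
            if (start < 0 || child.startIndex < start) {
                start = child.startIndex;
            }
            if (child.stopIndex > stop) {
                stop = child.stopIndex;
            }
        }
        node.startIndex = start;
        node.stopIndex = stop;
    }

    public static void main(String[] args) {
        SpanNode ret = new SpanNode("return", 3).add(new SpanNode("foo", 4));
        SpanNode block = new SpanNode("BLOCK", -1).add(ret).add(new SpanNode(";", 6));
        update(block);
        System.out.println(block.startIndex + ".." + block.stopIndex);
    }
}
```

A walk like this visits every node once, so rerunning it after each
insertion costs time proportional to the size of the tree.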

Ter

