[antlr-interest] [3.1.1] ANTLR3_MIN_TOKEN_TYPE define possibly incorrect

Sven Van Echelpoel sven.van.echelpoel at empolis.com
Wed Mar 25 01:16:06 PDT 2009


[...]
> > >     
> > Well, I'm using it now or I wouldn't have noticed it :-) But maybe I
> > shouldn't and there's a better way of doing it. I'm trying to figure out
> > whether the node returned by LT( 1 ) is a valid node (one of the nodes
> > created by my grammar). So I check that the type >=
> > ANTLR3_MIN_TOKEN_TYPE
> > 
> > Would that be correct?
> >   
> Probably not. By definition the token types of the nodes can only be
> of the types you specify in the lexer or the parser, so unless you
> have your own code setting the types then there is no way that you can
> get a token type that isn't defined. What is it that you are trying
> to do?
OK, I'm rewriting an AST and I can't use the plain rewrite rules as I
need some substantial logic to determine which nodes to create. That
decision, for one, depends on the next node. To that end I'm using the
fallback rewrite rule in which I call a function. Something akin to
this:

rewrite_this
   :  ^( node=NEEDS_TO_BE_REWRITTEN b=body )
     -> { createNewNode( $node, $b, LT( 1 ) ) }
   ;


In createNewNode I use $node, $b and the next sibling of $node to
determine what to do. I found that when $node has no next sibling, as
in situation (2) below, LT(1) still returns a node (I would have
expected NULL).

|
+- NEEDS_TO_BE_REWRITTEN   <-- (1)  LT(1) on this one is fine,
|    |                              it returns (2)
|    +- SOME_BODY_NODE
|
+- NEEDS_TO_BE_REWRITTEN   <-- (2) LT(1) on this one also returns a node
     |
     +- SOME_BODY_NODE

The type of this node is 3, i.e. ANTLR3_TOKEN_UP, which, judging from
the comments in antlr3commontoken.h, seems to be an imaginary token that
signals the end of a stream of child nodes.
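
To make the situation concrete, here is a minimal sketch of how I think
the tree ends up in the node stream. It assumes the stream is a flattened
tree with imaginary DOWN/UP tokens around each child list, which is my
reading of the header comments and not something I have confirmed. It
does not use the runtime at all: the SK_* names, the SkNode struct and
both functions are made up for illustration, and only the values 2 and 3
are meant to mirror ANTLR3_TOKEN_DOWN and ANTLR3_TOKEN_UP.

```c
#include <stddef.h>

/* Stand-ins for the runtime's token types; 2 and 3 are meant to mirror
 * ANTLR3_TOKEN_DOWN and ANTLR3_TOKEN_UP, the other values are arbitrary. */
enum { SK_DOWN = 2, SK_UP = 3, SK_PARENT = 4, SK_NEEDS_REWRITE = 5, SK_BODY = 6 };

/* Hypothetical minimal tree node, not the runtime's ANTLR3_BASE_TREE. */
typedef struct SkNode {
    int type;
    size_t child_count;
    const struct SkNode *const *children;
} SkNode;

/* Store one type if there is room; always return the new logical length. */
size_t sk_push(int *out, size_t len, size_t cap, int type)
{
    if (len < cap) {
        out[len] = type;
    }
    return len + 1;
}

/* Flatten a subtree: the node itself, then DOWN, its children, UP. */
size_t sk_flatten(const SkNode *n, int *out, size_t len, size_t cap)
{
    size_t i;
    len = sk_push(out, len, cap, n->type);
    if (n->child_count > 0) {
        len = sk_push(out, len, cap, SK_DOWN);
        for (i = 0; i < n->child_count; i++) {
            len = sk_flatten(n->children[i], out, len, cap);
        }
        len = sk_push(out, len, cap, SK_UP);
    }
    return len;
}

/* Type found right after the k-th (0-based) ^(NEEDS_REWRITE BODY) subtree,
 * i.e. what LT(1) would see once that subtree has been matched. */
int sk_type_after_subtree(int k)
{
    static const SkNode body = { SK_BODY, 0, NULL };
    static const SkNode *const n_kids[] = { &body };
    static const SkNode n1 = { SK_NEEDS_REWRITE, 1, n_kids };
    static const SkNode n2 = { SK_NEEDS_REWRITE, 1, n_kids };
    static const SkNode *const p_kids[] = { &n1, &n2 };
    static const SkNode parent = { SK_PARENT, 2, p_kids };
    int buf[16];
    size_t len = sk_flatten(&parent, buf, 0, 16);
    size_t idx;

    if (k < 0) {
        return -1;
    }
    /* Stream: PARENT DOWN | N DOWN BODY UP | N DOWN BODY UP | UP.
     * Each subtree occupies 4 slots, the first one starting at index 2. */
    idx = 2 + 4 * (size_t)(k + 1);
    return (idx < len && idx < 16) ? buf[idx] : -1;
}
```

Under that assumption, sk_type_after_subtree(0) gives SK_NEEDS_REWRITE
(case 1) and sk_type_after_subtree(1) gives SK_UP (case 2), which matches
what I am seeing.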

At first I only checked that LT(1) returned something non-NULL, but
since a node was returned in (2) I ended up creating the wrong node to
return in the rewrite rule. Then I found out that the type returned was
ANTLR3_TOKEN_UP and that I could use ANTLR3_MIN_TOKEN_TYPE to determine
whether I had a 'valid' node or not.
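
In other words, the check I ended up with looks roughly like the sketch
below. The CHK_* constants are local stand-ins so the snippet compiles
without the runtime (I am assuming ANTLR3_MIN_TOKEN_TYPE is meant to be
ANTLR3_TOKEN_UP + 1), and ChkNode and chk_is_grammar_node are hypothetical
names; in my real code the type comes from the tree node itself.

```c
#include <stddef.h>

/* Local stand-ins for ANTLR3_TOKEN_UP and ANTLR3_MIN_TOKEN_TYPE.
 * Assumption: the minimum user token type is one above UP. */
#define CHK_TOKEN_UP        3
#define CHK_MIN_TOKEN_TYPE  (CHK_TOKEN_UP + 1)

/* Hypothetical minimal node carrying only a token type. */
typedef struct ChkNode {
    int type;
} ChkNode;

/* Returns 1 only for a non-NULL node whose type is at or above the
 * minimum user token type, i.e. a node created by the grammar rather
 * than an imaginary navigation token such as UP. */
int chk_is_grammar_node(const ChkNode *node)
{
    return node != NULL && node->type >= CHK_MIN_TOKEN_TYPE;
}
```

A NULL pointer and a node of type CHK_TOKEN_UP are both rejected, so
createNewNode can treat "no next sibling" the same way in both cases.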

Should LT(1) return a node in (2), or does that signal that something's
amiss? If the behavior of LT(1) is correct, how can I determine that a
node has no next sibling?

Sven



