[antlr-interest] Multiple lexer tokens per rule

Fri Jun 4 15:45:13 PDT 2010

Ken Williams wrote:
> 
> 
> On 6/4/10 4:16 PM, "Junkman" <j at junkwallah.org> wrote:
>> The way nextToken() is overridden, it first returns the token matched by
>> the rule, and subsequently any additional queued token before matching a
>> new token in the input stream.
> 
> Maybe I'm being dense here, but I don't think that's what it's doing:
> 
>     public Token nextToken() {
>         return tokenQueue.isEmpty() ? super.nextToken() : tokenQueue.poll();
>     }
> 
> If tokenQueue is non-empty, it always uses it.  On the *next* invocation,
> when it's empty, it will call super.nextToken().
> 
> 

Think of the tokens generated by a single rule invocation as a set.  The
set is generated inside super.nextToken(), which is called only AFTER
the queue has been tested and found empty.  Of the tokens in the set,
the "matching" token is returned first, because that is what
Lexer.nextToken() (i.e. super.nextToken()) returns.  The additional
tokens sit in the queue and are returned by the following calls, before
any new input is matched.
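
To make the ordering concrete, here is a minimal sketch.  It does not
use the real ANTLR classes: BaseLexer is a hypothetical stand-in for
Lexer, tokens are plain strings, and the rule that queues two extra
tokens when it matches "MULTI" is invented purely for illustration.
Only the overridden nextToken() is taken from the code quoted above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class QueueOrderDemo {

    // Hypothetical stand-in for a generated lexer; NOT the real ANTLR Lexer.
    static class BaseLexer {
        protected final Queue<String> tokenQueue = new ArrayDeque<>();
        private final String[] input;
        private int pos = 0;

        BaseLexer(String... input) {
            this.input = input;
        }

        // Stand-in for Lexer.nextToken(): runs one rule and returns the
        // token that rule matched.
        public String nextToken() {
            if (pos >= input.length) {
                return "EOF";
            }
            String matched = input[pos++];
            // Invented rule action: matching "MULTI" also queues two
            // additional tokens as a side effect.
            if (matched.equals("MULTI")) {
                tokenQueue.add("MULTI.extra1");
                tokenQueue.add("MULTI.extra2");
            }
            return matched;
        }
    }

    // Same override as in the quoted code.
    static class QueueingLexer extends BaseLexer {
        QueueingLexer(String... input) {
            super(input);
        }

        @Override
        public String nextToken() {
            return tokenQueue.isEmpty() ? super.nextToken() : tokenQueue.poll();
        }
    }

    // Calls nextToken() until EOF and collects the tokens in order.
    static List<String> drain(BaseLexer lexer) {
        List<String> out = new ArrayList<>();
        String token;
        do {
            token = lexer.nextToken();
            out.add(token);
        } while (!token.equals("EOF"));
        return out;
    }

    public static void main(String[] args) {
        // prints [MULTI, MULTI.extra1, MULTI.extra2, PLAIN, EOF]
        System.out.println(drain(new QueueingLexer("MULTI", "PLAIN")));
    }
}
```

The first call finds the queue empty, so it goes to super.nextToken(),
which fills the queue and returns the matched token.  The next two
calls drain the queue, and only then is new input matched.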

If that's still not clear, I suggest you run the generated lexer under a
debugger (as Jim suggested in another thread ;-) and trace it from
nextToken().  That will give you a better explanation than my verbiage.

Best regards.