[antlr-interest] Multiple lexer tokens per rule

Tue Jun 8 10:49:31 PDT 2010

In case anyone reads this thread again, the ANTLR wiki has a better example
of emitting multiple tokens:

http://www.antlr.org/wiki/pages/viewpage.action?pageId=3604497

Cheers.

Junkman wrote:
> Ken Williams wrote:
>>
>> On 6/4/10 4:16 PM, "Junkman" <j at junkwallah.org> wrote:
>>> The way nextToken() is overridden, it first returns the token matched by
>>> the rule, and subsequently any additional queued tokens before matching a
>>> new token in the input stream.
>> Maybe I'm being dense here, but I don't think that's what it's doing:
>>
>>     public Token nextToken() {
>>         return tokenQueue.isEmpty() ? super.nextToken() : tokenQueue.poll();
>>     }
>>
>> If tokenQueue is non-empty, it always uses it.  On the *next* invocation,
>> when it's empty, it will call super.nextToken().
>>
>>
> 
> Think of tokens generated by a single rule invocation as a set.  The set
> is generated in/under "super.nextToken()", AFTER the queue has been
> tested to be empty.  Among the tokens in the set, the "matching" token
> is returned first, because that's what Lexer.nextToken()
> ("super.nextToken()") returns.
> 
> If that's still not clear, I suggest you put the generated lexer under a
> debugger (like Jim suggested in another thread ;-) and trace it from
> nextToken() - it will give you a better explanation than my verbiage.
> 
> Best regards.
>
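
For anyone who wants to see the ordering discussed above without firing up
a debugger, here is a minimal sketch that does not use ANTLR at all.
BaseLexer, QueueingLexer, matchRule() and the token names are hypothetical
stand-ins invented for illustration; only the nextToken() override has the
same shape as the one quoted in the thread. It assumes a rule that queues
its extra tokens as a side effect while it is being matched.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for ANTLR's Lexer: each call to nextToken() runs
// one rule and returns the token that rule matched.
class BaseLexer {
    protected final String input;
    protected int pos = 0;

    BaseLexer(String input) { this.input = input; }

    public String nextToken() {
        if (pos >= input.length()) return "EOF";
        return matchRule(input.charAt(pos++));
    }

    // Default rule: one token per input character.
    protected String matchRule(char c) { return String.valueOf(c); }
}

class QueueingLexer extends BaseLexer {
    private final Queue<String> tokenQueue = new ArrayDeque<>();

    QueueingLexer(String input) { super(input); }

    // A rule that produces several tokens: matching '+' returns PLUS and
    // queues two additional (made-up) tokens as a side effect.
    @Override
    protected String matchRule(char c) {
        if (c == '+') {
            tokenQueue.add("EXTRA1");
            tokenQueue.add("EXTRA2");
            return "PLUS";
        }
        return super.matchRule(c);
    }

    // Same shape as the override quoted in the thread: the queue is tested
    // BEFORE the rule runs, so the rule's own token comes out first and the
    // queued tokens are drained on the following calls.
    @Override
    public String nextToken() {
        return tokenQueue.isEmpty() ? super.nextToken() : tokenQueue.poll();
    }
}

class TokenQueueDemo {
    // Collects every token up to and including EOF.
    static List<String> drain(String input) {
        QueueingLexer lexer = new QueueingLexer(input);
        List<String> out = new ArrayList<>();
        String t;
        do {
            t = lexer.nextToken();
            out.add(t);
        } while (!t.equals("EOF"));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain("a+b"));
    }
}
```

With the input "a+b" the tokens come out as a, PLUS, EXTRA1, EXTRA2, b, EOF:
the matching token first, then the queued ones, and only then a new match
from the input stream. Both descriptions in the thread hold; they just look
at different invocations of nextToken().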