[antlr-interest] Multiple lexer tokens per rule

Ken Williams ken.williams at thomsonreuters.com
Thu Jun 3 14:57:41 PDT 2010



On 6/3/10 4:18 PM, "Jim Idle" <jimi at temporal-wave.com> wrote:

> Add to an array or collection then get nextToken to remove from the
> collection. It is slower to do this, so it isn't the default way.

Yeah, that's what the book says. =)

It seems like there are some subtleties involved, though - there's a lot of
bookkeeping in nextToken() that looks kind of scary (e.g. the
current-line-number stuff, the default-channel stuff, etc.), and if I
override it I'm really not confident I'll do it correctly.  I'm also unsure
how mTokens(), emit(), and nextToken() cooperate with their member
variables.

I tried this simple-minded implementation, and started getting out-of-bounds
exceptions:

@lexer::members {
    List<Token> tokBuf = new ArrayList<Token>();
    public Token nextToken() {
        while (tokBuf.isEmpty()) {
            emit();
        }
        return tokBuf.remove(0);
    }
    public void emit(Token token) {
        tokBuf.add(token);
    }
}


So if someone does have a working example, I'd love to see it!
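
For reference, here is a minimal sketch of the buffering pattern, written against a small stand-in lexer rather than ANTLR itself. Everything in it (Tok, MiniLexer, BufferingLexer, BufferingLexerDemo, and the INT/DOT/CHAR rule) is hypothetical and only imitates the shape of the nextToken()/emit() contract. One likely cause of the exceptions in the snippet above is that its loop calls emit() directly: in ANTLR 3 the no-argument emit() builds a token from the lexer's current state, so calling it before any rule has matched probably yields a token with bogus offsets. The sketch calls super.nextToken() instead, so the base class does its own bookkeeping and the override only collects what the rule emitted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a token; this is not ANTLR's Token class.
final class Tok {
    static final Tok EOF = new Tok("EOF", "");
    final String type;
    final String text;
    Tok(String type, String text) { this.type = type; this.text = text; }
}

// Hypothetical base lexer: nextToken() runs one rule and returns the
// last token that rule emitted, so a rule emitting twice loses a token.
class MiniLexer {
    protected final String input;
    protected int pos = 0;
    protected Tok token; // plays the role of the "current token" field

    MiniLexer(String input) { this.input = input; }

    public Tok nextToken() {
        token = null;
        if (pos >= input.length()) return Tok.EOF; // EOF bypasses emit()
        matchOneRule();
        return token;
    }

    public void emit(Tok t) { token = t; }

    // One rule invocation: a digit run followed by '.' emits INT and DOT.
    protected void matchOneRule() {
        int start = pos;
        if (Character.isDigit(input.charAt(pos))) {
            while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
            emit(new Tok("INT", input.substring(start, pos)));
            if (pos < input.length() && input.charAt(pos) == '.') {
                pos++;
                emit(new Tok("DOT", ".")); // second token from the same rule
            }
        } else {
            pos++;
            emit(new Tok("CHAR", input.substring(start, pos)));
        }
    }
}

// The buffering override: emit() queues every token, nextToken() drains.
class BufferingLexer extends MiniLexer {
    private final Deque<Tok> buffer = new ArrayDeque<>();

    BufferingLexer(String input) { super(input); }

    @Override
    public void emit(Tok t) {
        super.emit(t);  // keep the base class's current-token field in sync
        buffer.add(t);
    }

    @Override
    public Tok nextToken() {
        while (buffer.isEmpty()) {
            // Run a full rule via the base class instead of calling emit().
            Tok t = super.nextToken();
            if (t == Tok.EOF) return t; // nothing left to buffer
        }
        return buffer.remove();
    }
}

final class BufferingLexerDemo {
    // Collects "TYPE:text" strings until EOF.
    static List<String> lexAll(MiniLexer lexer) {
        List<String> out = new ArrayList<>();
        for (Tok t = lexer.nextToken(); t != Tok.EOF; t = lexer.nextToken()) {
            out.add(t.type + ":" + t.text);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(lexAll(new BufferingLexer("12.x")));
    }
}
```

On the input "12.x" the plain MiniLexer returns only DOT and CHAR, because the INT is overwritten, while BufferingLexer returns INT, DOT, CHAR. In a real ANTLR 3 lexer I believe the emit(Token) override would also have to assign state.token, otherwise the base nextToken() falls back to its own emit() and queues a duplicate; treat that as an assumption to check against the runtime source for your version.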

-- 
Ken Williams
Sr. Research Scientist
Thomson Reuters
Phone: 651-848-7712
ken.williams at thomsonreuters.com
