[antlr-interest] Embedded token stream technique

FranklinChen at cmu.edu FranklinChen at cmu.edu
Thu Apr 29 11:09:22 PDT 2004


In several earlier messages I asked about a variety of lexer
techniques I had tried for this problem. I never got any feedback on
them, but I think I have come up with an interesting hack, and I would
like to know whether it introduces unforeseen problems:

Say my lexer currently has a token FLOAT, defined as
FLOAT_INTEGER DOT FLOAT_FRACTION (assume all three are protected
rules).  Now assume that my parser wants access to the subparts of
that token.  (Never mind that this makes no sense in a toy example: in
real examples, where the token is actually recursive, I do want the
subparts of a token so that I can reparse them in the right context.)
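
To make the goal concrete, here is a minimal plain-Java sketch of what
"getting the subparts" of a FLOAT means. It does not use ANTLR at all:
the class SubTokenDemo and its nested Tok type are hypothetical names
made up for illustration, and the split is done with indexOf rather
than by a generated lexer.

```java
import java.util.ArrayList;
import java.util.List;

public class SubTokenDemo {
    /** Hypothetical minimal token: a type name plus the matched text. */
    public static final class Tok {
        public final String type;
        public final String text;

        public Tok(String type, String text) {
            this.type = type;
            this.text = text;
        }

        @Override
        public String toString() {
            return type + "(" + text + ")";
        }
    }

    /** Splits text such as "12.34" into FLOAT_INTEGER, DOT, FLOAT_FRACTION. */
    public static List<Tok> split(String floatText) {
        int dot = floatText.indexOf('.');
        if (dot <= 0 || dot == floatText.length() - 1) {
            throw new IllegalArgumentException(
                "expected a '.' between two non-empty parts: " + floatText);
        }
        List<Tok> parts = new ArrayList<Tok>();
        parts.add(new Tok("FLOAT_INTEGER", floatText.substring(0, dot)));
        parts.add(new Tok("DOT", "."));
        parts.add(new Tok("FLOAT_FRACTION", floatText.substring(dot + 1)));
        return parts;
    }

    public static void main(String[] args) {
        // prints [FLOAT_INTEGER(12), DOT(.), FLOAT_FRACTION(34)]
        System.out.println(split("12.34"));
    }
}
```

The parser would then consume these three tokens in order instead of a
single FLOAT.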

I've implemented the trick as follows (relevant excerpts only).
Comments?


header {
    import java.util.LinkedList;
}

class FloatParser extends Parser;

options {
    buildAST = true;
}

tokens {
    NUMBER;
}

number
    :
        i:FLOAT_INTEGER
        DOT!
        f:FLOAT_FRACTION
        {
          #number = #([NUMBER], #number);
        }
    ;

class FloatLexer extends Lexer;

{
    /** Queue of tokens. */
    protected LinkedList insertedTokens = new LinkedList();
}

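// Push the subtokens back onto the stream so that the parser sees
// "integer", "dot", "fraction" instead of a single FLOAT.
FLOAT
    {
        // Declared in the init-action so that it stays in scope for
        // every action below, even if the generated code wraps those
        // actions in blocks of their own.
        LinkedList queue = new LinkedList();
    }
    :
        i:FLOAT_INTEGER  { queue.add(i); }
        d:DOT            { queue.add(d); }
        f:FLOAT_FRACTION { queue.add(f); }
        {
            // Have to manually kick off the token stream: return the
            // first subtoken as the result of this rule and queue the
            // rest for later nextToken() calls.
            Token token = (Token) queue.removeFirst();
            $setToken(token);
            System.err.println("*** inserting tokens: " + queue);
            insertedTokens.addAll(queue);
        }
    ;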

protected
FLOAT_INTEGER
    :
        ('0'..'9')+
    ;

protected
FLOAT_FRACTION
    :
        ('0'..'9')+
    ;

protected
DOT
    :
        '.'
    ;


import java.io.InputStream;
import java.io.Reader;

import antlr.Token;
import antlr.TokenStreamException;
import antlr.InputBuffer;
import antlr.LexerSharedInputState;


/**
 * Hands out any queued (inserted) tokens before asking the
 * underlying lexer for the next token.
 */
public class InsertedFloatLexer extends FloatLexer {
    public InsertedFloatLexer(InputStream in) {
        super(in);
    }

    public InsertedFloatLexer(Reader in) {
        super(in);
    }

    public InsertedFloatLexer(InputBuffer ib) {
        super(ib);
    }

    public InsertedFloatLexer(LexerSharedInputState state) {
        super(state);
    }

    public Token nextToken() throws TokenStreamException {
        // Queued subtokens take priority over fresh input.
        if (!insertedTokens.isEmpty()) {
            return (Token) insertedTokens.removeFirst();
        }
        return super.nextToken();
    }
}
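
The queue-before-stream idea in nextToken() above can be tried out
without ANTLR. The following is only a sketch under stated
assumptions: QueueFirstDemo, SplittingStream and drain are
hypothetical names, tokens are plain strings, and a "composite" token
is simply any string containing a dot. It shows the same ordering
guarantee: the first subpart is returned immediately, and the queued
subparts come out before any new input is read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class QueueFirstDemo {
    /** Hypothetical stand-in for a lexer that splits composite tokens. */
    public static final class SplittingStream {
        private final Iterator<String> raw;
        private final LinkedList<String> inserted = new LinkedList<String>();

        public SplittingStream(Iterator<String> raw) {
            this.raw = raw;
        }

        /** Returns the next token text, or null at end of input. */
        public String nextToken() {
            // Queued subtokens take priority over fresh input.
            if (!inserted.isEmpty()) {
                return inserted.removeFirst();
            }
            if (!raw.hasNext()) {
                return null;
            }
            String text = raw.next();
            int dot = text.indexOf('.');
            if (dot < 0) {
                return text; // not composite: pass through unchanged
            }
            // Composite: queue the trailing parts, return the first one now.
            inserted.add(".");
            inserted.add(text.substring(dot + 1));
            return text.substring(0, dot);
        }
    }

    /** Reads the stream to the end and collects every token it yields. */
    public static List<String> drain(List<String> rawTokens) {
        SplittingStream stream = new SplittingStream(rawTokens.iterator());
        List<String> out = new ArrayList<String>();
        for (String t = stream.nextToken(); t != null; t = stream.nextToken()) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [12, ., 34, x, 5, ., 6]
        System.out.println(drain(Arrays.asList("12.34", "x", "5.6")));
    }
}
```

One difference from the ANTLR version: here the wrapper does the
splitting itself, whereas in the grammar above the FLOAT rule fills
the queue and the subclass only drains it.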


import antlr.TokenStream;
import antlr.Token;
import antlr.TokenStreamException;

public class FloatDriver {
    public static void main(String[] args) {
        try {
            InsertedFloatLexer lexer = new InsertedFloatLexer(System.in);
            FloatParser parser = new FloatParser(lexer);

            parser.start();
        }
        catch (Exception e) {
            System.err.println(e.getMessage());
        }
    }
}



-- 
Franklin


 