[antlr-interest] Generating Fake Lexical Tokens
Robert Colquhoun
rjc at trump.net.au
Wed Sep 25 04:23:53 PDT 2002
At 03:14 PM 18/09/2002 +0000, shyamgopale wrote:
> Consider the Python program
>     if test:
>         print "something"
>         do_something()
>     # Outside if
>     do_somethingmore()
>Now for the above program - The lexer needs to generate
>an INDENT token before the print to let the parser
>know that the following statements are part of an
>if block. And similarly it needs to generate a DEDENT
>token after do_something() to indicate end of the if
>block.
> I have the logic to generate the INDENT and DEDENT
>tokens but I have no idea how to make the lexer report
>them before or after the real tokens. Can anyone help
>me out with this. I am looking for a way to insert
>additional tokens in the token stream.
Just off the top of my head, could you do what you want with a TokenStream
class that sits between the lexer and the parser, i.e. something like:
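import antlr.Token;
import antlr.TokenStream;
import antlr.TokenStreamException;

// NEWLINE, WHITESPACE, INDENT and DEDENT are assumed to be token type
// constants from your grammar's generated token types interface.
public class IndentFilterStream implements TokenStream {
    protected TokenStream lexer = null;
    protected int level = 0;   // width of the current indentation

    public IndentFilterStream(TokenStream in) {
        lexer = in;
    }

    public Token nextToken() throws TokenStreamException {
        Token t = lexer.nextToken();
        if (t.getType() == NEWLINE) {
            // Look at the token that follows the NEWLINE
            Token t2 = lexer.nextToken();
            if (t2.getType() == WHITESPACE) {
                int len = t2.getText().length();
                // Swap the leading whitespace for a fake token
                if (len > level) t2 = new Token(INDENT);
                if (len < level) t2 = new Token(DEDENT);
                level = len;
            }
            return t2;
        }
        return t;
    }
}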
The above has some problems, in that the parser would never see the NEWLINE
tokens or some of the WHITESPACE tokens, but it is a start.
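
One possible way around the lost tokens is to queue the fake tokens and hand
the NEWLINE out first. The following is only a rough sketch under some
assumptions: Tok and TokSource are hypothetical stand-ins for antlr.Token and
antlr.TokenStream so that it compiles without ANTLR, the token type numbers
are made up, and the lexer is assumed to emit WHITESPACE only for leading
whitespace. Blank lines, comment-only lines and inconsistent dedents are not
handled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for antlr.Token; the type numbers are arbitrary.
class Tok {
    static final int EOF = 1, NEWLINE = 4, WHITESPACE = 5,
                     INDENT = 6, DEDENT = 7, NAME = 8;
    final int type;
    final String text;
    Tok(int type, String text) { this.type = type; this.text = text; }
}

// Hypothetical stand-in for antlr.TokenStream.
interface TokSource { Tok nextToken(); }

class IndentFilter implements TokSource {
    private final TokSource lexer;
    private final Deque<Tok> pending = new ArrayDeque<>();    // fake tokens waiting to go out
    private final Deque<Integer> levels = new ArrayDeque<>(); // stack of open indent widths
    private Tok pushedBack;                                   // one token of lookahead

    IndentFilter(TokSource lexer) {
        this.lexer = lexer;
        levels.push(0);
    }

    // Read from the lookahead slot first, then from the real lexer.
    private Tok read() {
        if (pushedBack != null) {
            Tok t = pushedBack;
            pushedBack = null;
            return t;
        }
        return lexer.nextToken();
    }

    public Tok nextToken() {
        if (!pending.isEmpty()) return pending.poll();
        Tok t = read();
        if (t.type == Tok.EOF) {
            // Close every block that is still open, then deliver EOF.
            while (levels.peek() > 0) {
                levels.pop();
                pending.add(new Tok(Tok.DEDENT, ""));
            }
            pending.add(t);
            return pending.poll();
        }
        if (t.type != Tok.NEWLINE) return t;

        // Measure the indentation of the line that follows the NEWLINE.
        Tok next = read();
        int width = 0;
        if (next.type == Tok.WHITESPACE) {
            width = next.text.length();   // leading whitespace is consumed here
        } else {
            pushedBack = next;            // a real token at column 0, keep it
        }
        if (width > levels.peek()) {
            levels.push(width);
            pending.add(new Tok(Tok.INDENT, ""));
        }
        while (width < levels.peek()) {   // one DEDENT per block that closes
            levels.pop();
            pending.add(new Tok(Tok.DEDENT, ""));
        }
        return t;                         // the parser still sees the NEWLINE
    }
}
```

With this arrangement the parser sees NEWLINE, then any INDENT or DEDENT
tokens, then the first real token of the next line. Keeping a stack of widths
rather than a single level means that leaving several nested blocks at once
produces one DEDENT for each of them.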
- Robert