[antlr-interest] Dumping out lexer token stream?

Randall R Schulz rschulz at sonic.net
Sat Jun 23 09:22:40 PDT 2007


On Saturday 23 June 2007 01:28, Wincent Colaiuta wrote:
> El 23/6/2007, a las 3:41, Cameron Esfahani escribió:
> > To help with my debugging, I would like to see the tokenized output
> > from the lexer.  Before the parser gets a chance at, well, parsing
> > it.
> >
> > I can't seem to find anything in ANTLRWorks which will do this.
> > Does anyone have any suggestions?
> >
> > Cameron Esfahani
> > dirty at apple.com
>
> Normally the lexer is invoked automatically by the parser, which
> repeatedly calls the "next token" method/function. So you can do the
> same and watch the token stream that way. For example, in the C
> target, something like the following (assuming your lexer is in the
> variable "lexer"):

Oddly enough, I wanted to do exactly the same right now when I've only
written the lexical portion of my grammar.

I wrote this test code (use a fixed-width font, of course):

  CLIFLexer           lexer       = null;
  PrintStream         out         = System.out;

  try {
    lexer = new CLIFLexer(new ANTLRFileStream(fileName));
  }

  catch (IOException exIO) {
    System.err.printf("CLIF: Cannot open file \"%s\"%n", fileName);
    return;
  }


  out.format("%nParsing \"%s\"%n", fileName);

  TokenStream         tokens      = new CommonTokenStream(lexer);
  int                 nTokens     = tokens.size();

  for (int iToken = 0; iToken < nTokens; iToken++) {
    Token             token       = tokens.get(iToken);

    out.format("%6d: %4d.%3d: T%3d-C%3d; \"%s\"%n",
               iToken,
               token.getLine(), token.getCharPositionInLine(),
               token.getType(), token.getChannel(),
               token.getText());
  }


When I apply this to a file with lots of source code that matches the
lexical grammar I've defined, I always get an nTokens value of 0.

The JavaDoc comment on CommonTokenStream implies that it will scan the
entire input and build a sequence of tokens in advance, yet that does
not seem to be happening.

And no exception is thrown (unless the file name is not valid).


What am I missing / doing wrong?
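
One possible explanation, offered as a guess rather than a confirmed
answer: if CommonTokenStream fills its buffer lazily (only when something
first asks it for lookahead), then size() on a freshly constructed stream
would report 0. The sketch below models that behavior with a hypothetical
stand-in class, LazyBuffer. It is not ANTLR code; the names LazyBuffer,
lookahead and fill are made up for illustration, and the "tokens" are
plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical model, NOT an ANTLR class: a token buffer that pulls
// tokens from its source only on the first lookahead request.
public class LazyTokenBufferSketch {

    static final String EOF = "<EOF>";

    static class LazyBuffer {
        private final Iterator<String> source;            // plays the role of the lexer
        private final List<String> tokens = new ArrayList<>();
        private boolean filled = false;

        LazyBuffer(Iterator<String> source) {
            this.source = source;
        }

        // Reports only what has been buffered so far; never triggers a fill.
        int size() {
            return tokens.size();
        }

        // Direct indexed access into whatever is currently buffered.
        String get(int i) {
            return tokens.get(i);
        }

        // 1-based lookahead; the first call drains the source into the buffer.
        String lookahead(int k) {
            if (k < 1) {
                throw new IllegalArgumentException("k must be >= 1");
            }
            if (!filled) {
                fill();
            }
            int index = k - 1;
            return index < tokens.size() ? tokens.get(index) : EOF;
        }

        private void fill() {
            while (source.hasNext()) {
                tokens.add(source.next());
            }
            filled = true;
        }
    }

    public static void main(String[] args) {
        LazyBuffer buffer =
            new LazyBuffer(Arrays.asList("(", "cl-text", ")").iterator());

        System.out.println("size before lookahead: " + buffer.size());
        buffer.lookahead(1);                               // forces the fill
        System.out.println("size after lookahead:  " + buffer.size());

        for (int i = 0; i < buffer.size(); i++) {
            System.out.printf("%6d: \"%s\"%n", i, buffer.get(i));
        }
    }
}
```

If that is what is happening here, requesting one token of lookahead on
the real stream (for example tokens.LT(1)) before calling size() might be
enough to populate the buffer. That is untested against the ANTLR runtime
source, so treat it as something to try, not as a known fix.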


> ...
>
> Cheers,
> Wincent


Randall Schulz

