[antlr-interest] custom TokenLabelType and EOF/Error tokens

David-Sarah Hopwood david-sarah at jacaranda.org
Tue Nov 10 20:34:30 PST 2009


Bob Frankel wrote:
> thanks for the help....  recovery from inserted error tokens now works
> just fine, but i'm still finding my EOF token is of type CommonToken....

On closer investigation, it seems that this problem is due to
CommonTokenStream using Token.EOF_TOKEN, which is hardcoded to a
CommonToken.

<http://www.antlr.org/api/Java/_common_token_stream_8java-source.html#l00236>
(lines 248 and 260)
<http://www.antlr.org/api/Java/interfaceorg_1_1antlr_1_1runtime_1_1_token.html#a1b4524a52069a34b14a0761ea43423b>

It is possible to use your own token stream class in place of
CommonTokenStream (or TokenRewriteStream if you need rewriting).
No ANTLR option needs to be set; you just create an instance of
your token stream class in the usual boilerplate for creating a
lexer and parser.

If you are subclassing CommonTokenStream or TokenRewriteStream,
I think it should be sufficient to override the LT method as follows:

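   // Shared EOF sentinel of the custom token class (assumes MyToken has
   // a constructor taking a token type).
   protected static final MyToken MY_EOF_TOKEN = new MyToken(Token.EOF);

   @Override public Token LT(int k) {
     Token t = super.LT(k);
     // Swap the hardcoded CommonToken sentinel for our own.
     return t != Token.EOF_TOKEN ? t : MY_EOF_TOKEN;
   }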

(The EOF_TOKEN doesn't actually exist in the token stream; it is
returned only when you look ahead using LT.)

However, I haven't tested this, and I don't know whether there are
any other places where CommonToken is hardcoded.
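
To make the sentinel-swapping idea concrete without depending on the
ANTLR runtime, here is a minimal self-contained sketch. All of the types
in it (Tok, BaseTok, MyTok, BaseStream, MyStream) are hypothetical
stand-ins that only model the behaviour described above (a stream that
returns one shared EOF object when you look past its end); they are not
the real Token, CommonToken or CommonTokenStream classes.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the Token interface (hypothetical, not the ANTLR one).
interface Tok {
    int getType();
}

// Stand-in for CommonToken, including a shared EOF sentinel that plays
// the role of Token.EOF_TOKEN.
class BaseTok implements Tok {
    static final int EOF = -1;
    static final Tok EOF_TOKEN = new BaseTok(EOF);

    private final int type;

    BaseTok(int type) { this.type = type; }

    public int getType() { return type; }
}

// Stand-in for the custom token class (MyToken in the text above).
class MyTok extends BaseTok {
    MyTok(int type) { super(type); }
}

// Stand-in for a token stream whose lookahead returns the shared
// sentinel once it runs past the buffered tokens.
class BaseStream {
    protected final List<Tok> tokens = new ArrayList<>();
    protected int p = 0; // index of the next token to consume

    void add(Tok t) { tokens.add(t); }

    void consume() { if (p < tokens.size()) p++; }

    // Lookahead by k tokens; this sketch only handles k >= 1.
    Tok LT(int k) {
        int i = p + k - 1;
        return i < tokens.size() ? tokens.get(i) : BaseTok.EOF_TOKEN;
    }
}

// The subclass replaces the base sentinel with one of the custom class.
class MyStream extends BaseStream {
    static final MyTok MY_EOF_TOKEN = new MyTok(BaseTok.EOF);

    @Override Tok LT(int k) {
        Tok t = super.LT(k);
        // Identity comparison is enough because the sentinel is shared.
        return t != BaseTok.EOF_TOKEN ? t : MY_EOF_TOKEN;
    }
}
```

The comparison is by identity rather than by token type, so it only
catches the one shared sentinel object; an EOF token created anywhere
else would pass through unchanged.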

> i'm currently using 3.1.3 (since that's what my eclipse ide
> supports)....  is there a workaround for 3.1.3???

The workaround I used was:

 - Delete the TokenLabelType option;

 - Change my code so that it no longer assumes that all tokens are
   instances of MyToken. Emitted and error tokens will still be of
   class MyToken, but fragment and EOF tokens might not be.
   (Fragment tokens only occur if you refer to a named child fragment
   in a lexer rule. To check that you're not doing this, search for
   "new CommonToken" in the generated lexer.)

This is obviously quite ugly, although you might be able to clean up
some of the instanceof tests and casting by using a convenience method
such as

   public static MyToken of(Token t) {
     return t instanceof MyToken ? (MyToken) t : new MyToken(t);
   }

It also might not be a sufficient workaround depending on why you are
overriding the token type. (In my case, it turned out to be sufficient
for emitted and error tokens to be of the overridden type.)
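
As a rough illustration of how such a convenience method behaves, here
is a self-contained sketch. SimpleToken, PlainToken and WrappedToken are
hypothetical stand-ins for Token, CommonToken and MyToken respectively,
and the sketch assumes the custom class has a copy-style constructor
taking any token, as the "new MyToken(t)" call above implies.

```java
// Stand-in for the Token interface (hypothetical, not the ANTLR one).
interface SimpleToken {
    int getType();
    String getText();
}

// Stand-in for CommonToken.
class PlainToken implements SimpleToken {
    private final int type;
    private final String text;

    PlainToken(int type, String text) {
        this.type = type;
        this.text = text;
    }

    public int getType() { return type; }
    public String getText() { return text; }
}

// Stand-in for the custom token class (MyToken in the text above).
class WrappedToken extends PlainToken {
    WrappedToken(int type, String text) { super(type, text); }

    // Copy-style constructor: an assumption of this sketch.
    WrappedToken(SimpleToken t) { super(t.getType(), t.getText()); }

    // Returns t itself if it is already a WrappedToken, otherwise a
    // fresh copy of the custom class. A null argument is not handled.
    static WrappedToken of(SimpleToken t) {
        return t instanceof WrappedToken ? (WrappedToken) t : new WrappedToken(t);
    }
}
```

Note that wrapping creates a new object, so code that relies on token
identity (for example comparing against a shared EOF sentinel with ==)
should do that comparison before calling of().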

-- 
David-Sarah Hopwood  ⚥  http://davidsarah.livejournal.com
