[antlr-interest] custom TokenLabelType and EOF/Error tokens

David-Sarah Hopwood david-sarah at jacaranda.org
Tue Nov 10 11:45:59 PST 2009


Bob Frankel wrote:
> following idioms i've seen posted elsewhere, i've created a MyToken
> class that inherits from CommonToken; my Lexer then overrides Token
> Lexer.emit(), at which time i create instances of MyToken....
> 
> things work fine, until the parser encounters a syntactic error.... 
> from what i can tell, the parser inserts an "error" token of type
> CommonToken; this then leads to a class cast exception in the
> surrounding parser rule when attempting to assign a CommonToken value
> through a generated (MyToken) cast....

There are two possible causes of this:

a) There was a bug in ANTLR up to version 3.1.3 where the generated code
   would sometimes create tokens using 'new CommonToken', even when the
   TokenLabelType option is set to MyToken. See the thread at
<http://www.antlr.org/pipermail/antlr-interest/2009-July/thread.html#35129>.

   I think this has been fixed in ANTLR 3.2, although not in the way I
   suggested, which would also have fixed point b) below.
   Note that your token class will need to have the same constructors as
   CommonToken in order for the fix to work.

b) When the getMissingSymbol method of a parser inserts a token in order to
   recover from an error, the inserted token is of type CommonToken.
   Override it as follows (this implementation also fixes a different bug
   that can cause a NullPointerException):

@parser::members {
  /**
   * Work around an ANTLR bug that causes a NullPointerException when trying
   * to recover at the end of the input stream. Also ensure that inserted
   * tokens are of type MyToken.
   */
  @Override protected Object getMissingSymbol(IntStream input,
      RecognitionException re, int expectedTokenType, BitSet follow) {
    String tokenText = null;

    if (expectedTokenType == Token.EOF) {
      tokenText = "<missing EOF>";
    } else if (expectedTokenType >= 0 &&
               expectedTokenType < getTokenNames().length) {
      tokenText = "<missing " + getTokenNames()[expectedTokenType] + ">";
    } else {
      throw new Error("invalid expectedTokenType " + expectedTokenType);
    }

    MyToken t = new MyToken(expectedTokenType, tokenText);
    Token current = ((TokenStream) input).LT(1);
    if (current == null || current.getType() == Token.EOF) {
      current = ((TokenStream) input).LT(-1);
    }
    if (current != null) {
      // If there are any other position-related fields in your MyToken
      // class, set them here.
      t.setLine(current.getLine());
      t.setCharPositionInLine(current.getCharPositionInLine());
    }
    t.setChannel(DEFAULT_TOKEN_CHANNEL);

    return t;
  }
}
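
To see why overriding the recovery hook makes the cast succeed, here is a
minimal self-contained sketch that does not depend on the ANTLR runtime.
All class names in it (PlainToken, CustomToken, BaseRecovery, FixedRecovery,
RecoveryDemo) are hypothetical stand-ins, not ANTLR classes: PlainToken
plays the role of CommonToken, CustomToken the role of MyToken, and
matchWithRecovery mimics the (MyToken) cast in a generated rule. It only
illustrates the mechanism, under those assumptions.

```java
// Stand-ins for illustration only; these are NOT the ANTLR runtime classes.

// Plays the role of CommonToken.
class PlainToken {
    final int type;
    final String text;

    PlainToken(int type, String text) {
        this.type = type;
        this.text = text;
    }
}

// Plays the role of MyToken: a subclass that mirrors the parent's constructor.
class CustomToken extends PlainToken {
    CustomToken(int type, String text) {
        super(type, text);
    }
}

// Plays the role of the base parser, whose recovery hook builds a plain token.
class BaseRecovery {
    protected Object getMissingSymbol(int expectedType) {
        return new PlainToken(expectedType, "<missing " + expectedType + ">");
    }

    // Mimics generated rule code that casts to the custom token label type.
    CustomToken matchWithRecovery(int expectedType) {
        return (CustomToken) getMissingSymbol(expectedType);
    }
}

// Plays the role of a parser that overrides the hook, as in item b) above.
class FixedRecovery extends BaseRecovery {
    @Override
    protected Object getMissingSymbol(int expectedType) {
        return new CustomToken(expectedType, "<missing " + expectedType + ">");
    }
}

class RecoveryDemo {
    public static void main(String[] args) {
        try {
            // The hook returns a PlainToken, so the downcast fails.
            new BaseRecovery().matchWithRecovery(4);
        } catch (ClassCastException e) {
            System.out.println("unpatched: ClassCastException");
        }
        // The overridden hook returns a CustomToken, so the downcast succeeds.
        CustomToken t = new FixedRecovery().matchWithRecovery(4);
        System.out.println("patched: " + t.text);
    }
}
```

The cast itself never changes; only the object returned by the overridden
hook does, which is why the workaround belongs in the parser members rather
than in the grammar rules.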

-- 
David-Sarah Hopwood  ⚥  http://davidsarah.livejournal.com
