[antlr-interest] Tokens vs. Characters in Lexer/MismatchedTokenException
Rick Mann
rmann at latencyzero.com
Sun Jan 18 13:10:19 PST 2009
On Jan 18, 2009, at 12:55:46, Terence Parr wrote:
>
> On Jan 18, 2009, at 12:38 PM, Rick Mann wrote:
>
>> As I'm working on my language target, I see that Lexer.match(int c) in
>> the Java target can create a MismatchedTokenException(), passing c in
>> to its constructor.
>
> weird. in the Java version, it treats it as a token type.
Sorry I wasn't clear: I'm referring to the Java version. As I examine
this further, I realize it's combining Characters and Token *Types*,
not Tokens. This is still a little apples-and-oranges to me.
Lexer has two match() methods:
Lexer.match(String s)
Lexer.match(int c)
When the ANTLR tool builds the Java recognizer for the example grammar
in the codegen wiki page, it creates a rule mZERO() in the Lexer
subclass that calls match('0'). At this point we're passing a char as
an int parameter, which should be legal without warnings. Examining
Lexer.match(int c) reveals this:
    public void match(int c) throws MismatchedTokenException {
        if ( input.LA(1)!=c ) {
            if ( state.backtracking>0 ) {
                state.failed = true;
                return;
            }
            MismatchedTokenException mte =
                new MismatchedTokenException(c, input);
            recover(mte); // don't really recover; just consume in lexer
            throw mte;
        }
        input.consume();
        state.failed = false;
    }
The only constructor of MismatchedTokenException that takes arguments is:
    public MismatchedTokenException(int expecting, IntStream input) {
        super(input);
        this.expecting = expecting;
    }
And this.expecting is declared like this:
    public int expecting = Token.INVALID_TOKEN_TYPE;
Implying that we've now converted a character to a token type
(semantically, that is).
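To make the concern concrete, here is a minimal sketch. It uses hypothetical stand-in classes (CharVsTokenTypeSketch, MismatchSketch, and a made-up ZERO constant), not the ANTLR runtime itself. It only mirrors the shape of the code quoted above: a char widens silently to int, lands in a field named "expecting", and from then on nothing records whether that int was a character code or a token type.

```java
// Hypothetical stand-ins for illustration only; these are NOT the ANTLR runtime classes.
public class CharVsTokenTypeSketch {
    // Made-up token type value, standing in for a constant a generated lexer might define.
    public static final int ZERO = 4;

    // Same shape as the exception discussed above: one int field, no record of its domain.
    public static class MismatchSketch extends Exception {
        public final int expecting;

        public MismatchSketch(int expecting) {
            this.expecting = expecting;
        }
    }

    // Lexer-style match: here the expected value is a character code, not a token type.
    public static void matchChar(int actual, int c) throws MismatchSketch {
        if (actual != c) {
            throw new MismatchSketch(c);
        }
    }

    // Returns the value stored in "expecting" after a mismatch, or -1 if the match succeeded.
    public static int expectingAfterCharMismatch(char expected, char actual) {
        try {
            matchChar(actual, expected); // both chars widen to int without a cast or warning
            return -1;
        } catch (MismatchSketch e) {
            return e.expecting;
        }
    }

    public static void main(String[] args) {
        int expecting = expectingAfterCharMismatch('0', 'x');
        // The same int reads sensibly as a character...
        System.out.println("as character: " + (char) expecting);
        // ...but code that assumes a token type would compare it against values like ZERO.
        System.out.println("as int: " + expecting + ", equals ZERO? " + (expecting == ZERO));
    }
}
```

The compiler accepts all of this because both domains are plain ints; the distinction exists only in how the reader of the field chooses to interpret it.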
>
>
>> The exception class seems to treat that int as a token. I wouldn't
>> have thought Tokens and Characters to be interchangeable. What am I
>> missing?
>
> if that were the case, it wouldn't compile. Are you sure that it is
> treating it as a token?
>
> Ter
--
Rick