[antlr-interest] errors reported in console

Gavin Lambert antlr at mirality.co.nz
Tue Mar 17 04:44:22 PDT 2009


At 23:56 17/03/2009, Java Developer wrote:
>My example grammar file is shown below (end). I am using 
>ANTLRWorks 1.2.3. When I enter "bab be baj" in the interpreter, I 
>see the parse tree correctly. However, I see also that there are 
>errors reported in the console. When I look at the console, I see 
>the output below. What does this mean? Is this a problem with my 
>grammar?
>
>[06:41:35] error(208): test.g:24:1: The following token 
>definitions can never be matched because prior tokens match the 
>same input: SECOND
[...]
>CODE:    (FIRST)? SECOND (THIRD)?;
>
>FIRST:
>     ('b');
>SECOND:
>     ('a'|'e'|'i'|'o'|'u');
>THIRD:
>     ('b'|'j'|'v'|'g'|'s'|'m'|'d');

The problem is that all of these rules are top-level lexer rules, 
and thus considered equal candidates for token generation.  Since 
FIRST and THIRD are both optional inside CODE, CODE matches 
everything that SECOND matches, so there is no possible way for a 
SECOND token to be generated -- it will always come out as a CODE 
instead (since in the event of a tie for longest-match, rules listed 
first win).

To get the behaviour you probably intended, you need to change 
FIRST, SECOND, and THIRD to be fragment rules instead of top-level 
rules.
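
As a rough sketch of that change (covering only the lexer rules 
quoted above, and assuming the elided parts of the grammar stay as 
they are), the rules might look something like this:

    CODE:    (FIRST)? SECOND (THIRD)?;

    // fragment rules never produce tokens of their own; they can 
    // only be referenced from other lexer rules such as CODE
    fragment FIRST:
         ('b');
    fragment SECOND:
         ('a'|'e'|'i'|'o'|'u');
    fragment THIRD:
         ('b'|'j'|'v'|'g'|'s'|'m'|'d');

With this arrangement CODE is the only one of the four rules that 
can produce a token, so SECOND no longer competes with it and the 
error(208) warning should no longer apply to SECOND.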

(If you're new to ANTLR: another key point you might be missing is 
that lexing stands alone.  The lexer runs in an initial pass by 
itself without any influence from parser rules.)

>[06:41:35] Interpreting...

Avoid using the interpreter.  It's ok on very simple grammars but 
once you get past trivial complexity it will mislead you.  Write 
proper unit tests instead, and use ANTLRWorks' debug mode.


