[antlr-interest] Strange parsing behavior

Tue Apr 7 09:04:40 PDT 2009

I am working with an ANTLR grammar for a custom language and have
encountered a strange parsing issue. Here is a highly simplified grammar for
the issue I've found:

grammar Test;
@members {
    public static void main(String[] args) throws Exception {
        TestLexer lex = new TestLexer(new ANTLRFileStream(args[0]));
        CommonTokenStream tokens = new CommonTokenStream(lex);
        TestParser parser = new TestParser(tokens);
        try {
            parser.prog();
        } catch (RecognitionException e) {
            e.printStackTrace();
        }
    }
}

prog : b=ID '{' s=ID ';' '}'
       { System.out.println("Found block: " + $b.text); } ;
ID : ('A'..'Z' | 'a'..'z') ('A'..'Z' | 'a'..'z' | '0'..'9' | '_')* ;
WS : (' '|'\r'|'\t'|'\u000C'|'\n'|'\u0000') {$channel=HIDDEN;} ;

If I give it an input of:

foo { a; }; bar { b;}

it displays:

Found block: foo

It does not flag any errors, and all blocks following the semicolon are
ignored. It works correctly without the semicolon (the normal case),
displaying both block names, but it should at least flag some kind of error
if the semicolon is there. I've tried this with ANTLR v3.1.1 and v3.1.3,
with both the Java and C targets, and all behave the same. Does anyone know
what is going on?
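
One possibility I have not ruled out, offered as a guess rather than a
diagnosis: as far as I understand, an ANTLR v3 start rule that does not
mention EOF simply returns once it can match no more input, so the leftover
tokens are never examined. The sketch below shows what requiring EOF might
look like. The `block` rule name and the `block+` loop are assumptions about
the full grammar, not part of the simplified one above.

```
grammar Test;

// Assumption: the full grammar's start rule loops over blocks.
// The explicit EOF asks the parser to confirm that all input was consumed,
// so a stray ';' between blocks should be reported instead of ending the parse.
prog  : block+ EOF ;

// Hypothetical rule name; the body is the block pattern from the grammar above.
block : b=ID '{' s=ID ';' '}'
        { System.out.println("Found block: " + $b.text); } ;

ID : ('A'..'Z' | 'a'..'z') ('A'..'Z' | 'a'..'z' | '0'..'9' | '_')* ;
WS : (' '|'\r'|'\t'|'\u000C'|'\n'|'\u0000') {$channel=HIDDEN;} ;
```

If that guess is right, the input with the extra semicolon should produce a
syntax error at the ';' after the first block rather than stopping silently.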

Thanks!

- Dan -