[antlr-interest] How to recognize unmatchable input?

Marco Trudel marco at mtsystems.ch
Tue Jan 4 06:06:12 PST 2011


Dear all

##### grammar #####

grammar Demo;
main : ONE* ;
ONE  : '1' {System.out.print("(1) ");} ;
A    : 'a' {System.out.print("(a) ");} ;
WS   : ' ' {$channel=HIDDEN;} ;

##### code #####

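import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CommonTokenStream;

public static void main(String[] args) throws Exception {
    // Lex and parse the test input "a 1 1".
    DemoLexer lexer = new DemoLexer(new ANTLRStringStream("a 1 1"));
    DemoParser parser = new DemoParser(new CommonTokenStream(lexer));
    parser.main();
    // Report how many syntax errors each stage counted.
    System.out.println("Lexer: " + lexer.getNumberOfSyntaxErrors());
    System.out.println("Parser: " + parser.getNumberOfSyntaxErrors());
}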


Working with antlr-3.3-complete.jar and libantlr3c-3.3-SNAPSHOT.tar.gz, 
for the input "a 1 1" I get:
- Java target: (a), Lexer: 0, Parser: 0
- C target: (1) (1), Lexer: 0, Parser: 0

Am I doing something undefined here? I'm surprised that the two targets 
produce a different result. I would expect an error since the input 
seems unmatchable to me.

If I change "main" to
	main : ONE* EOF ;
I get:
- Java target: (a) (1), Lexer: 0, Parser: 1
   -> With the warning: line 1:0 missing EOF at 'a'
- C target: (1) (1), Lexer: 0, Parser: 0
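
For illustration, here is a hand-written Java sketch of the difference between the two variants of "main". It is not the code ANTLR generates, and the names PrefixMatchSketch, tokenize and parseMain are made up for this example. It assumes that a rule without EOF may stop at the first token it cannot match and return without complaint, which is consistent with the Java results above. Error recovery and the lexer actions are not modeled.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written illustration only; this is NOT the code ANTLR generates.
class PrefixMatchSketch {

    // Split the input into one-character tokens and drop spaces,
    // roughly like WS being sent to the hidden channel.
    static List<Character> tokenize(String input) {
        List<Character> tokens = new ArrayList<Character>();
        for (char c : input.toCharArray()) {
            if (c != ' ') {
                tokens.add(c);
            }
        }
        return tokens;
    }

    // Mimics "main : ONE* ;" (requireEof == false)
    // or "main : ONE* EOF ;" (requireEof == true).
    // Returns the number of syntax errors the rule would count.
    static int parseMain(String input, boolean requireEof) {
        List<Character> tokens = tokenize(input);
        int pos = 0;
        // ONE*: consume as many '1' tokens as possible; zero matches is still fine.
        while (pos < tokens.size() && tokens.get(pos) == '1') {
            pos++;
        }
        // Without EOF the rule returns here and ignores whatever input is left.
        if (requireEof && pos < tokens.size()) {
            return 1; // leftover input where EOF was expected
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("ONE*     on \"a 1 1\": " + parseMain("a 1 1", false));
        System.out.println("ONE* EOF on \"a 1 1\": " + parseMain("a 1 1", true));
        System.out.println("ONE* EOF on \"1 1\":   " + parseMain("1 1", true));
    }
}
```

In this sketch the first call counts 0 errors because the loop matches nothing and the rule returns, the second counts 1 because 'a' is left over where EOF was required, and the third counts 0.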


Questions:
- How do I recognize that an input did not match my grammar?
- Which of the targets is doing it right? Neither, only one, or both?


In my real project I have something very similar, but with completely 
different behavior. The Java target tells me "no viable alternative at 
input" and gives me a parser error. The C target just segfaults :-/
So I'm really interested in how to do this right.


Thanks
Marco

