[antlr-interest] Differences in Java and C# error handling

Maik Schmidt contact at maik-schmidt.de
Wed Aug 10 02:42:52 PDT 2011


Hi!

At the moment I am reading "Language Implementation Patterns" and I am
translating the book's examples to C#. While doing so I've seen some
differences in error handling that I cannot explain. I'm using the following
grammar:

grammar Graphics;

file : command+ ; // a file is a list of commands

command : 'line' 'from' point 'to' point ;

point : INT ',' INT ; // E.g., "0,10"

INT : '0'..'9'+ ; // lexer rule to match 1-or-more digits

/** Skip whitespace */
WS : (' ' | '\t' | '\r' | '\n') { Skip(); } ;

I translate it using 'java org.antlr.Tool Graphics.g3' into
GraphicsLexer.java and GraphicsLexer.java and I use the following driver
code for controlling the parser:

public static void main(String[] args) throws Exception {
  CharStream input = new ANTLRFileStream(args[0]);
  GraphicsLexer lex = new GraphicsLexer(input);
  CommonTokenStream tokens = new CommonTokenStream(lex);
  GraphicsParser p = new GraphicsParser(tokens);
  p.file();
}

Also I have a file named invalid.txt that contains the following invalid
input sentence:

line to 2,3

When I pass this file to the driver code, the program outputs the following
message:

line 1:5 mismatched input 'to' expecting 'from'

This is exactly what I would expect and it's exactly the behavior Terrence
describes in his book.

Then I have translated the grammar to C#, so it looks like this:

grammar Graphics;

options {
    language=CSharp3;
    TokenLabelType=CommonToken;
    output=AST;
    ASTLabelType=CommonTree;
}

@parser::namespace { GraphicsTool }
@lexer::namespace { GraphicsTool }

public

file : command+ ; // a file is a list of commands

command : 'line' 'from' point 'to' point ;

point : INT ',' INT ; // E.g., "0,10"

INT : '0'..'9'+ ; // lexer rule to match 1-or-more digits

/** Skip whitespace */
WS : (' ' | '\t' | '\r' | '\n') { Skip(); } ;

Using the ANTLR extensions for Visual Studio 2010 I have created
GraphicsLexer.cs and GraphicsParser.cs. Then I use the following driver
code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Antlr.Runtime;

namespace GraphicsTool
{
    class Program
    {
        static void Main(string[] args)
        {
            ANTLRStringStream input = new ANTLRFileStream(args[0]);
            GraphicsLexer lex = new GraphicsLexer(input);
            CommonTokenStream tokens = new CommonTokenStream(lex);
            GraphicsParser p = new GraphicsParser(tokens);
            p.file();
        }
    }
}

When I invoke the program with invalid.txt it silently ignores all errors
and stops without a message.

I have added

@rulecatch {
    catch (RecognitionException ex) {
     System.Console.WriteLine(ex);
        throw ex;
    }
}

to the grammar, so at least the program outputs an error message:

MismatchedTokenException('to'!='from')
MismatchedTokenException('to'!='from')

Unhandled Exception: Antlr.Runtime.MismatchedTokenException: A recognition
error occurred.
at GraphicsTool.GraphicsParser.file() in
C:\Users\mschmidt2\Documents\VisualStudio
2010\Projects\GraphicsTool\GraphicsTool\obj\x86\Debug\GraphicsParser.cs:line
174
at GraphicsTool.Program.Main(String[] args) in
C:\Users\mschmidt2\Documents\Visual Studio
2010\Projects\GraphicsTool\GraphicsTool\Program.cs:line 20

Of course, this isn't as nice as the Java behavior and I am wondering what
is causing the different behavior.

Is it possible to use the Java parser's behavior in C#?

Cheers,
Maik


More information about the antlr-interest mailing list