[antlr-interest] Differences in Java and C# error handling
Maik Schmidt
contact at maik-schmidt.de
Wed Aug 10 00:06:54 PDT 2011
Hi!
At the moment I am reading "Language Implementation Patterns" and I am
translating the book's examples to C#. While doing so I've seen some
differences in error handling that I cannot explain. I'm using the following
grammar:
grammar Graphics;
file : command+ ; // a file is a list of commands
command : 'line' 'from' point 'to' point ;
point : INT ',' INT ; // E.g., "0,10"
INT : '0'..'9'+ ; // lexer rule to match 1-or-more digits
/** Skip whitespace */
WS : (' ' | '\t' | '\r' | '\n') { skip(); } ;
I translate it using 'java org.antlr.Tool Graphics.g3' into
GraphicsLexer.java and GraphicsParser.java and I use the following driver
code for controlling the parser:
public static void main(String[] args) throws Exception {
    CharStream input = new ANTLRFileStream(args[0]);
    GraphicsLexer lex = new GraphicsLexer(input);
    CommonTokenStream tokens = new CommonTokenStream(lex);
    GraphicsParser p = new GraphicsParser(tokens);
    p.file();
}
Also I have a file named invalid.txt that contains the following invalid
input sentence:
line to 2,3
When I pass this file to the driver code, the program outputs the following
message:
line 1:5 mismatched input 'to' expecting 'from'
This is exactly what I would expect and it's exactly the behavior Terence
describes in his book.
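To make the shape of that message concrete, here is a minimal sketch in plain Java, without any ANTLR dependency, of how such a message can be put together from a line number, a column, the token that was found, and the token that was expected. The class and method names are my own inventions for illustration and are not ANTLR API. I am also assuming the column is 0-based, which matches 'to' starting at index 5 in the input above.

```java
// Illustrative sketch only: ErrorMessageSketch and its methods are hypothetical
// names chosen for this example; they are not part of the ANTLR runtime.
class ErrorMessageSketch {

    /** Returns the 0-based column at which tokenText first appears in sourceLine. */
    static int columnOf(String sourceLine, String tokenText) {
        return sourceLine.indexOf(tokenText);
    }

    /** Builds a "line L:C mismatched input 'x' expecting 'y'" style message. */
    static String format(int line, int column, String found, String expected) {
        return "line " + line + ":" + column
                + " mismatched input '" + found + "'"
                + " expecting '" + expected + "'";
    }

    public static void main(String[] args) {
        String sourceLine = "line to 2,3";        // contents of invalid.txt
        int column = columnOf(sourceLine, "to");  // where the unexpected token starts
        // Prints: line 1:5 mismatched input 'to' expecting 'from'
        System.err.println(format(1, column, "to", "from"));
    }
}
```

This only mimics the text of the message; it says nothing about how the generated parser decides which token it expected.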
Then I translated the grammar to C#, so it looks like this:
grammar Graphics;
options {
language=CSharp3;
TokenLabelType=CommonToken;
output=AST;
ASTLabelType=CommonTree;
}
@parser::namespace { GraphicsTool }
@lexer::namespace { GraphicsTool }
public
file : command+ ; // a file is a list of commands
command : 'line' 'from' point 'to' point ;
point : INT ',' INT ; // E.g., "0,10"
INT : '0'..'9'+ ; // lexer rule to match 1-or-more digits
/** Skip whitespace */
WS : (' ' | '\t' | '\r' | '\n') { Skip(); } ;
Using the ANTLR extensions for Visual Studio 2010 I have created
GraphicsLexer.cs and GraphicsParser.cs. Then I use the following driver
code:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Antlr.Runtime;

namespace GraphicsTool
{
    class Program
    {
        static void Main(string[] args)
        {
            ANTLRStringStream input = new ANTLRFileStream(args[0]);
            GraphicsLexer lex = new GraphicsLexer(input);
            CommonTokenStream tokens = new CommonTokenStream(lex);
            GraphicsParser p = new GraphicsParser(tokens);
            p.file();
        }
    }
}
When I invoke the program with invalid.txt it silently ignores all errors
and stops without a message.
I have added
@rulecatch {
    catch (RecognitionException ex) {
        System.Console.WriteLine(ex);
        throw ex;
    }
}
to the grammar, so at least the program outputs an error message:
MismatchedTokenException('to'!='from')
MismatchedTokenException('to'!='from')
Unhandled Exception: Antlr.Runtime.MismatchedTokenException: A recognition error occurred.
   at GraphicsTool.GraphicsParser.file() in C:\Users\mschmidt2\Documents\Visual Studio 2010\Projects\GraphicsTool\GraphicsTool\obj\x86\Debug\GraphicsParser.cs:line 174
   at GraphicsTool.Program.Main(String[] args) in C:\Users\mschmidt2\Documents\Visual Studio 2010\Projects\GraphicsTool\GraphicsTool\Program.cs:line 20
Of course, this isn't as nice as the Java behavior and I am wondering what
is causing the different behavior.
Is it possible to use the Java parser's behavior in C#?
Cheers,
Maik