[antlr-interest] Java grammar accepting junk

Fri Aug 15 15:13:03 PDT 2008

On Aug 15, 2008, at 11:23 PM, Terence Parr wrote:

> Well, don't take "succeeded" very seriously... it just prints
>
>                        System.out.println("finished parsing OK");
>
> if there was no exception... all of the recognition exceptions are  
> caught inside the parser.  You could check the parser for the number  
> of errors.

FWIW, I'm currently working on a more flexible ErrorListener design.  
Partly because we discussed it recently, partly because I need it now ;)
The basic idea is to have the internal exceptions (which are
caught inside the parser) passed to a listener, so you can more
easily do error reporting and checking.

Currently, to see those exceptions, you can either throw exceptions
yourself within the @rulecatch action (rethrow, if that's what you
want) or make a superclass and override the error methods (one or more
of them, depending on what you need to do). Since Java has no mixins,
this gets messy if you also want to put other things in that
superclass (say, all your members, rather than writing them into
@members{}).
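
To make the superclass route concrete, here is a minimal sketch. It does
not use the ANTLR runtime: StandInRecognizer is a made-up stand-in for
BaseRecognizer, and FailFastRecognizer is a hypothetical class of the kind
you would name in the grammar's superClass option. The only point is the
shape of the override: the error hook throws instead of printing and
recovering, so a driver's try/catch finally sees something.

```java
// Stand-in for ANTLR's BaseRecognizer (an assumption for illustration):
// errors are reported and recovered from internally, so by default the
// caller never sees an exception.
class StandInRecognizer {
    protected int syntaxErrors = 0;

    // Hook invoked for every recognition error.
    public void reportError(Exception e) {
        syntaxErrors++;
        System.err.println(e.getMessage());
    }

    // Toy rule: input that does not start with "class" is an error,
    // but the method still returns normally afterwards.
    public void parse(String input) {
        if (!input.startsWith("class")) {
            reportError(new Exception("line 1:0 no viable alternative at input"));
        }
    }
}

// Hypothetical superclass for the generated parser: turn every reported
// error into an unchecked exception so parsing stops at the first one.
class FailFastRecognizer extends StandInRecognizer {
    @Override
    public void reportError(Exception e) {
        throw new IllegalStateException(e.getMessage(), e);
    }
}
```

With the real runtime the method to override would be
reportError(RecognitionException) on the parser's base class; everything
else above is invented for the sketch.
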
In my current approach you can set an error listener (should probably  
allow more than one, but one step at a time) on the recognizer and  
then you can intercept the calls. With a little trickery you can also  
just forward them to BaseRecognizer again, so you can just check for  
errors. I do that in the default implementation, but I have one that  
queues everything up to inspect later. I use the latter in my test  
suite. Nice thing is, you don't need to modify your grammar at all.
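
Roughly, the listener idea described above could look like the sketch
below. None of these types exist in ANTLR 3.1: SketchErrorListener,
PrintingListener, QueueingListener and ListeningRecognizer are all
hypothetical names, and the toy parse() method only stands in for a
generated parser that reports errors and carries on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener interface; ANTLR 3.1 itself does not ship this.
interface SketchErrorListener {
    void syntaxError(String message);
}

// Default behaviour: just print the message, as the recognizer would anyway.
class PrintingListener implements SketchErrorListener {
    public void syntaxError(String message) {
        System.err.println(message);
    }
}

// Queues everything up so a test can inspect it after parsing.
class QueueingListener implements SketchErrorListener {
    private final List<String> errors = new ArrayList<String>();

    public void syntaxError(String message) {
        errors.add(message);
    }

    public List<String> getErrors() {
        return errors;
    }
}

// Stand-in recognizer (an assumption, not ANTLR's BaseRecognizer).
class ListeningRecognizer {
    private SketchErrorListener listener = new PrintingListener();

    public void setErrorListener(SketchErrorListener listener) {
        this.listener = listener;
    }

    // Toy "parse": reports each line that starts with '#', then carries on.
    public void parse(String input) {
        String[] lines = input.split("\n");
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].startsWith("#")) {
                listener.syntaxError("line " + (i + 1)
                        + ":0 no viable alternative at character '#'");
            }
        }
    }
}
```

A test would install a QueueingListener before parsing and then assert on
getErrors(), which is the "inspect later" use mentioned above; the grammar
stays untouched either way.
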

If you just need to know if there were any errors, you can add a bit  
of code to your grammar and call it after parsing:

@parser::members{
public boolean seenErrors() {
	return this.state.syntaxErrors > 0;
}
}

I think state is protected, so you need to add code inside the parser
(as above) to read it from your driver.
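
On the driver side, the check might then look something like this. It is
only a sketch: StubJavaParser is a made-up stand-in for the generated
parser (with the seenErrors() member from above), and DriverSketch is a
hypothetical driver. The point is that success is decided from the error
state, not from the absence of an exception.

```java
// Stand-in for the generated parser with the seenErrors() member added.
// The constructor argument fakes the error count a real parse would produce.
class StubJavaParser {
    private final int syntaxErrors;

    StubJavaParser(int syntaxErrors) {
        this.syntaxErrors = syntaxErrors;
    }

    // Stand-in for the start rule: returns normally even after errors,
    // because recognition errors are caught inside the parser.
    public void sourceFile() {
    }

    public boolean seenErrors() {
        return syntaxErrors > 0;
    }
}

class DriverSketch {
    // Run the start rule, then ask the parser whether it saw any errors.
    static String parseAndReport(StubJavaParser parser) {
        parser.sourceFile();
        return parser.seenErrors() ? "Failed" : "Succeeded";
    }
}
```
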

cheers,
-k

> On Aug 15, 2008, at 2:04 PM, Ron Hunter-Duvar wrote:
>
>> Hi,
>>
>> I'm doing some Java parsing with Antlr 3.1 and the Java.g grammar  
>> from Antlr.org. When I pass it non-Java input (e.g. shell scripts),  
>> it complains a lot, but still acts as if the parsing succeeded. I  
>> noticed that the grammar didn't have an EOF token to force it to go  
>> to end of file, so I added a new top level rule:
>>
>> sourceFile
>> : compilationUnit EOF
>> ;
>>
>> and invoked it with that new target. Seemed simple enough. But it  
>> didn't help. The parser still happily accepts garbage:
>>
>> Parsing: test.sh
>> line 1:0 no viable alternative at character '#'
>> line 5:0 no viable alternative at character '#'
>> line 5:1 no viable alternative at character '#'
>> line 5:2 no viable alternative at character '#'
>> line 1:1 no viable alternative at input '!'
>>  Succeeded
>>
>> The first and last line of output are from my driver code.  
>> Basically I was expecting the parser to throw an exception, which  
>> would have counted as a failure. Since it didn't, it counts it as a  
>> success.
>>
>> Maybe I'm not understanding how error reporting works in Antlr 3.1.  
>> I've worked quite a bit with Antlr 2.7, but I'm new to Antlr 3. I  
>> don't have the book, and haven't found anything in the wiki that  
>> explains this. Perhaps someone can enlighten me?
>>
>> Thanks,
>> Ron
>>
>> -- 
>> Ron Hunter-Duvar | Software Developer V | 403-272-6580
>> Oracle Service Engineering
>> Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5
>>
>> All opinions expressed here are mine, and do not necessarily  
>> represent
>> those of my employer.
>>
>

-- 
Kay Röpke
http://classdump.org/