[antlr-interest] Help with infix expression evaluator grammar

Thu Apr 9 08:36:58 PDT 2009

ANTLR folks,

I've been using a grammar for years to evaluate infix mathematical
expressions with function calls and things like that. It has performed
beautifully. I recently noticed that some (meaningless) junk at the end
of the expression was sometimes ignored.

For example, this expression should return 4:

2+2

It does return 4 (otherwise it would be a pretty lousy evaluator).
However, this expression /also/ returns 4:

2+2)

The evaluator is rigged to allow parens to group things so that (2+2)*2
yields 8 instead of 6 (which would be the case without the parens. It
appears that the trailing parens is being ignored which is nice in some
ways but not nice in others: you can end up with expressions whose
parens balance but are semantically incorrect, which is why I'm here.

Attached is the entire grammar (which uses ANTLR 2.7.3 to build and
execute). "expr" is the top-level production and also the function that
I call on my parser in order to obtain my own AST (as you can see from
the grammar).

At the end of the 'expr' production, you can see a comment (line 46)
that says "This (EOF!)? seems totally sketchy, but it works!".
Apparently, at the time I wrote the grammar, I knew something was amiss,
but wasn't sure how to fix it.

If I remove the ? from the (EOF!), making it required, my simple tests
(such as "2+2)") correctly fail to parse but other things (such as
"(2+2)") also fail to parse ("expecting EOF, found ')').

It seems like the most logical fix is to create a new top-level
production that just says:

 return=expr (EOF!)

...and remove the EOF entirely from my 'expr' production.

I'm pretty sure that would work, but I'm concerned that I may be missing
something more fundamental.

I'd appreciate some feedback on this issue, and I'd love it if anyone
had any comments on the grammar in general: if there are any obvious
goofs I've made or optimizations that would make sense.

At some point, I'd like to move-up to the later versions of ANTLR
(especially those that can direct-generate Java bytecode to parse these
things), so if there are any modifications I'd need to make to my
grammar for that purpose, I'd appreciate feeback, too.

Thanks,
-chris
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: expression.g
Url: http://www.antlr.org/pipermail/antlr-interest/attachments/20090409/06288d56/attachment.pl 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 259 bytes
Desc: OpenPGP digital signature
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20090409/06288d56/attachment.bin