[antlr-interest] Two More Bugs in ANTLRv4

Jan Finis finis at in.tum.de
Wed Feb 29 09:12:33 PST 2012


The funny thing is that Honey Badger DID support this kind of grammar 
(see my earlier mail with a running example). Maybe due to a bug, but 
that shows that it must be somehow possible in a top-down manner.

On 02/29/2012 06:00 PM, Ron Burk wrote:
>>   It's basically saying that
>> one possible form of expr is two exprs (which in turn can each be
>> another two exprs).  With no other distinguishing tokens there's
>> no way to exit that path.  So that way lies infinite recursion,
>> madness, and/or Sparta.
> Hmm, looks like a perfectly fine grammar to me.
> Not one a top-down parser would like, of course,
> but no problem for bottom-up algorithms. expr
> can always produce 'x', so there's nothing "infinite"
> other than the normal ability of a grammar to produce
> an infinite number of strings (it's not non-terminating).
>
>> Was this from a real usage attempt (if so, please explain more
>> what you were trying), or were you just experimenting?
> Awk introduced implicit concatenation about 35
> years ago. Not sure if there are any earlier examples of
> the implicit binary operator.
>
> This is a one-off case, but a well-established one.
> A parser generator that truly supported expressions
> in their "natural" language should not only allow
> the implicit operator, but allow the user to specify
> its precedence and associativity, IMHO.
>
> This is certainly doable bottom-up. One knows when an
> expr has been seen and that the next token could start
> an expr and can therefore "pretend" that the implicit
> operator exists. Haven't thought about doing it top-down. I think
> the harder problem is intelligent guidance so the user doesn't
> make a mess. For example, it's problematic to try to add
> this operator to C, and the parser generator has to know
> quite a lot about expressions (apart from grammars) in order
> to give an intelligent explanation of why that is (can't use
> implicit binary if an outfix operator is overloaded to be a
> postunary operator as well, as is the case in C with
> '(X')' and 'X(X)').
>
> How does Awk do it then, since it also uses parens
> for both grouping and function call? Unlike C, Awk
> doesn't allow a function "name" to be an expression.
> (Awk source code at google: The One True Awk).
> More complications! (Although, complications like
> that particular one get unified, I think, if you use LR(0)
> and then only examine inadequate states to see what
> problems are left over for precedence/associativity
> to attempt to resolve).
>
> This is exactly why parser generators don't "really"
> support expressions. To really go into that domain
> requires embedding a lot of extra knowledge that's
> pretty hard to get right, since it's not on a solid
> mathematical foundation to start with. If you
> don't try to jump in with both feet and handle
> all the messes, then you only "sorta" support
> expressions, like Yacc/Bison, and have to
> (like them) leave a note mentioning that the user's
> probably going to get into trouble if s/he strays outside
> a few well-established usages.
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address



More information about the antlr-interest mailing list