[antlr-interest] question on greedy matching

Thu Oct 20 00:50:22 PDT 2005

Hi,

On 10/18/05, Andrei Alexandrescu <andrei at metalanguage.com> wrote:
> I have a question on how to solve a particular greedy/non-greedy
> matching with ANTLR.
>
> Consider the following language design for a sort of a try statement:
>
> try statement ( catch (ID ID) '{' statement '}' )+
>
> Clearly not an award-winning design, because it has the "dangling catch"
> ambiguity:
>
> try
>      try
>          foo();
>      catch (E1 e) { bar(); }
> catch (E2 e) { baz(); }

I assume the try statement is an alternative of the statement rule (or
somewhere below it)?

> There's a need for a meta-rule to decide where the last catch comes. So
> here it is:
>
> "One catch always binds to the closest try statement. The others always
> bind to the outermost try statement possible."
>
> I'm not sure I was clear enough, but the intent of the rule above binds
> the first catch in the example to the inner try, and the second catch to
> the outer try, as the indentation suggests.
>
> Moreover, in the example:
>
> try
>      try
>          foo();
>      catch (E1 e) { bar(); }
> catch (E2 e) { baz(); }
> catch (E3 e) { baa(); }
>
> again the rule makes the statements bind as the indentation suggests.

(Side note: This is awfully horrid language design.... adding curlies
around the statement inside the try would make life a lot more
pleasant.)

I guess the following is not allowed?

try
     try
         foo();
     catch (E1 e) { bar(); }
     catch (E2 e) { baz(); }
     bbb();
catch (E3 e) { baa(); }

> So, the question is, what would be the cleanest way to express that in
> ANTLR (preferrably without semantic predicates)? If I understand
> correctly, it's not an issue of setting the greedy option, because the
> same try construct must be parsed greedily or not, depending on whether
> the parent is a try statement as well.

No predicates is probably not possible without switching to
treebuilding and fixing things in a 2nd pass.

Not 100% sure but the dirty trick department might work:

try_rule { ncatch = 0; }:
try { this.ntry++; } statement
(  { if( ntry > 1 && ncatch > 1 ) break; }:
   catch { ncatch++; }  (ID ID) '{' statement '}'
)+ { this.ntry--; }

Alternatively:

start out with a try rule with only one catch. Then define an extra
alternative (probably in the statement rule to collect trailing catch
statements) Some checks are probably necessary afterwards to see if
everything made sense. The catch all might need a predicate to prevent
it from being entered if the preceding statement was not a try.

Going multipass is probably the cleanest.

Hope this helps (and that it was not too late in the evening when I
wrote this ;) ),

Ric