[antlr-interest] question on greedy matching

Andrei Alexandrescu andrei at metalanguage.com
Thu Oct 20 01:55:30 PDT 2005


Ric Klaren wrote:
> On 10/18/05, Andrei Alexandrescu <andrei at metalanguage.com> wrote:
>>try statement ( catch (ID ID) '{' statement '}' )+
>>
>>Clearly not an award-winning design, because it has the "dangling catch"
>>ambiguity:
>>
>>try
>>     try
>>         foo();
>>     catch (E1 e) { bar(); }
>>catch (E2 e) { baz(); }
> 
> 
> I assume the try statement is an alternative of the statement rule (or
> somewhere below it)?


That is correct. The try statement is an alt of statement.

> (Side note: This is awfully horrid language design.... adding curlies
> around the statement inside the try would make life a lot more
> pleasant.)

Definitely. But it's the most sensible example I could find for a 
situation where you want a non-greedy parsing. (Other examples?) In 
addition, it's an excellent exercise for understanding how to do some 
nontrivial parsing with antlr.

Besides, think o the even more horrid rule:

statement ::= ...
   | try statement ( catch (ID ID) statement )+

That is, eliminate curlies from the catch as well. Now you need an awful 
lot of disambiguation, because how are you going to interpret:

try
      try
          foo();
      catch (E1 e) try bar(); catch (E3 e) zip();
catch (E2 e) baz();

> I guess the following is not allowed?
> 
> try
>      try
>          foo();
>      catch (E1 e) { bar(); }
>      catch (E2 e) { baz(); }
>      bbb();
> catch (E3 e) { baa(); }

You can't do that, you'd have to add curlies:

try {
      try
          foo();
      catch (E1 e) { bar(); }
      catch (E2 e) { baz(); }
      bbb();
}
catch (E3 e) { baa(); }

In other words, it's a Java-like language, not a Python-like language 
(indentation is irrelevant).

> No predicates is probably not possible without switching to
> treebuilding and fixing things in a 2nd pass.

That's what I thought.

> Not 100% sure but the dirty trick department might work:
> 
> try_rule { ncatch = 0; }:
> try { this.ntry++; } statement
> (  { if( ntry > 1 && ncatch > 1 ) break; }:
>    catch { ncatch++; }  (ID ID) '{' statement '}'
> )+ { this.ntry--; }

Ok, I understand.

> Alternatively:
> 
> start out with a try rule with only one catch. Then define an extra
> alternative (probably in the statement rule to collect trailing catch
> statements) Some checks are probably necessary afterwards to see if
> everything made sense. The catch all might need a predicate to prevent
> it from being entered if the preceding statement was not a try.

I believe this is not possible with syntactic predicates. You have to 
have semantic predicates, I think. Actually that's my fundamental 
question. Another fundamental question is how would a PEG deal with that 
(see http://pdos.csail.mit.edu/~baford/packrat/).

> Hope this helps (and that it was not too late in the evening when I
> wrote this ;) ),

Thanks, Ric.


Andrei


More information about the antlr-interest mailing list