[antlr-interest] uh oh...trouble in meaning of (..)=> pred!!!

Thu Mar 15 11:06:30 PDT 2007

On Mar 15, 2007, at 1:15 AM, Dr. Kocher, Hartmut wrote:

> Hi Terence,
>
> I discussed the issue with my colleague who has a very good  
> compiler/parser background.
>
> As you said, not evaluating the predicate is an optimization.  
> Therefore, we think the correct solution is that (..)=> is  
> evaluated (with backtracking if necessary). When I specify a  
> predicate I know that I'm going to take a hit for this. So the  
> normal solution is to always add the predicate. However, you could  
> still do an optimization. If you expand your lookahead automaton,  
> you could still decide early which alt to take. Only if this cannot  
> easily be predicted in advance, you start adding the predicate  
> code. (Actually, the optimization could be added later, say ANTLR  
> 3.1, because it would only speed up things).

With the new gated predicate version of (...)=>, the pred is always  
evaluated because that is what you said.  I cannot  know whether you  
have a pred that matches a longer string than the alt like:

A : ('a' DIGIT)=> 'a'
    | 'a'
    | 'b'
    ;

For this, 'a9' must match first alt and 'a' must match 2nd alt.  No  
pred is evaluated for 'b' though.

> The current solution is less than optimal because it's faster, but  
> not always correct (I could write an even faster lexer, if it  
> doesn't have to be correct :-) ).

ha hah hah!

> I think it's better to generate correct lexer/parsers, then to  
> optimize them. Therefore, the rule should be
> 1) always add the predicate

yup.

> 2) leave it out if you can resolve it out correctly in advance.

that could be optimized out later if we can know.  Might be  
undecidable; figuring out if L(G1)==L(G2) is anyway if I remember.

Thanks!
Ter