[antlr-interest] uh oh...trouble in meaning of (..)=> pred!!!

Dr. Kocher, Hartmut h.kocher at pharmatechnik.de
Thu Mar 15 01:15:25 PDT 2007

Hi Terence,

I discussed the issue with my colleague who has a very good compiler/parser background.

As you said, not evaluating the predicate is an optimization. Therefore, we think the correct solution is that (..)=> is evaluated (with backtracking if necessary). When I specify a predicate I know that I'm going to take a hit for this. So the normal solution is to always add the predicate. However, you could still do an optimization. If you expand your lookahead automaton, you could still decide early which alt to take. Only if this cannot easily be predicted in advance, you start adding the predicate code. (Actually, the optimization could be added later, say ANTLR 3.1, because it would only speed up things).

The current solution is less than optimal because it's faster, but not always correct (I could write an even faster lexer, if it doesn't have to be correct :-) ).

I think it's better to generate correct lexer/parsers, then to optimize them. Therefore, the rule should be
1) always add the predicate
2) leave it out if you can resolve it out correctly in advance.

The simple predicates in the example below could still be optimized. If you specify more complex predicates then one must be prepared to take the hit.

Hope this helps.

Kind regards
Dr. Hartmut Kocher & Dr. Harald Müller
Cortex Brainware

-----Ursprüngliche Nachricht-----
Von: Terence Parr [mailto:parrt at cs.usfca.edu] 
Gesendet: Donnerstag, 15. März 2007 00:06
An: ANTLR Interest
Cc: Dr. Kocher, Hartmut
Betreff: uh oh...trouble in meaning of (..)=> pred!!!

Hi.  Harmut submitted a bug report, which I have converted to a parser:

grammar T;

x : (a d) => a
   | (b d) => b
   | ('a'|'b')+

a       :       'a' 'a' ;
b       :       'a' 'a' 'b' ;
digit   :       '0'|'1' ;

Basically, in the book and in my intentions, predicates order the  
alts.  The problem is that ANTLR's analysis doesn't consider  
syntactic predicates if it can figure out what to do w/o them.   
That's an optimization.  The problem is that you are often specifying  
the lookahead in the predicate that must be evaluated.  Crap.  ANTLR  
is not forcing those predicates in there.

For semantic predicates, we have {...}? and {...}?=> where the latter  
forces backtracking.  Perhaps (...)=> should always force  
backtracking.  BUT, for backtracking=true, I add a predicate to every  
alt!  I guess for that backtracking mode, those predicates should be  
analogous to {...}? and manually specified (...)=> should operate  
like {..}?

I have exactly 4 days to resolve this issue before the book goes to  
copy editing.  Anybody wanna help me think about this?


Pharmatechnik GmbH & Co. KG
Münchner Straße 15
D-82319 Starnberg

Sitz der Gesellschaft: Starnberg
HRA: 64434, HRB: 66369, Amtsgericht München
Geschäftsführer: Dr. Detlef Graessner, Werner Torns, Stephan Jörgens

More information about the antlr-interest mailing list