[antlr-interest] Re: a new paper on ANTLR style grammars

Thomas Brandon tom at psy.unsw.edu.au
Thu Nov 20 22:50:45 PST 2003


Oliver,
I don't think doing all actions and then rolling back is going to be 
enough. Remember that by itself the example you present is no 
problem for LR, just use:
ab: A A A A (B { Action1(); } | C { Action2(); });
of course I assume you intended something like:
a: A A { bSomething == true; } A { if(bSomething) ...; } A B;
b: A A { bSomething == false; } A { if(bSomething) ...; } A C;
i.e. we must decide before we see the B|C, as actions are not edge-
aligned. (Though presumably we are doing more than simply setting 
flags).

But in this case how are transactions going to work? We execute both 
bSomething = true and bSomething = false? No, I think we're going to 
need to branch the entire state, in one we do bSomething = true; in 
the other bSomething = false;. Now were either gonna need to do two 
complete parses of the rest of the input or parse once but for every 
subsequent action execute it against all valid states. Of course 
once the path we take branches then were going to need to branch our 
states for each path too.

If instead our grammar was:
a: A A { bSomething == true; } A { if(bSomething) ...; else if 
bSomethingElse) ...;} A B;
b: A A { bSomethingElse == true; } A { if(bSomething) ...; else if
(bSomethingElse) ...;} A C;
where there is no overlap between the two actions, then we wouldn't 
need to. But I don't think the semantic analysis necessary to 
distinguish those cases is trivial (given ithings like the 
possibility that bSomething == bSomethingElse). Similarly 
transforming to the benign form may be possible but would require 
such analysis. And even if you do manage the analysis it means you 
need to redo the entire analysis when anything changes. i.e. if my 
action is myClass.MyAction() then we need to redo the entire 
analysis if myClass.MyAction changes, hence no inheritance without 
possibly breaking things.

And what about when actions overlap, i.e. we have a choice between 
an action { a = 3; b++; } and { a = 4; b++; c++; }, we can't just 
store the inital state and later restore it because if we restore 
that we undo both actions. Somehow we need to figure out for our 
actions A and B, A union B (those changes common to both) and A - B 
and B - A, those changes peculiar to B and those peculiar to A. 
Then, if we want A to go through we only rollback B - A (those 
changes peculiar to B). Again requiring (probably quite 
sophisticated) static analysis.

Tom.
--- In antlr-interest at yahoogroups.com, "Oliver Zeigermann" 
<oliver at z...> wrote:
> What I wanted to say was: If you have sematic actions associated to
> your grammar that can be inserted (and of course executed) at any
> point and you have a table driven approach you are in trouble. 
This is
> because what I understand as the precomputation of a search tree 
into
> a table combining certain *search* states. Extrapolating from what 
I
> know about LR you have a problem when youe have a grammar like 
this:
> 
> a : A A { do something here } A A B ;
> b : A A { do something different here } A A C ;
> 
> upon input
> 
> AAAAC
> 
> This is because the parser has no idea which action to execute 
here.
> Now, my idea was to execute both and roll back the action of rule 
a as
> soon as it is clear that rule b actually matches.
> 
> A bit clearer now?
> 
> Oliver
> 
> --- In antlr-interest at yahoogroups.com, Terence Parr <parrt at c...> 
wrote:
> > 
> > On Wednesday, November 19, 2003, at 03:12 PM, Oliver Zeigermann 
wrote:
> > 
> > > Actually made it through the paper while getting nervous with 
the
> > > proofs ;)
> > >
> > > While he has linear time "backtracking" performance, ANTLR is 
worst
> > > case exponential. I was wondering why: ANTLR does not combine 
its
> > > depth first search (aka backtracking in this context) into a 
table
> > > while Bryan's approach does (at least I understand it this 
way). The
> > > problem Bryan will come across (given my understanding is 
halfway
> > > correct) is ACTIONS. As with LR and combined states, the 
problem is
> > > when to execute associated semantic actions. The drawback is 
well
> > > known and and leads to reduction in parsing power.
> > >
> > > Might sound weird, but I thought if we still combined states 
even
> > > though they are associated with different actions and simple 
execute
> > > all actions, there would be no loss of power :) Silly? Not if 
you have
> > > a transactional language that allows you to roll back actions 
that
> > > turn out to be invalid later and roll forward the valid ones.
> > >
> > > Technically this is possible. Does it make sense as well? Am I 
slowly
> > > going crazy? ;)
> > 
> > You are already crazy like me ;) <snicker, snort>.  Just got 
mail from 
> > him. :)  Hope it's ok to repeat part here:
> > 
> > > - Packrat parsing guarantees linear-time parsing on all the 
types of 
> > > grammars
> > > it supports, which amounts to everything that fits the 
formalism or
> > > "conceptual model" of parsing expression grammars.  But 
the "pure"
> PEG 
> > > model
> > > doesn't directly support "stateful" grammars like those of C 
and C++, 
> > > in
> > > which you have to build up symbol tables and such that 
effectively 
> > > modify the
> > > grammar mid-stream as the parser scans the input from left to 
right.  
> > > From
> > > what I've seen so far, it appears fundamentally difficult or 
> > > impossible to
> > > make a packrat parser support stateful grammars efficiently 
without
> > > effectively turning it into a deterministic (e.g., LR) parser.
> > 
> > So, the actions are the problem for everyone :)
> > 
> > Ter
> > --
> > Professor Comp. Sci., University of San Francisco
> > Creator, ANTLR Parser Generator, http://www.antlr.org
> > Co-founder, http://www.jguru.com
> > Co-founder, http://www.knowspam.net enjoy email again!
> > Co-founder, http://www.peerscope.com pure link sharing


 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 




More information about the antlr-interest mailing list