[antlr-interest] Re: a new paper on ANTLR style grammars

Fri Nov 21 00:46:25 PST 2003

Hi there,

Mabey this is obvious, but if the parser-generator could identify such
cases, i.e. 

a: A A { bSomething == true; } A { if(bSomething) ...; } A B;
b: A A { bSomething == false; } A { if(bSomething) ...; } A C;

and it can hadle syntactic predicates, it would automatically change
this to

a: (A A A A B)=> A A { bSomething == true; } A { if(bSomething) ..; } A
B;
b: (A A A A C)=> A A { bSomething == false; } A { if(bSomething) ..; }
A C;

do the job? It would obviously hurt performance though, but such cases
should be rare.

Arnar

> -----Original Message-----
> From: Thomas Brandon [mailto:tom at psy.unsw.edu.au] 
> Sent: 21. nóvember 2003 06:51
> To: antlr-interest at yahoogroups.com
> Subject: [antlr-interest] Re: a new paper on ANTLR style grammars
> 
> 
> Oliver,
> I don't think doing all actions and then rolling back is going to be 
> enough. Remember that by itself the example you present is no 
> problem for LR, just use:
> ab: A A A A (B { Action1(); } | C { Action2(); });
> of course I assume you intended something like:
> a: A A { bSomething == true; } A { if(bSomething) ...; } A B;
> b: A A { bSomething == false; } A { if(bSomething) ...; } A C;
> i.e. we must decide before we see the B|C, as actions are not edge-
> aligned. (Though presumably we are doing more than simply setting 
> flags).
> 
> But in this case how are transactions going to work? We execute both 
> bSomething = true and bSomething = false? No, I think we're going to 
> need to branch the entire state, in one we do bSomething = true; in 
> the other bSomething = false;. Now were either gonna need to do two 
> complete parses of the rest of the input or parse once but for every 
> subsequent action execute it against all valid states. Of course 
> once the path we take branches then were going to need to branch our 
> states for each path too.
> 
> If instead our grammar was:
> a: A A { bSomething == true; } A { if(bSomething) ...; else if 
> bSomethingElse) ...;} A B;
> b: A A { bSomethingElse == true; } A { if(bSomething) ...; else if
> (bSomethingElse) ...;} A C;
> where there is no overlap between the two actions, then we wouldn't 
> need to. But I don't think the semantic analysis necessary to 
> distinguish those cases is trivial (given ithings like the 
> possibility that bSomething == bSomethingElse). Similarly 
> transforming to the benign form may be possible but would require 
> such analysis. And even if you do manage the analysis it means you 
> need to redo the entire analysis when anything changes. i.e. if my 
> action is myClass.MyAction() then we need to redo the entire 
> analysis if myClass.MyAction changes, hence no inheritance without 
> possibly breaking things.
> 
> And what about when actions overlap, i.e. we have a choice between 
> an action { a = 3; b++; } and { a = 4; b++; c++; }, we can't just 
> store the inital state and later restore it because if we restore 
> that we undo both actions. Somehow we need to figure out for our 
> actions A and B, A union B (those changes common to both) and A - B 
> and B - A, those changes peculiar to B and those peculiar to A. 
> Then, if we want A to go through we only rollback B - A (those 
> changes peculiar to B). Again requiring (probably quite 
> sophisticated) static analysis.
> 
> Tom.
> --- In antlr-interest at yahoogroups.com, "Oliver Zeigermann" 
> <oliver at z...> wrote:
> > What I wanted to say was: If you have sematic actions associated to
> > your grammar that can be inserted (and of course executed) at any
> > point and you have a table driven approach you are in trouble. 
> This is
> > because what I understand as the precomputation of a search tree 
> into
> > a table combining certain *search* states. Extrapolating from what 
> I
> > know about LR you have a problem when youe have a grammar like 
> this:
> > 
> > a : A A { do something here } A A B ;
> > b : A A { do something different here } A A C ;
> > 
> > upon input
> > 
> > AAAAC
> > 
> > This is because the parser has no idea which action to execute 
> here.
> > Now, my idea was to execute both and roll back the action of rule 
> a as
> > soon as it is clear that rule b actually matches.
> > 
> > A bit clearer now?
> > 
> > Oliver
> > 
> > --- In antlr-interest at yahoogroups.com, Terence Parr <parrt at c...> 
> wrote:
> > > 
> > > On Wednesday, November 19, 2003, at 03:12 PM, Oliver Zeigermann 
> wrote:
> > > 
> > > > Actually made it through the paper while getting nervous with 
> the
> > > > proofs ;)
> > > >
> > > > While he has linear time "backtracking" performance, ANTLR is 
> worst
> > > > case exponential. I was wondering why: ANTLR does not combine 
> its
> > > > depth first search (aka backtracking in this context) into a 
> table
> > > > while Bryan's approach does (at least I understand it this 
> way). The
> > > > problem Bryan will come across (given my understanding is 
> halfway
> > > > correct) is ACTIONS. As with LR and combined states, the 
> problem is
> > > > when to execute associated semantic actions. The drawback is 
> well
> > > > known and and leads to reduction in parsing power.
> > > >
> > > > Might sound weird, but I thought if we still combined states 
> even
> > > > though they are associated with different actions and simple 
> execute
> > > > all actions, there would be no loss of power :) Silly? Not if 
> you have
> > > > a transactional language that allows you to roll back actions 
> that
> > > > turn out to be invalid later and roll forward the valid ones.
> > > >
> > > > Technically this is possible. Does it make sense as well? Am I 
> slowly
> > > > going crazy? ;)
> > > 
> > > You are already crazy like me ;) <snicker, snort>.  Just got 
> mail from 
> > > him. :)  Hope it's ok to repeat part here:
> > > 
> > > > - Packrat parsing guarantees linear-time parsing on all the 
> types of 
> > > > grammars
> > > > it supports, which amounts to everything that fits the 
> formalism or
> > > > "conceptual model" of parsing expression grammars.  But 
> the "pure"
> > PEG 
> > > > model
> > > > doesn't directly support "stateful" grammars like those of C 
> and C++, 
> > > > in
> > > > which you have to build up symbol tables and such that 
> effectively 
> > > > modify the
> > > > grammar mid-stream as the parser scans the input from left to 
> right.  
> > > > From
> > > > what I've seen so far, it appears fundamentally difficult or 
> > > > impossible to
> > > > make a packrat parser support stateful grammars efficiently 
> without
> > > > effectively turning it into a deterministic (e.g., LR) parser.
> > > 
> > > So, the actions are the problem for everyone :)
> > > 
> > > Ter
> > > --
> > > Professor Comp. Sci., University of San Francisco
> > > Creator, ANTLR Parser Generator, http://www.antlr.org
> > > Co-founder, http://www.jguru.com
> > > Co-founder, http://www.knowspam.net enjoy email again!
> > > Co-founder, http://www.peerscope.com pure link sharing
> 
> 
>  
> 
> Your use of Yahoo! Groups is subject to 
> http://docs.yahoo.com/info/terms/ 
> 
> 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/