[antlr-interest] Re: a new paper on ANTLR style grammars
Thomas Brandon
tom at psy.unsw.edu.au
Thu Nov 20 22:50:45 PST 2003
Oliver,
I don't think doing all actions and then rolling back is going to be
enough. Remember that by itself the example you present is no
problem for LR, just use:
ab: A A A A (B { Action1(); } | C { Action2(); });
of course I assume you intended something like:
a: A A { bSomething == true; } A { if(bSomething) ...; } A B;
b: A A { bSomething == false; } A { if(bSomething) ...; } A C;
i.e. we must decide before we see the B|C, as actions are not edge-
aligned. (Though presumably we are doing more than simply setting
flags).
But in this case how are transactions going to work? We execute both
bSomething = true and bSomething = false? No, I think we're going to
need to branch the entire state, in one we do bSomething = true; in
the other bSomething = false;. Now were either gonna need to do two
complete parses of the rest of the input or parse once but for every
subsequent action execute it against all valid states. Of course
once the path we take branches then were going to need to branch our
states for each path too.
If instead our grammar was:
a: A A { bSomething == true; } A { if(bSomething) ...; else if
bSomethingElse) ...;} A B;
b: A A { bSomethingElse == true; } A { if(bSomething) ...; else if
(bSomethingElse) ...;} A C;
where there is no overlap between the two actions, then we wouldn't
need to. But I don't think the semantic analysis necessary to
distinguish those cases is trivial (given ithings like the
possibility that bSomething == bSomethingElse). Similarly
transforming to the benign form may be possible but would require
such analysis. And even if you do manage the analysis it means you
need to redo the entire analysis when anything changes. i.e. if my
action is myClass.MyAction() then we need to redo the entire
analysis if myClass.MyAction changes, hence no inheritance without
possibly breaking things.
And what about when actions overlap, i.e. we have a choice between
an action { a = 3; b++; } and { a = 4; b++; c++; }, we can't just
store the inital state and later restore it because if we restore
that we undo both actions. Somehow we need to figure out for our
actions A and B, A union B (those changes common to both) and A - B
and B - A, those changes peculiar to B and those peculiar to A.
Then, if we want A to go through we only rollback B - A (those
changes peculiar to B). Again requiring (probably quite
sophisticated) static analysis.
Tom.
--- In antlr-interest at yahoogroups.com, "Oliver Zeigermann"
<oliver at z...> wrote:
> What I wanted to say was: If you have sematic actions associated to
> your grammar that can be inserted (and of course executed) at any
> point and you have a table driven approach you are in trouble.
This is
> because what I understand as the precomputation of a search tree
into
> a table combining certain *search* states. Extrapolating from what
I
> know about LR you have a problem when youe have a grammar like
this:
>
> a : A A { do something here } A A B ;
> b : A A { do something different here } A A C ;
>
> upon input
>
> AAAAC
>
> This is because the parser has no idea which action to execute
here.
> Now, my idea was to execute both and roll back the action of rule
a as
> soon as it is clear that rule b actually matches.
>
> A bit clearer now?
>
> Oliver
>
> --- In antlr-interest at yahoogroups.com, Terence Parr <parrt at c...>
wrote:
> >
> > On Wednesday, November 19, 2003, at 03:12 PM, Oliver Zeigermann
wrote:
> >
> > > Actually made it through the paper while getting nervous with
the
> > > proofs ;)
> > >
> > > While he has linear time "backtracking" performance, ANTLR is
worst
> > > case exponential. I was wondering why: ANTLR does not combine
its
> > > depth first search (aka backtracking in this context) into a
table
> > > while Bryan's approach does (at least I understand it this
way). The
> > > problem Bryan will come across (given my understanding is
halfway
> > > correct) is ACTIONS. As with LR and combined states, the
problem is
> > > when to execute associated semantic actions. The drawback is
well
> > > known and and leads to reduction in parsing power.
> > >
> > > Might sound weird, but I thought if we still combined states
even
> > > though they are associated with different actions and simple
execute
> > > all actions, there would be no loss of power :) Silly? Not if
you have
> > > a transactional language that allows you to roll back actions
that
> > > turn out to be invalid later and roll forward the valid ones.
> > >
> > > Technically this is possible. Does it make sense as well? Am I
slowly
> > > going crazy? ;)
> >
> > You are already crazy like me ;) <snicker, snort>. Just got
mail from
> > him. :) Hope it's ok to repeat part here:
> >
> > > - Packrat parsing guarantees linear-time parsing on all the
types of
> > > grammars
> > > it supports, which amounts to everything that fits the
formalism or
> > > "conceptual model" of parsing expression grammars. But
the "pure"
> PEG
> > > model
> > > doesn't directly support "stateful" grammars like those of C
and C++,
> > > in
> > > which you have to build up symbol tables and such that
effectively
> > > modify the
> > > grammar mid-stream as the parser scans the input from left to
right.
> > > From
> > > what I've seen so far, it appears fundamentally difficult or
> > > impossible to
> > > make a packrat parser support stateful grammars efficiently
without
> > > effectively turning it into a deterministic (e.g., LR) parser.
> >
> > So, the actions are the problem for everyone :)
> >
> > Ter
> > --
> > Professor Comp. Sci., University of San Francisco
> > Creator, ANTLR Parser Generator, http://www.antlr.org
> > Co-founder, http://www.jguru.com
> > Co-founder, http://www.knowspam.net enjoy email again!
> > Co-founder, http://www.peerscope.com pure link sharing
Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
More information about the antlr-interest
mailing list