[antlr-interest] Re: a new paper on ANTLR style grammars

Thu Nov 20 11:53:21 PST 2003

--- In antlr-interest at yahoogroups.com, "Oliver Zeigermann" <oliver at z...> wrote:
> Loring, thanks for the substantial input and taking this seriously :)
> 
> --- In antlr-interest at yahoogroups.com, "lgcraymer" <lgc at m...> wrote:
> > You can do better by deferring the actions--basically, build a 
> > monster case statement including all of the possible actions in a 
> > grammar--and execute them after matching a rule. 
> 
> *After* exactly is the problem. It is pretty easy to execute actions
> after a derivation / reduce, but not while shifting. This may be
> desirable though...

Sorry if I was unclear.  Actions need to execute during a parse only if they affect the parse--you need semantic predicates for that.  
What you can do is build a list of operations to be executed later.

> > Then you can 
> > trigger a set of actions at "commit" points.  Functional languages 
> > make this sort of lazy evaluation easier.  
> 
> Where should those commit points be?

The simple answer is "reduce" operations in an LR parser, but I don't think that that is a complete answer.

> > Rolling back actions is 
> > trickier--you have to have some sort of mechanism to record state, 
> > or the cost for checkpoint/rollback is very high.
> 
> Rolling back / forward does not come for free, agreed! But, if
> implemented reasonably expenses may at most be doubled. Compared to
> worst case exponential costs of backtracking this is not so bad.

Checkpoint/rollback only affects action state maintenance, not the parsing path.  Backtracking still happens.  Besides, the worst 
case scenario is a worst case scenario and would occur in practice only for truly horrendous languages which are probably not 
manageable in any other way.  BTW, exponential overhead occurs only if there is a syntactic predicate on almost every alternative 
(with ANTLR)--in practice, it is hard to get more than about 10% overhead from a reasonable problem.  Building a state machine with 
deferred actions will incur little overhead and achieve the same effect.

> > For that matter, you can defer all actions until the entire grammar 
> > is recognized as long as the actions do not affect the parse.  
> > Editing of a generated tree might be trickier, but not that 
> > difficult--you just insert code for that in the list of deferred 
> > actions.
> 
> We discussed this before - but still - writing to and afterwards
> reading from a symbol table is quite a usual thing. This can not be
> expressed merely by means of CFGs (or less), you need actions and
> semantic checks (i.e. predicates) here.

Not quite--I'm afraid that my explanation was a bit terse.  The point is to do tree construction via deferred actions and achieve 
edit-during-construction through inserting helper actions into the list.  You can do all of the normal symbol table stuff through deferred 
actions--what you cannot do is mix them with semantic predicates.

--Loring

> Other than that, to show my colors: I am a big fan of ASTs in genereal
> and ANTLR tree transformation in particular :)
> 
> Oliver
> 
> > --Loring
> > 
> > 
> > --- In antlr-interest at yahoogroups.com, "Oliver Zeigermann" 
> > <oliver at z...> wrote:
> > > What I wanted to say was: If you have sematic actions associated
> to
> > > your grammar that can be inserted (and of course executed) at any
> > > point and you have a table driven approach you are in trouble. 
> > This is
> > > because what I understand as the precomputation of a search tree 
> > into
> > > a table combining certain *search* states. Extrapolating from
> what 
> > I
> > > know about LR you have a problem when youe have a grammar like 
> > this:
> > > 
> > > a : A A { do something here } A A B ;
> > > b : A A { do something different here } A A C ;
> > > 
> > > upon input
> > > 
> > > AAAAC
> > > 
> > > This is because the parser has no idea which action to execute 
> > here.
> > > Now, my idea was to execute both and roll back the action of rule 
> > a as
> > > soon as it is clear that rule b actually matches.
> > > 
> > > A bit clearer now?
> > > 
> > > Oliver
> > > 
> > > --- In antlr-interest at yahoogroups.com, Terence Parr <parrt at c...> 
> > wrote:
> > > > 
> > > > On Wednesday, November 19, 2003, at 03:12 PM, Oliver Zeigermann 
> > wrote:
> > > > 
> > > > > Actually made it through the paper while getting nervous with 
> > the
> > > > > proofs ;)
> > > > >
> > > > > While he has linear time "backtracking" performance, ANTLR is 
> > worst
> > > > > case exponential. I was wondering why: ANTLR does not combine 
> > its
> > > > > depth first search (aka backtracking in this context) into a 
> > table
> > > > > while Bryan's approach does (at least I understand it this 
> > way). The
> > > > > problem Bryan will come across (given my understanding is 
> > halfway
> > > > > correct) is ACTIONS. As with LR and combined states, the 
> > problem is
> > > > > when to execute associated semantic actions. The drawback is 
> > well
> > > > > known and and leads to reduction in parsing power.
> > > > >
> > > > > Might sound weird, but I thought if we still combined states 
> > even
> > > > > though they are associated with different actions and simple 
> > execute
> > > > > all actions, there would be no loss of power :) Silly? Not if 
> > you have
> > > > > a transactional language that allows you to roll back actions 
> > that
> > > > > turn out to be invalid later and roll forward the valid ones.
> > > > >
> > > > > Technically this is possible. Does it make sense as well? Am
> I 
> > slowly
> > > > > going crazy? ;)
> > > > 
> > > > You are already crazy like me ;) <snicker, snort>.  Just got 
> > mail from 
> > > > him. :)  Hope it's ok to repeat part here:
> > > > 
> > > > > - Packrat parsing guarantees linear-time parsing on all the 
> > types of 
> > > > > grammars
> > > > > it supports, which amounts to everything that fits the 
> > formalism or
> > > > > "conceptual model" of parsing expression grammars.  But 
> > the "pure"
> > > PEG 
> > > > > model
> > > > > doesn't directly support "stateful" grammars like those of C 
> > and C++, 
> > > > > in
> > > > > which you have to build up symbol tables and such that 
> > effectively 
> > > > > modify the
> > > > > grammar mid-stream as the parser scans the input from left to 
> > right.  
> > > > > From
> > > > > what I've seen so far, it appears fundamentally difficult or 
> > > > > impossible to
> > > > > make a packrat parser support stateful grammars efficiently 
> > without
> > > > > effectively turning it into a deterministic (e.g., LR) parser.
> > > > 
> > > > So, the actions are the problem for everyone :)
> > > > 
> > > > Ter
> > > > --
> > > > Professor Comp. Sci., University of San Francisco
> > > > Creator, ANTLR Parser Generator, http://www.antlr.org
> > > > Co-founder, http://www.jguru.com
> > > > Co-founder, http://www.knowspam.net enjoy email again!
> > > > Co-founder, http://www.peerscope.com pure link sharing

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/