[antlr-interest] Re: a new paper on ANTLR style grammars

Fri Nov 21 15:57:14 PST 2003

You are right. The worst case scenario hardly applies at all...
Remember, you once pointed me to GLR parsing as done in recent bison
versions?! It has to deal with actions as well. Quite funny way it
does it:

http://www.delorie.com/gnu/docs/bison/bison_90.html

have a look at the lower part...

Oliver

--- In antlr-interest at yahoogroups.com, "lgcraymer" <lgc at m...> wrote:
> --- In antlr-interest at yahoogroups.com, "Oliver Zeigermann"
<oliver at z...> wrote:
> > Loring, thanks for the substantial input and taking this seriously :)
> > 
> > --- In antlr-interest at yahoogroups.com, "lgcraymer" <lgc at m...> wrote:
> > > You can do better by deferring the actions--basically, build a 
> > > monster case statement including all of the possible actions in a 
> > > grammar--and execute them after matching a rule. 
> > 
> > *After* exactly is the problem. It is pretty easy to execute actions
> > after a derivation / reduce, but not while shifting. This may be
> > desirable though...
> 
> Sorry if I was unclear.  Actions need to execute during a parse only
if they affect the parse--you need semantic predicates for that.  
> What you can do is build a list of operations to be executed later.
> 
> > > Then you can 
> > > trigger a set of actions at "commit" points.  Functional languages 
> > > make this sort of lazy evaluation easier.  
> > 
> > Where should those commit points be?
> 
> The simple answer is "reduce" operations in an LR parser, but I
don't think that that is a complete answer.
> 
> > > Rolling back actions is 
> > > trickier--you have to have some sort of mechanism to record state, 
> > > or the cost for checkpoint/rollback is very high.
> > 
> > Rolling back / forward does not come for free, agreed! But, if
> > implemented reasonably expenses may at most be doubled. Compared to
> > worst case exponential costs of backtracking this is not so bad.
> 
> Checkpoint/rollback only affects action state maintenance, not the
parsing path.  Backtracking still happens.  Besides, the worst 
> case scenario is a worst case scenario and would occur in practice
only for truly horrendous languages which are probably not 
> manageable in any other way.  BTW, exponential overhead occurs only
if there is a syntactic predicate on almost every alternative 
> (with ANTLR)--in practice, it is hard to get more than about 10%
overhead from a reasonable problem.  Building a state machine with 
> deferred actions will incur little overhead and achieve the same effect.
> 
> > > For that matter, you can defer all actions until the entire grammar 
> > > is recognized as long as the actions do not affect the parse.  
> > > Editing of a generated tree might be trickier, but not that 
> > > difficult--you just insert code for that in the list of deferred 
> > > actions.
> > 
> > We discussed this before - but still - writing to and afterwards
> > reading from a symbol table is quite a usual thing. This can not be
> > expressed merely by means of CFGs (or less), you need actions and
> > semantic checks (i.e. predicates) here.
> 
> Not quite--I'm afraid that my explanation was a bit terse.  The
point is to do tree construction via deferred actions and achieve 
> edit-during-construction through inserting helper actions into the
list.  You can do all of the normal symbol table stuff through deferred 
> actions--what you cannot do is mix them with semantic predicates.
> 
> --Loring
> 
> > Other than that, to show my colors: I am a big fan of ASTs in genereal
> > and ANTLR tree transformation in particular :)
> > 
> > Oliver
> > 
> > > --Loring
> > > 
> > > 
> > > --- In antlr-interest at yahoogroups.com, "Oliver Zeigermann" 
> > > <oliver at z...> wrote:
> > > > What I wanted to say was: If you have sematic actions associated
> > to
> > > > your grammar that can be inserted (and of course executed) at any
> > > > point and you have a table driven approach you are in trouble. 
> > > This is
> > > > because what I understand as the precomputation of a search tree 
> > > into
> > > > a table combining certain *search* states. Extrapolating from
> > what 
> > > I
> > > > know about LR you have a problem when youe have a grammar like 
> > > this:
> > > > 
> > > > a : A A { do something here } A A B ;
> > > > b : A A { do something different here } A A C ;
> > > > 
> > > > upon input
> > > > 
> > > > AAAAC
> > > > 
> > > > This is because the parser has no idea which action to execute 
> > > here.
> > > > Now, my idea was to execute both and roll back the action of rule 
> > > a as
> > > > soon as it is clear that rule b actually matches.
> > > > 
> > > > A bit clearer now?
> > > > 
> > > > Oliver
> > > > 
> > > > --- In antlr-interest at yahoogroups.com, Terence Parr <parrt at c...> 
> > > wrote:
> > > > > 
> > > > > On Wednesday, November 19, 2003, at 03:12 PM, Oliver Zeigermann 
> > > wrote:
> > > > > 
> > > > > > Actually made it through the paper while getting nervous with 
> > > the
> > > > > > proofs ;)
> > > > > >
> > > > > > While he has linear time "backtracking" performance, ANTLR is 
> > > worst
> > > > > > case exponential. I was wondering why: ANTLR does not combine 
> > > its
> > > > > > depth first search (aka backtracking in this context) into a 
> > > table
> > > > > > while Bryan's approach does (at least I understand it this 
> > > way). The
> > > > > > problem Bryan will come across (given my understanding is 
> > > halfway
> > > > > > correct) is ACTIONS. As with LR and combined states, the 
> > > problem is
> > > > > > when to execute associated semantic actions. The drawback is 
> > > well
> > > > > > known and and leads to reduction in parsing power.
> > > > > >
> > > > > > Might sound weird, but I thought if we still combined states 
> > > even
> > > > > > though they are associated with different actions and simple 
> > > execute
> > > > > > all actions, there would be no loss of power :) Silly? Not if 
> > > you have
> > > > > > a transactional language that allows you to roll back actions 
> > > that
> > > > > > turn out to be invalid later and roll forward the valid ones.
> > > > > >
> > > > > > Technically this is possible. Does it make sense as well? Am
> > I 
> > > slowly
> > > > > > going crazy? ;)
> > > > > 
> > > > > You are already crazy like me ;) <snicker, snort>.  Just got 
> > > mail from 
> > > > > him. :)  Hope it's ok to repeat part here:
> > > > > 
> > > > > > - Packrat parsing guarantees linear-time parsing on all the 
> > > types of 
> > > > > > grammars
> > > > > > it supports, which amounts to everything that fits the 
> > > formalism or
> > > > > > "conceptual model" of parsing expression grammars.  But 
> > > the "pure"
> > > > PEG 
> > > > > > model
> > > > > > doesn't directly support "stateful" grammars like those of C 
> > > and C++, 
> > > > > > in
> > > > > > which you have to build up symbol tables and such that 
> > > effectively 
> > > > > > modify the
> > > > > > grammar mid-stream as the parser scans the input from left to 
> > > right.  
> > > > > > From
> > > > > > what I've seen so far, it appears fundamentally difficult or 
> > > > > > impossible to
> > > > > > make a packrat parser support stateful grammars efficiently 
> > > without
> > > > > > effectively turning it into a deterministic (e.g., LR) parser.
> > > > > 
> > > > > So, the actions are the problem for everyone :)
> > > > > 
> > > > > Ter
> > > > > --
> > > > > Professor Comp. Sci., University of San Francisco
> > > > > Creator, ANTLR Parser Generator, http://www.antlr.org
> > > > > Co-founder, http://www.jguru.com
> > > > > Co-founder, http://www.knowspam.net enjoy email again!
> > > > > Co-founder, http://www.peerscope.com pure link sharing

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/