[antlr-interest] Incremental TreeWalker (was: What's the SLK Parser Generator?)

Wed Oct 16 21:51:47 PDT 2002

mrosgood wrote:
> Hi Monty Zukowski-
> Thanks for the reply!  I'm glad the ANTLR intelligentsia are thinking
> about this stuff. <grin>
> 
>>Cook & Welsh's approach is kinda hacky too in that they 
>>assume you want to work with a parse tree and not an AST.  
>>...
> 
> I agree; I'd rather just see the AST.
> 
> Cook and Welsh's innovation is efficiently handling substreams of
> input which are temporarily syntactically incorrect.  That's why they
> retain the parse tree.  There might be other solutions.

I think it would be much easier to implement an incremental parser by 
using a parse tree than with ANTLR's AST, because the parse tree is 
basically helping you to remember what productions you went through to 
parse some section of the document. Then, it is just a matter of 
figuring out how the changes in the token stream correlate to the 
productions you used. I was thinking about maybe building a parse tree 
builder on top of ANTLR's AST mechanism (by creating one "fake" token 
for each production rule, and then creating a subclass of the code 
generator that creates AST's where every non-leaf node is rooted at one 
of these fake tokens, and all the leaves are tokens). But, obviously 
action code and also probably predicates (semantic and syntactic) would 
cause a problem for this kind of approach.

>>For instance in java you could make an entry point into 
>>the java grammar to recognize methods.  As they edit the 
>>method just keep running that rule and replacing the AST 
>>and updating the symbol table.
> 
> Are you suggesting parser support for incremental treewalkers? <grin>
>  If so, I second the idea.

Bogdan posted something (SATC) to the files section of the mailing list 
that seems like it could be used as the foundation for this type of 
approach to ad-hoc incremental parsing. His parser isn't incremental but 
it deals with syntactic and lexical errors in a respectable way.

> But wouldn't it be great if we could use our grammars as-is?  A
> generalized approach would have the most benefit.

My changes to the NetBeans lexer module can already use lexical grammars 
basically as-is, with a few restrictions. For example, you have to 
replace Token.SKIP with a TokenStreamFilter, because token skipping is 
generally only to be used by batch parsers, not by incremental parsers 
and text editors. Also, it seems to work better to be able to recognize 
incomplete tokens for incremetal analysis, instead of just considering 
partial tokens to just be an error (e.g. recognize ["asdf] as a string 
literal). To do this in a transparent way, all you need is a slightly 
modified code generator (I already made one for Java). Or, you can do it 
explicitly in your lexical grammar.

> I'm working through the incremental papers I've collected, trying to
> grok the algorithms.  Maybe the ideas can be repurposed for ANTLR.  As
> you noted, ANTLR gnerated rules can't be invoked "from the outside". 
> So maybe ANTLR can be tweaked so that rules can accept parser state.

A simple approach is just to create a specialized code generator and/or 
preprocessor that can auto-generate the "from the outside" rules 
automatically.

> I'll also be on the lookout for incremental treewalker ideas.  In
> terms of my Giraffe language, having Swing buttons and panels flash in
> and out of existance as the AST changed would be boss.

I would be interested in hearing about anything that you learn. I have a 
very similar requirement. However, I don't need anything that is 
TreeWalker-specific, although I can use a TreeWalker if it is beneficial 
to do so.

- Brian

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/