[antlr-interest] Incremental TreeWalker (was: What's the SLK
Parser Generator?)
Brian Smith
brian-l-smith at uiowa.edu
Wed Oct 16 21:51:47 PDT 2002
mrosgood wrote:
> Hi Monty Zukowski-
> Thanks for the reply! I'm glad the ANTLR intelligentsia are thinking
> about this stuff. <grin>
>
>>Cook & Welsh's approach is kinda hacky too in that they
>>assume you want to work with a parse tree and not an AST.
>>...
>
> I agree; I'd rather just see the AST.
>
> Cook and Welsh's innovation is efficiently handling substreams of
> input which are temporarily syntactically incorrect. That's why they
> retain the parse tree. There might be other solutions.
I think it would be much easier to implement an incremental parser by
using a parse tree than with ANTLR's AST, because the parse tree is
basically helping you to remember what productions you went through to
parse some section of the document. Then, it is just a matter of
figuring out how the changes in the token stream correlate to the
productions you used. I was thinking about maybe building a parse tree
builder on top of ANTLR's AST mechanism (by creating one "fake" token
for each production rule, and then creating a subclass of the code
generator that creates AST's where every non-leaf node is rooted at one
of these fake tokens, and all the leaves are tokens). But, obviously
action code and also probably predicates (semantic and syntactic) would
cause a problem for this kind of approach.
>>For instance in java you could make an entry point into
>>the java grammar to recognize methods. As they edit the
>>method just keep running that rule and replacing the AST
>>and updating the symbol table.
>
> Are you suggesting parser support for incremental treewalkers? <grin>
> If so, I second the idea.
Bogdan posted something (SATC) to the files section of the mailing list
that seems like it could be used as the foundation for this type of
approach to ad-hoc incremental parsing. His parser isn't incremental but
it deals with syntactic and lexical errors in a respectable way.
> But wouldn't it be great if we could use our grammars as-is? A
> generalized approach would have the most benefit.
My changes to the NetBeans lexer module can already use lexical grammars
basically as-is, with a few restrictions. For example, you have to
replace Token.SKIP with a TokenStreamFilter, because token skipping is
generally only to be used by batch parsers, not by incremental parsers
and text editors. Also, it seems to work better to be able to recognize
incomplete tokens for incremetal analysis, instead of just considering
partial tokens to just be an error (e.g. recognize ["asdf] as a string
literal). To do this in a transparent way, all you need is a slightly
modified code generator (I already made one for Java). Or, you can do it
explicitly in your lexical grammar.
> I'm working through the incremental papers I've collected, trying to
> grok the algorithms. Maybe the ideas can be repurposed for ANTLR. As
> you noted, ANTLR gnerated rules can't be invoked "from the outside".
> So maybe ANTLR can be tweaked so that rules can accept parser state.
A simple approach is just to create a specialized code generator and/or
preprocessor that can auto-generate the "from the outside" rules
automatically.
> I'll also be on the lookout for incremental treewalker ideas. In
> terms of my Giraffe language, having Swing buttons and panels flash in
> and out of existance as the AST changed would be boss.
I would be interested in hearing about anything that you learn. I have a
very similar requirement. However, I don't need anything that is
TreeWalker-specific, although I can use a TreeWalker if it is beneficial
to do so.
- Brian
Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
More information about the antlr-interest
mailing list