[antlr-interest] How to process blocks of code?

Ric Klaren ric.klaren at gmail.com
Thu Dec 28 12:24:57 PST 2006


Hi,

On 12/28/06, Jan van der Ven <jhvdven at xs4all.nl> wrote:
> I am quite new to grammars, so I had no idea what would be the correct
> phrase to use when searching the archives. I think my problem is quite
> common:
>
> I have a set of statements that need to be syntactically checked. I have
> a g file that does just that for the whole set. However I would like to
> do some more logical processing on each statement separately. So I
> thought to call a function whenever the parser has completed a
> statement. My question now is, how can I find the text/tokens belonging
> to that parsed piece of code?

My guess (or advice rather ;) ) would be to make a multipass parser:
first do a general parse that builds a tree, then process that tree
with several tree parsers, each of which rewrites or enhances it or
builds additional data structures alongside it.
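
A minimal sketch of the multipass idea, in Python rather than ANTLR so
that it runs on its own. Everything in it is hypothetical: the Node
class, the node type names and the toy parse() (which just splits on
';' and stands in for the real generated parser, so it would not cope
with nested scripts). The point is only the shape: one parse builds the
tree, then each later pass is a separate function over that tree.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One tree node: a node type, its text, and its child nodes."""
    type: str
    text: str = ""
    children: list = field(default_factory=list)


def parse(script):
    """General parse (toy stand-in): builds SCRIPT -> STATEMENT -> WORD."""
    root = Node("SCRIPT")
    for chunk in script.split(";"):
        words = chunk.split()
        if words:
            stmt = Node("STATEMENT", children=[Node("WORD", w) for w in words])
            root.children.append(stmt)
    return root


def statement_texts(tree):
    """Pass 1: recover the text that belongs to each statement."""
    return [" ".join(w.text for w in stmt.children) for stmt in tree.children]


def count_by_keyword(tree):
    """Pass 2: build a side table next to the tree, keyed on the first word."""
    counts = {}
    for stmt in tree.children:
        keyword = stmt.children[0].text.upper()
        counts[keyword] = counts.get(keyword, 0) + 1
    return counts


tree = parse("select a from t; create table u; select b from u")
texts = statement_texts(tree)
counts = count_by_keyword(tree)
```

Each pass only reads the tree, so a new check can be added as another
function without touching the parse step.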

Another option is to build trees and call a tree parser from within
the parser, although that is a bit less clean and, in the wrong hands,
can turn out less maintainable than the multiple tree parser option
(where you can separate the different steps of the process better).

> I thought of chunking the pieces first and then feeding that to the
> parser, but this does not work for all scripts I want to process. As I
> am working on SQL an example may be in order.
>
> This handles multiple statements:
>
> sql_script :
>     statement (statement_separator (statement)? )* EOF
>     ;
>
> But the CREATE PROCEDURE is a single statement with multiple ones in it:
> create_statement
>     :
>     CREATE PROCEDURE procedure_name (parameter_list)? sql_script
>     | CREATE TABLE table_name
>     | CREATE VIEW table_name select_statement
>     ;
>
> So I think the first alternative is the best way to go:
> statement
>     :
>       select_statement {SQLSyntaxModel.runChecks();}
>
> And we are down to my original question: what was in this statement and
> how can I access it from code?

Turn on the buildAST option for the parser. Then think hard about how
you want to build your tree, i.e. aim for a structure that makes the
following processing steps easier (heck, it may even take a few
iterations to get the tree you want). Note that tree parsers only have
a lookahead of one, so keep this in mind when constructing the trees
in the parser. Also look at imaginary nodes (tokens), with which you
can tag 'important' bits in the tree.
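
To illustrate why the tagging helps, here is a rough Python sketch
(not ANTLR output; the tuple layout, the node() helper and the
STATEMENT/SCRIPT/WORD type names are all assumptions made up for the
example). STATEMENT and SCRIPT play the role of imaginary nodes: they
carry no input text and only mark a subtree. Because every statement
hangs under such a node, the walker can decide what to do by looking
at the current node type alone, and a nested script inside CREATE
PROCEDURE is handled by the same code.

```python
def node(type_, text="", *children):
    """Build a (type, text, children) tuple; imaginary nodes have no text."""
    return (type_, text, list(children))


def leaf_text(tree):
    """Join the text of all real tokens found under this node."""
    _, text, children = tree
    parts = [text] if text else []
    for child in children:
        parts.append(leaf_text(child))
    return " ".join(p for p in parts if p)


def walk(tree, on_statement):
    """Visit every node, dispatching on the current node type only."""
    type_, _, children = tree
    if type_ == "STATEMENT":
        on_statement(leaf_text(tree))
    for child in children:
        walk(child, on_statement)


# CREATE PROCEDURE p <nested script with one SELECT statement>
inner = node("STATEMENT", "",
             node("WORD", "select"), node("WORD", "a"),
             node("WORD", "from"), node("WORD", "t"))
outer = node("STATEMENT", "",
             node("WORD", "create"), node("WORD", "procedure"),
             node("WORD", "p"), node("SCRIPT", "", inner))
script = node("SCRIPT", "", outer)

seen = []
walk(script, seen.append)
```

Here the callback fires once for the whole CREATE PROCEDURE statement
and once more for the SELECT nested inside it, which is where a call
like runChecks() could be hooked in.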

Hope this helps,

Ric

