[antlr-interest] A simple parser for methods and nested {}

Martin Probst mail at martin-probst.com
Mon Mar 21 09:35:36 PST 2005


Hi,

> This is my grammar. It works, but splits {} into separate nodes even when
> counter!=0;
> Maybe there is a way to prevent such split into tokens or somehow else solve
> the problem?

About splitting nodes etc: read the section about tree construction in
the ANTLR documentation, it should really help.

As to your grammar: you can use recursion to keep track of the curly
braces:

expr: (COMMAND^ LCURL component RCURL)*;
component: (expr | ANY | COMMA)*;

This should only accept well-formed command sequences. For example, given
the input
~Foo{ ~bar{...}}

the parser starts with the command ~Foo and so enters the "expr" rule.
Then it sees an LCURL and descends into the "component" rule. The
following COMMAND token (~bar) takes it (recursively) back into "expr".
There it again parses the LCURL, then the content ("...") via "component",
and then the RCURL. With this token the inner command is finished and the
parser returns upwards. Then it sees an RCURL token again, which closes
the outer command. This means the outer expr rule is finished too and the
parser can exit.
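
To make the recursion concrete, here is a hand-written sketch of the two
rules in Python. It is not what ANTLR generates, and the token
definitions are assumptions (the original lexer is not shown in this
thread): a COMMAND is taken to be "~" plus a name, and ANY is any other
run of characters. The nested tuples it returns are a simplified tree
without the brace tokens, not the exact AST that "COMMAND^" would build.

```python
import re

# Hypothetical token definitions; the real lexer is not part of this thread.
TOKEN_RE = re.compile(
    r"(?P<COMMAND>~\w+)|(?P<LCURL>\{)|(?P<RCURL>\})|(?P<COMMA>,)"
    r"|(?P<ANY>[^~{},\s]+)|(?P<WS>\s+)"
)

def tokenize(text):
    """Split text into (kind, text) pairs, dropping whitespace."""
    return [(m.lastgroup, m.group())
            for m in TOKEN_RE.finditer(text)
            if m.lastgroup != "WS"]

def expect(tokens, pos, kind):
    """Consume one token of the given kind or fail."""
    if pos >= len(tokens) or tokens[pos][0] != kind:
        raise SyntaxError(f"expected {kind} at token {pos}")
    return pos + 1

def parse_expr(tokens, pos):
    """expr: (COMMAND LCURL component RCURL)*"""
    nodes = []
    while pos < len(tokens) and tokens[pos][0] == "COMMAND":
        command = tokens[pos][1]
        pos = expect(tokens, pos + 1, "LCURL")
        children, pos = parse_component(tokens, pos)  # recursion happens here
        pos = expect(tokens, pos, "RCURL")
        nodes.append((command, children))
    return nodes, pos

def parse_component(tokens, pos):
    """component: (expr | ANY | COMMA)*"""
    children = []
    while pos < len(tokens):
        kind, text = tokens[pos]
        if kind == "COMMAND":
            nested, pos = parse_expr(tokens, pos)  # back into expr
            children.extend(nested)
        elif kind in ("ANY", "COMMA"):
            children.append(text)
            pos += 1
        else:
            break  # RCURL (or anything else) ends this component
    return children, pos

tree, end = parse_expr(tokenize("~Foo{ ~bar{...}}"), 0)
print(tree)  # [('~Foo', [('~bar', ['...'])])]
```

A missing closing brace, as in "~Foo{ ~bar{...}", makes expect() raise a
SyntaxError, which is the "only well-formed sequences" property in
action.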

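Btw. you should probably require an "EOF" after the topmost "expr". Since
expr is also invoked recursively from component, the EOF cannot go into
expr itself (it would then be demanded inside every nested command). Put
it into a separate start rule instead (the name "document" is just an
example):
document: expr EOF;
This way the parser will require the whole input to be well-formed.
Without the EOF token the parser might just stop parsing once it can't
match any more rules, silently ignoring the rest of the input.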
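
The effect of the EOF check can be sketched with a small recognizer in
Python. This is only an illustration under assumptions: it works on a
list of token kinds instead of a real lexer, and the EOF check sits in a
hypothetical top-level function, match_document, because expr is reused
recursively and must not demand the end of input itself.

```python
def require(kinds, pos, kind):
    """Consume one token of the given kind or fail."""
    if pos >= len(kinds) or kinds[pos] != kind:
        raise SyntaxError(f"expected {kind} at token {pos}")
    return pos + 1

def match_expr(kinds, pos):
    """expr: (COMMAND LCURL component RCURL)*; returns the position reached."""
    while pos < len(kinds) and kinds[pos] == "COMMAND":
        pos = require(kinds, pos + 1, "LCURL")
        pos = match_component(kinds, pos)
        pos = require(kinds, pos, "RCURL")
    return pos

def match_component(kinds, pos):
    """component: (expr | ANY | COMMA)*"""
    while pos < len(kinds) and kinds[pos] in ("COMMAND", "ANY", "COMMA"):
        if kinds[pos] == "COMMAND":
            pos = match_expr(kinds, pos)
        else:
            pos += 1
    return pos

def match_document(kinds):
    """Hypothetical start rule: expr followed by EOF."""
    pos = match_expr(kinds, 0)
    if pos != len(kinds):  # the EOF check
        raise SyntaxError(f"unexpected {kinds[pos]} at token {pos}")
    return pos

# Token kinds for an input like "~a{x}}" with one stray closing brace.
stray = ["COMMAND", "LCURL", "ANY", "RCURL", "RCURL"]
print(match_expr(stray, 0))  # 4: stops quietly in front of the stray brace
```

Calling match_document(stray) instead raises a SyntaxError for the
trailing RCURL, which is the behaviour the EOF token buys you.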

Regards,
Martin


