[antlr-interest] A simple parser for methods and nested {}

jobapply jobapply at nextmail.ru
Sun Mar 20 14:05:17 PST 2005


I have nested command expressions like this:

~Command1{ <something1> ~Command2{ <something2> } <something3> } ...

Here <something> is anything, including {}, but it should become plain text
in the tree nodes.
Commands start with ~ and a capital letter, and the curly braces are
mandatory: ~SomeCommand{...}.

I've never written a grammar before.
I've read a heap of docs, but whenever I try to write one I keep failing.

I imagine that should become something like:

           ~Command1
          /    |     \
         /     |      \
something1  ~Command2  something3
                |
            something2


More examples:
~Command{ BLA ~Command{bla{bla}} Bla }
~A{b{}}
~A{b{~C{c}}}

The last one should become something like:

   ~A
 /  |  \
b{  ~C  }
    |
    c
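
To make the intended result concrete, here is a small hand-written sketch in
Python (not ANTLR) that builds the tree described above. It assumes that a
command name is a capital letter followed by letters or digits, that the
opening brace follows the name immediately, and that whitespace is kept as
part of the text. The names parse, parse_items and COMMAND_RE are made up for
this illustration only.

```python
import re

# Assumed command syntax: '~', a capital letter, letters/digits, then '{'.
COMMAND_RE = re.compile(r"~[A-Z][A-Za-z0-9]*\{")


def parse_items(src, pos=0, inside_command=False):
    """Return (items, next_pos); items are strings or (name, children) tuples."""
    items = []
    text = []   # characters of the current plain-text run
    depth = 0   # nesting depth of plain (non-command) braces

    def flush():
        if text:
            items.append("".join(text))
            text.clear()

    while pos < len(src):
        m = COMMAND_RE.match(src, pos)
        if m:
            # A command starts here: close the text run and recurse.
            flush()
            name = m.group()[:-1]
            children, pos = parse_items(src, m.end(), inside_command=True)
            items.append((name, children))
            continue
        ch = src[pos]
        if ch == "{":
            depth += 1
        elif ch == "}":
            if depth == 0 and inside_command:
                # This brace closes the enclosing command.
                flush()
                return items, pos + 1
            depth = max(depth - 1, 0)
        text.append(ch)  # plain braces stay part of the text
        pos += 1
    if inside_command:
        raise ValueError("unclosed command brace")
    flush()
    return items, pos


def parse(src):
    items, _ = parse_items(src)
    return items


print(parse("~A{b{~C{c}}}"))
```

For ~A{b{~C{c}}} this gives ~A with the children "b{", ~C (holding "c") and
"}", which matches the tree above.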

This is my grammar. It works, but it splits { and } into separate nodes even
when curlCounter != 0.
Is there a way to prevent this split into tokens, or some other way to solve
the problem?

----------------------------

class ExpressionParser extends Parser;
options { buildAST=true; }

tokens {
    COMMAND<AST=CommandNode>;
}

{ public static int curlCounter=0; }


expr: (COMMAND^)? lcurl components rcurl;
lcurl: LCURL { curlCounter++; } ;
rcurl: RCURL { curlCounter--; };
components: component components |;
component: expr | ANY | COMMA!;

class ExpressionLexer extends Lexer;

LCURL: '{';
RCURL: '}';

ANY: (~('~' | '{' | '}' | ','))+;
COMMA: ',';   // yep I have a list of params, but that does not matter

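// '~', a capital letter, then letters or digits, so ~Command1 is one token
COMMAND: '~' ('A'..'Z') ('a'..'z' | 'A'..'Z' | '0'..'9')*;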
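
One possible way to avoid the split, sketched here in Python rather than
ANTLR, is to post-process the token stream before it reaches the parser: a
brace counts as structural only if it directly follows a COMMAND (or closes
such a brace), and every other brace is relabelled as ANY and merged with the
neighbouring text. This assumes a filter like this can be placed between the
lexer and the parser. The token types mirror the lexer rules above, and the
function name fold_plain_braces is hypothetical.

```python
def fold_plain_braces(tokens):
    """Relabel plain braces as ANY and merge adjacent ANY tokens.

    tokens is a list of (type, text) pairs using the lexer's token names.
    """
    out = []
    stack = []        # True for a command's brace, False for a plain one
    prev_type = None  # type of the previous token as the lexer produced it
    for tok_type, text in tokens:
        original_type = tok_type
        if tok_type == "LCURL":
            structural = prev_type == "COMMAND"
            stack.append(structural)
            if not structural:
                tok_type = "ANY"
        elif tok_type == "RCURL":
            # A stray closing brace with nothing open is treated as text.
            structural = stack.pop() if stack else False
            if not structural:
                tok_type = "ANY"
        if tok_type == "ANY" and out and out[-1][0] == "ANY":
            out[-1] = ("ANY", out[-1][1] + text)  # extend the text run
        else:
            out.append((tok_type, text))
        prev_type = original_type
    return out


# Tokens the lexer would produce for ~A{b{~C{c}}}
tokens = [
    ("COMMAND", "~A"), ("LCURL", "{"), ("ANY", "b"), ("LCURL", "{"),
    ("COMMAND", "~C"), ("LCURL", "{"), ("ANY", "c"), ("RCURL", "}"),
    ("RCURL", "}"), ("RCURL", "}"),
]
folded = fold_plain_braces(tokens)
```

With the braces folded this way, the parser would only ever see LCURL and
RCURL tokens that belong to a command, so curlCounter would no longer be
needed.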


I tried to explain what I need as clearly as I can. Sorry it took so long.


