[antlr-interest] Language-Neutral Actions

Mon Apr 18 04:10:50 PDT 2005

(I wasn't recommending the "@" notation - just an example from some old
parser generator...)

> What is the main purposes of action code in grammar? I think 
> it is 1. Tree building.

Not everyone writes trees. (I have yet to write one other than for my
compiler class, and that was with PCCTS and I did my own custom trees...)
My main uses of ANTLR:
* small expression parsers that are part of bigger tools, using action code
to perform evaluations
* XML readers, building some data structure (not ANTLR trees)

I think trees are useful if there are intermediate steps to perform, like
optimization, or you're writing something that has multiple language specs
but a common IR that you want to process. For anything else, I think it's an
unnecessary middle structure.

[Now if "trees" could be any old JavaBean structure, as opposed to a fixed
AST interface definition, I'd be very interested, as you should be able to
build any data structure you want... But that's another concept...]

I usually have a "behavior" interface (a strategy) that has methods that are
called in the action code.
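
To make the "behavior" idea concrete, here is a rough Java sketch of what such a strategy interface and two implementations might look like. Only the method name doFunctionCall comes from the grammar snippets in this post; the interface name Behavior, the class names, the String/List parameter types, and the Object return value are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical strategy interface called from grammar actions.
// Only the method name doFunctionCall appears in the post; the
// signature and return type are assumptions.
interface Behavior {
    Object doFunctionCall(String name, List<Object> args);
}

// One possible strategy: evaluate the call as soon as it is parsed.
class EvaluatingBehavior implements Behavior {
    @Override
    public Object doFunctionCall(String name, List<Object> args) {
        if (name.equals("sum")) {
            double total = 0;
            for (Object arg : args) {
                total += ((Number) arg).doubleValue();
            }
            return total;
        }
        throw new IllegalArgumentException("unknown function: " + name);
    }
}

// Another strategy: only record what the parser saw.
class RecordingBehavior implements Behavior {
    final List<String> calls = new ArrayList<>();

    @Override
    public Object doFunctionCall(String name, List<Object> args) {
        calls.add(name + args);
        return null;
    }
}
```

The grammar stays the same whichever implementation is plugged in; the action code only ever calls through the interface.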

The only other code I typically have in actions is something to "gather"
stuff in loops. For example:

functionCall
  {
      List args = new ArrayList();
      Object arg;
  }
  : id:IDENT LPAREN
    ( arg=value { args.add(arg); } )*
    RPAREN
    { behavior.doFunctionCall(id, args); }
  ;

I could use the behavior to gather args (making it stateful):

functionCall
  {
      Object arg;
  }
  : id:IDENT LPAREN
    ( arg=value { behavior.addArg(arg); } )*
    RPAREN
    { behavior.doFunctionCall(id); }
  ;
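
On the Java side, a stateful behavior along those lines could look roughly like the sketch below. The method names addArg and doFunctionCall are taken from the grammar above; the class name StatefulBehavior, the String parameter, and the log field are assumptions added so the example can be run on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stateful behavior: the parser feeds arguments in one
// at a time, and doFunctionCall consumes whatever has accumulated.
class StatefulBehavior {
    private final List<Object> pendingArgs = new ArrayList<>();

    // Assumption: a simple log so the effect of each call is visible.
    final List<String> log = new ArrayList<>();

    void addArg(Object arg) {
        pendingArgs.add(arg);
    }

    void doFunctionCall(String name) {
        // Copy the gathered arguments, then reset for the next call.
        List<Object> args = new ArrayList<>(pendingArgs);
        pendingArgs.clear();
        log.add(name + args);
    }
}
```

One caveat with a single flat list: if value can itself be a functionCall, as in f(1, g(2)), the inner call would also pick up the outer call's pending arguments. The behavior would need something like a stack of lists to handle that, which is part of why gathering in the grammar can be simpler.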

Then, with the new support for named rule calls (I might be off on the
syntax):

functionCall
  : IDENT LPAREN
    ( value { behavior.addArg($value); } )*
    RPAREN
    { behavior.doFunctionCall($IDENT); }
  ;

and we could abstract it to have "behavior" implied:

options {
  languageNeutral=true;
}
....

functionCall
  : IDENT LPAREN
    ( value { addArg($value); } )*
    RPAREN
    { doFunctionCall($IDENT); }
  ;

though I'd like to do something cleaner with the list of values. It would be
cool to be able to somehow label a (...)* to indicate that we want to
capture it as a "list":

options {
  languageNeutral=true;
}
....

functionCall
  : IDENT LPAREN args:( value )* RPAREN
    { doFunctionCall($IDENT, args); }
  ;

or something like that. Nice and clean and readable ;)

> 2. Semantic predicates to handle difficult cases.
> (If I not miss something important) All other code should be 
> separated from grammar for better decomposition and easer maintenance.

Definitely -- there shouldn't be logic in there...

Later,
-- Scott