[antlr-interest] CommonTree & Tree grammar versus DIY
Gerald Rosenberg
gerald at certiv.net
Thu Aug 21 14:04:56 PDT 2008
At 11:07 AM 8/21/2008, Terence Parr wrote:
>On Aug 20, 2008, at 7:42 PM, Gerald Rosenberg wrote:
>
>>Antlr could directly generate at least the low-level API. For
>>example, consider an AST that is the underlying data structure for
>>an HTML editor. A grammar to generate the desired API might be
>>specified as:
>>
>> access grammar html;
>>
>> start_tag : open_tag ID ^( name ^( attr )* )*
>> => find (int start_node, boolean direction,
>> String $ID.text ) returns [int node_index]
>> => find (int start_node, boolean direction,
>> String $ID.text, String name, String attr ) returns [int node_index]
>> => create (String $ID.text, String name, List attr
>> ) returns [$start_tag.tree]
>> => copy (int node_index) returns [$start_tag.tree]
>> => insert (int node_index, $start_tag.tree)
>> returns [boolean status]
>> => delete (int node_index) returns [$start_tag.tree]
>> ;
>>
>>This is not far off from a tree grammar: tersely abstracted, but
>>still providing sufficient information to unambiguously define
>>implementation of the API. The generated code will be no more
>>fragile than that produced from a tree grammar. Add in
>>heterogeneous tree node support and it is a rather complete
>>solution. Non-trivial, but complete. The devil is in figuring out
>>the appropriate grammar syntax for defining the API productions --
>>what is shown is good for discussion, but probably not much more.
>
>So, ANTLR's job would be to fill in those find/create/... methods?
Exactly.
>I'm not sure he could figure that out from the argument list.
The necessary information content is there. For example, consider the
equivalence of:
>> access grammar html;
>>
>> start_tag : open_tag ID ^( name ^( attr )* )*
>> => find (int start_node, boolean direction, String $ID.text,
>> String name, String attr ) returns [int node_index] ;
with:
tree grammar html;
start_tag : open_tag
    { (direction && $open_tag.node_index > start_node)
      || (!direction && $open_tag.node_index <= start_node) }?
    ID { $ID.text.equals("someIDString") }?
    ^( n=name ^( a=attr
        { $n.equals("someNameString") && $a.equals("someAttrString") }? )+ )+
    -> { return $open_tag.node_index } ;
Likewise, you could emulate the remaining functionality of the access
grammar with a set of tree grammars; a separate grammar would be
needed for each combination of node type and API operation. (The tree
grammar syntax, used in this manner, is messy/noisy, and the resulting
collection of tree walkers would be clumsy to orchestrate -- better to
have a clean, purpose-defined grammar syntax that directly produces a
conventional-looking API.)
So, to answer your concern, the given structure of the node is
sufficient to define the scope/nesting of where the elements of the
argument list need to be tested. The same thing is already done
implicitly in standard tree rewrites -- basically the same as figuring
out where to put the TYPE and DEF:
tree grammar html2;
start_tag : open_tag ID ^( name ^( attr )* )*
-> open_tag ID TYPE ^( name ^( DEF attr )* )* ;
The production grammar syntax needs to be better designed to make the
intent of the access grammar more explicit -- as previously noted,
the syntax shown is good for discussion, but probably not much more.
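
To make the target concrete, here is a rough hand-written Java sketch of
what the two find productions might expand to. It is only an illustration
for discussion: StartTagNode, HtmlAccessSketch, add and with are names
invented here, not anything ANTLR generates, and the flat node list stands
in for a real AST. It also assumes that find returns the nearest match in
the given direction and -1 when nothing matches, which the access grammar
above does not actually specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical node for: start_tag : open_tag ID ^( name ^( attr )* )* */
final class StartTagNode {
    final String idText;                        // text of the ID token
    final Map<String, List<String>> attrsByName // each name with its nested attrs
            = new LinkedHashMap<>();

    StartTagNode(String idText) {
        this.idText = idText;
    }

    /** Adds one ^( name ^( attr )* ) group and returns this node for chaining. */
    StartTagNode with(String name, String... attrs) {
        attrsByName.put(name, Arrays.asList(attrs));
        return this;
    }
}

/** Hypothetical low-level API of the kind an access grammar might produce. */
final class HtmlAccessSketch {
    private final List<StartTagNode> nodes = new ArrayList<>();

    /** Appends a node and returns its node_index. */
    int add(StartTagNode node) {
        nodes.add(node);
        return nodes.size() - 1;
    }

    /** find (int start_node, boolean direction, String $ID.text) */
    int find(int startNode, boolean direction, String idText) {
        return find(startNode, direction, idText, null, null);
    }

    /**
     * find (int start_node, boolean direction, String $ID.text,
     *       String name, String attr)
     * direction == true  tests nodes with node_index >  start_node;
     * direction == false tests nodes with node_index <= start_node.
     * Assumption: the nearest match wins; -1 means no match.
     */
    int find(int startNode, boolean direction, String idText,
             String name, String attr) {
        if (direction) {
            for (int i = Math.max(startNode + 1, 0); i < nodes.size(); i++) {
                if (matches(nodes.get(i), idText, name, attr)) return i;
            }
        } else {
            for (int i = Math.min(startNode, nodes.size() - 1); i >= 0; i--) {
                if (matches(nodes.get(i), idText, name, attr)) return i;
            }
        }
        return -1;
    }

    /** The node structure fixes where each argument is tested. */
    private static boolean matches(StartTagNode node, String idText,
                                   String name, String attr) {
        if (!node.idText.equals(idText)) return false; // tested at ID
        if (name == null) return true;                 // three-argument form
        List<String> attrs = node.attrsByName.get(name); // tested at name
        return attrs != null && (attr == null || attrs.contains(attr)); // at attr
    }
}
```

With nodes "a", "img", "a" added in that order, find(0, true, "a") would
skip node 0 and land on node 2, while find(0, false, "a") would return
node 0 itself. The point is only that the nesting in the rule tells the
generator where each argument gets compared; nothing more is needed from
the argument list.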