[antlr-interest] Reuse tokens from multiple grammars in tree grammar

Robin diabeteman at gmail.com
Mon Nov 7 10:30:21 PST 2011


Thanks Jim :)

There may be something I didn't understand well about tree grammars.

Suppose I want to write a tree grammar (JavaToGeneric) that translates a
Java AST to a "generic" AST. Do I need the Java grammar tokens to be
included in the tokenVocab of my JavaToGeneric grammar? If not, how do I
rewrite/filter the Java AST?

As I'm having a hard time explaining what I'm trying to do, let me give an
example:

Let's consider a Java AST produced by the grammar files I attached to this
message. If I want a tree grammar that only renames the imaginary tokens
(tags of the AST?), do I only have to copy/paste all the rules of
Java15TreeParser.g and change the tokenVocab to 'commontokens'?

What if I want to actually modify the "shape" of the AST? For example,
taking some node information and moving it somewhere else in the tree.

What I'm trying to do is analogous to converting one XML format to another
using XSLT stylesheets. Can ANTLR be used as an
"XSL-transform-sheet-for-ASTs"?

Sorry about the silly questions :)

Robin

On Mon, Nov 7, 2011 at 6:50 PM, Jim Idle <jimi at temporal-wave.com> wrote:

> First create the lexer for say the Java language and generate it. You will
> see you get a .tokens file that looks like:
>
> CLASS=5
> IF=6
>
> and so on. You don't need to do this bit, but it shows what a .tokens file
> should look like.
>
> Now, take control of this file away from your lexer by renaming it to
> commontokens.tokens or something similar.
>
> Next, add the token names of all the different lexers and all the
> imaginary tokens you need (you can add to this as you go of course) making
> sure that the numbers you assign are contiguous.
>
> Now, all your grammars share this with:
>
> options {
>
> tokenVocab=commontokens;
>  ...
>
> }
>
> And now you have a common set of tokens and any parser producing an AST
> with such tokens produces a generic AST that you can walk with a single
> tree parser/walker, so long as that walker encompasses all the constructs
> that each individual language might need.
>
> Jim
>
>
>
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> > bounces at antlr.org] On Behalf Of Robin
> > Sent: Monday, November 07, 2011 4:58 AM
> > To: antlr-interest at antlr.org interest
> > Subject: [antlr-interest] Reuse tokens from multiple grammars in tree
> > grammar
> >
> > Hi all,
> >
> > I'm currently working on a thesis project and I need to write tree
> > grammars that translate ASTs produced by several parsers (Java, C, etc)
> > into "generic" ASTs. These "generic" ASTs should only contain basic
> > information about the source code being parsed such as function
> > signatures, class names, etc.
> >
> > I of course thought about ANTLR for this purpose but I'm facing some
> > problems:
> >
> > * How can I define a set of imaginary tokens for this "generic" AST so
> > that they can be reused in tree grammars? (such as JavaToGeneric.g,
> > CToGeneric.g, etc)
> > * If I only want a portion of an AST to be translated, can I use option
> > "filter = true"?
> >
> > I don't know if I've been clear, I could give examples of what I am
> > trying to accomplish if you need. If I'm going the wrong way, please
> > tell me so.
> >
> > Thanks in advance for your help :)
> >
> > Robin
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
> > email-address
>
>
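
For reference, the commontokens.tokens step described in the quoted reply
(collect the token names of every lexer, add the imaginary tokens, and keep
the numbers contiguous) could be scripted roughly as below. This is only a
sketch under assumptions: the helper names (parse_tokens,
merge_vocabularies, render) and the token names STRUCT, GENERIC_CLASS and
GENERIC_FUNCTION are made up for illustration, and starting the numbering
at 4 is an assumption about the first user token type in ANTLR 3 that
should be checked against your own generated .tokens files.

```python
def parse_tokens(text):
    """Return token names, in file order, from NAME=number lines."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # rpartition splits on the LAST '=', so a literal such as '='=7
        # still yields the full name on the left-hand side.
        name, _, _number = line.rpartition("=")
        names.append(name)
    return names


def merge_vocabularies(vocabularies, imaginary=(), first_type=4):
    """Assign contiguous numbers to every distinct token name.

    first_type=4 is an assumption, not something stated in this thread.
    A name shared by several lexers (e.g. IF) gets a single number.
    """
    merged = {}
    all_names = [n for vocab in vocabularies for n in vocab] + list(imaginary)
    for name in all_names:
        if name not in merged:
            merged[name] = first_type + len(merged)
    return merged


def render(merged):
    """Format the merged vocabulary as .tokens file content."""
    return "\n".join(f"{name}={number}" for name, number in merged.items())


# Hypothetical inputs: the Java lines come from the example in the reply,
# the C lines and the imaginary token names are invented.
java_tokens = "CLASS=5\nIF=6\n"
c_tokens = "IF=4\nSTRUCT=5\n"

merged = merge_vocabularies(
    [parse_tokens(java_tokens), parse_tokens(c_tokens)],
    imaginary=["GENERIC_CLASS", "GENERIC_FUNCTION"],
)
print(render(merged))
```

The printed text would be saved as commontokens.tokens and referenced from
each grammar with tokenVocab=commontokens. The original numbers in each
lexer's file are deliberately discarded, since the shared file is meant to
be the single source of token types.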
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Java15.g
Type: application/octet-stream
Size: 39854 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20111107/deb3174a/attachment.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Java15TreeParser.g
Type: application/octet-stream
Size: 14750 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20111107/deb3174a/attachment-0001.obj 