[antlr-interest] parse error because of AST rewrite rules?

David-Sarah Hopwood david-sarah at jacaranda.org
Mon Sep 21 16:41:18 PDT 2009


Stefan Oestreicher wrote:
> Hi,
> 
> I'm working on a grammar for a simple programming language and I'd like 
> to support variable declarations like that:
> int a, b = 3
> 
> I started out with the following rule:
> 
> variableDeclaration
>     :   type ID ( COMMA ID )* ( ASSIGN expression )?
>     ;
> 
> Now I'd like to generate an AST for the above example that looks like that:
> ^(VAR_DEF int a)
> ^(VAR_DEF int b)
> ^(= a 3)
> ^(= b a)

Is this your own language? Note that in C, C++, Java, C#, and other
languages with C-derived syntax, "int a, b = 3" would not assign anything
to a; it would leave a uninitialized. It's liable to confuse programmers
of these languages if your language has C-like syntax but differs on this
point. [*]

I also think you may be trying too hard to collapse semantic equivalences
while generating the AST. A declaration initializer is a different
construct from an assignment. In some languages it might be equivalent
to an uninitialized declaration followed by an assignment, but an AST is
still an abstract *syntax* tree -- it should preserve syntactic distinctions
such as that between "int a; a = 3;" and "int a = 3;".

(Whether to distinguish between "int a, b;" and "int a; int b;" is
slightly less obvious, but my personal preference would be to preserve
that distinction in the AST as well.)
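
As a rough, untested sketch of what preserving those distinctions could
look like with ANTLR 3 rewrite rules: the rule name variableDeclarator,
the stub "type" and "expression" rules, and the lexer rules are my own
placeholders, not taken from your grammar. Note also that it attaches the
optional initializer to each declarator, as C-derived languages do, which
differs from your original rule (a single trailing initializer).

```antlr
grammar Decl;

options { output = AST; }   // needed for the -> rewrite rules below
tokens  { VAR_DEF; }        // imaginary token used as the subtree root

// One VAR_DEF node per declaration statement; "int a, b;" therefore
// stays distinct from "int a; int b;".
variableDeclaration
    :   type variableDeclarator ( COMMA variableDeclarator )*
        -> ^(VAR_DEF type variableDeclarator+)
    ;

// The initializer hangs off the declared name, so it is never turned
// into an ASSIGN node and stays distinct from a later "a = 3".
variableDeclarator
    :   ID ( ASSIGN expression )?
        -> ^(ID expression?)
    ;

// Placeholder rules so the sketch stands alone (assumptions).
type        : 'int' ;
expression  : INT ;

COMMA   : ',' ;
ASSIGN  : '=' ;
ID      : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '0'..'9' | '_')* ;
INT     : ('0'..'9')+ ;
WS      : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```

With these rules, "int a, b = 3" should come out as something like
(VAR_DEF int a (b 3)): a has no initializer child and b has one. Whether
a later pass treats a as uninitialized or rejects its use is then a
semantic decision made on the tree, not something baked into the parser.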


[*] Whether your language allows uninitialized variables to be *used* is
    a separate issue. Here C and C++ are unsafe and there is good reason
    to diverge from them, as Java and C# do by enforcing "definite
    assignment".

-- 
David-Sarah Hopwood  ⚥  http://davidsarah.livejournal.com


