[antlr-interest] Questions regarding ANTLRv3.g

Tue Mar 4 21:19:34 PST 2008

Hi,

I'm writing a pretty printer for ANTLR v3 grammars using the ANTLRv3.g grammar from the examples section of the website (http://www.antlr.org/grammar/ANTLR). While doing so, I ran into several questions that I hope someone can answer.

1) There are two copies of ANTLRv3.g in the Fisheye revision control system (http://fisheye2.cenqua.com/): http://fisheye2.cenqua.com/browse/antlr/src/org/antlr/tool/ANTLRv3.g?r=4556 and http://fisheye2.cenqua.com/browse/antlr-examples/java/ANTLR/ANTLRv3.g?r=4288. They seem to be out of sync.

In the second file ("...4288")  token types include TREE_BEGIN, ROOT, BANG, RANGE, REWRITE; in the first file ("...4556"), these token types are not declared in the tokens section. What is the reason for the difference?

In the first file, ruleScopeSpec has one production, and it specifies that ids are not comma-separated; in the second file, ruleScopeSpec has three productions, and ids are comma-separated. Which is the correct syntax?

In the first file, the symbol SRC has the modifier "protected"; in the second file, SRC has the modifier "fragment". I thought "protected" was changed to "fragment" in version 3 of ANTLR, and that version 3 would accept only "fragment". Is that not the case?

In the first file, ACTION_CHAR_LITERAL is defined as:

fragment
ACTION_CHAR_LITERAL
 : '\'' (ACTION_ESC|.) '\''
 ;

In the second file, it is defined as:

fragment
ACTION_CHAR_LITERAL
 : '\'' (ACTION_ESC|~('\\'|'\'')) '\''
 ;

What is the purpose of the difference?  Which is the correct syntax?
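
To make the difference concrete, here is a rough Python sketch that approximates the two definitions with regular expressions. It assumes ACTION_ESC is simply a backslash followed by any single character, which may not match the real rule. A backtracking regex also does not behave like ANTLR's lexer prediction, so the output is only illustrative. The names FIRST, SECOND, accepts and compare are made up for this sketch.

```python
import re

# Assumption: ACTION_ESC is approximated as a backslash followed by any one character.
ACTION_ESC = r"\\."

# First file:  '\'' (ACTION_ESC|.) '\''
FIRST = re.compile(r"'(?:" + ACTION_ESC + r"|.)'", re.DOTALL)

# Second file: '\'' (ACTION_ESC|~('\\'|'\'')) '\''
SECOND = re.compile(r"'(?:" + ACTION_ESC + r"|[^\\'])'", re.DOTALL)


def accepts(pattern, text):
    """Return True if the whole text matches the approximated rule."""
    return pattern.fullmatch(text) is not None


# Plain char, escaped char, escaped quote, bare quote, bare backslash.
SAMPLES = ["'a'", r"'\n'", r"'\''", "'''", "'\\'"]


def compare(samples=SAMPLES):
    """Map each sample to (accepted by first form, accepted by second form)."""
    return {s: (accepts(FIRST, s), accepts(SECOND, s)) for s in samples}


if __name__ == "__main__":
    for text, (first, second) in compare().items():
        print(f"{text!r:10} first={first} second={second}")
```

Under these assumptions the two forms disagree only on a bare quote or a bare backslash between the delimiters: the wildcard form accepts both, while the negated-set form rejects them.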

In addition, it looks like the build for ANTLR (http://fisheye2.cenqua.com/browse/~raw,r=4540/antlr/build.xml) uses the file antlr.g rather than ANTLRv3.g. That file is an ANTLR version 2 input grammar, as are several other ".g" files that make up the build.

What is the situation with the ANTLRv3.g grammar? Will the two copies be kept in sync, and will the new grammar be used in the build for ANTLR?

2) The example grammar http://fisheye2.cenqua.com/browse/antlr-examples/java/ANTLR/ANTLRv3.g?r=4288 appears to have a bug in the tree construction for the third production of elementNoOptionSpec, in which the ebnfSuffix is completely lost from the tree:

elementNoOptionSpec :
 atom
  ( ebnfSuffix -> ^(BLOCK["BLOCK"] ^(ALT["ALT"] atom EOA["EOA"]) EOB["EOB"])
  |    -> atom
  )

For example, if one runs ANTLRWorks using ANTLRv3.g (http://www.antlr.org/grammar/ANTLR/ANTLRv3.g) with the input:

grammar test;

a : 'A'
  | 'B' a?
  | 'C' a*
  | 'D' (a)?
  | 'F' (a)*
  ;

then the constructed AST does not seem to have any nodes for '?' or '*' in the 2nd and 3rd productions. I can only guess that the rule should have been:

elementNoOptionSpec :
 atom
  ( ebnfSuffix -> ^( ebnfSuffix ^(BLOCK["BLOCK"] ^(ALT["ALT"] atom EOA["EOA"]) EOB["EOB"]))
  |    -> atom
  )

Is this right?
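
To show what I mean by the suffix being lost, here is a small Python sketch that models only the shape of the two rewrites. It is not ANTLR's actual tree construction. Trees are plain nested tuples of the form (root, child, ...), and the function names are hypothetical helpers invented for this illustration.

```python
# Hypothetical helpers (not part of ANTLR); a tree is a nested tuple (root, child, ...).

def block_around(atom):
    """Model of ^(BLOCK ^(ALT atom EOA) EOB)."""
    return ("BLOCK", ("ALT", atom, "EOA"), "EOB")


def rewrite_as_posted(atom, suffix=None):
    """Model of the rule as posted: the suffix is matched but never emitted."""
    if suffix is None:
        return atom
    return block_around(atom)


def rewrite_as_proposed(atom, suffix=None):
    """Model of the guessed fix: the suffix becomes the root of the subtree."""
    if suffix is None:
        return atom
    return (suffix, block_around(atom))


def flatten(tree):
    """List every node label in the tree, roots included."""
    if isinstance(tree, tuple):
        return [leaf for child in tree for leaf in flatten(child)]
    return [tree]


if __name__ == "__main__":
    print(rewrite_as_posted("a", "?"))
    print(rewrite_as_proposed("a", "?"))
```

In this model the first function returns the same tree for "a?" and "a*", which is the symptom I see in ANTLRWorks, while the second keeps the suffix as the root.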

3) Incidentally, when I debug the ANTLRv3.g grammar in ANTLRWorks v1.1.7, it produces nice-looking trees, but it also raises one or more "javax.swing.text.BadLocationException" exceptions. Is there a more recent version of ANTLRWorks that I can build? The source does not appear to be in Fisheye. Where can I find it?

Thanks.

Ken Domino