[antlr-interest] C Target updates

Thu Apr 19 18:30:03 PDT 2007

The C target has been substantially updated over the last few days, with
a number of improvements and bug fixes as well as the addition of
examples for using the C target (these should be available from FishEye
shortly; the examples tree was missing from FishEye, and Ter has now
requested it as an addition).

However, until Ter releases a formal update, which I assume will be b9,
taking advantage of the examples means getting the latest distribution
from the source tree (tagged as b8, though I realize now that it should
really be b9, and I will correct this shortly), building the runtime
yourself, and also building the ANTLR jar so that it includes the new
C.stg file. It may be easier to wait for b9, which I would imagine is
not too far away (after Ter's changes to tree rewriting, I guess).

I shall be adding the rest of the example tree to the source as soon as
possible, which I hope will engender/enable more use of the C target and
aid those of you who have been struggling a little without some formal
examples. Currently, the examples include: C Parser, Java Parser, island
grammar (how to step out of the lexer and parse a completely different
'language' then revert to the originating language), global and rule
dynamic scope, hoisted predicates, and filtering lexer. There are a few
remaining examples to convert to the C target, but they are mostly
repeats of the same thing, other than those that use stringtemplates,
for which I have yet to decide on a strategy.

For those that have grammars targeted at C, you should note the
following changes, which may affect your existing grammars. This is a
heads up to help solve any seemingly strange issues, or to help you
provide feedback to me on anything that used to work and now does not.
Firstly, things that have changed that probably won't affect you
negatively:

1)      The memoize option now works. This obviously affects code
generation a little, so it is possible that I broke something while
implementing this, though I don't predict this to be the case;

2)      The backtracking option is now 'fully' tested, especially in
conjunction with memoize=true. Again, this affects code generation;
though I have tested what I can, it is possible that I have broken
things. The changes were minor, so I do not expect this to be the case;

3)      Error recovery is based on bitsets defining tokens that can
follow any particular point. I finally got around to making these static
declarations, which speeds up the initialization process for
{Tree}Parsers. If your error recovery/reporting changes at all, please
let me know. My next task is to update the default error reporting
functions, which I know have at least 2 bugs that could cause a SIGSEGV,
though I expect most people will override these.

4)      I have noticed, by virtue of the C grammar, which both backtracks
and memoizes, that at least in debug mode the code executes slower than
I would like. I think I know why (default options for hashtables), but I
have not yet been able to confirm it: I am using Vista as a development
platform, Compuware won't let me have the beta that supports that
platform, and Parallels for Windows has bugs that their pathetic support
is not helping me with. I need to buy another Mac!
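
To make item 3 a little more concrete, here is a minimal sketch of what
a statically declared follow set and a resync loop can look like in
plain C. It is only an illustration of the idea: the type follow_set,
the functions follow_contains and resync, the FOLLOW_expr set and the
token numbers are all made up for this example and are not the names or
layout used by the generated code or the runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical token types; a real grammar generates its own numbers. */
enum { TOK_EOF = 0, TOK_ID = 4, TOK_SEMI = 9, TOK_RPAREN = 70 };

/* Hypothetical bitset: one bit per token type, 64 token types per word. */
typedef struct {
    const uint64_t *words;
    size_t          nwords;
} follow_set;

/* Declared statically, so the set is laid out at compile time and
 * creating a parser needs no allocation or setup for it. */
static const uint64_t FOLLOW_expr_words[] = {
    UINT64_C(1) << TOK_SEMI,            /* token types 0..63   */
    UINT64_C(1) << (TOK_RPAREN - 64)    /* token types 64..127 */
};
static const follow_set FOLLOW_expr = { FOLLOW_expr_words, 2 };

/* Returns 1 if the token type is a member of the set, otherwise 0. */
int follow_contains(const follow_set *set, unsigned token)
{
    size_t word = token / 64;
    if (word >= set->nwords) {
        return 0;
    }
    return (int)((set->words[word] >> (token % 64)) & 1u);
}

/* Recovery sketch: skip tokens from pos until one that may follow the
 * rule is found; returns its index, or n if none is found. */
size_t resync(const unsigned *tokens, size_t n, size_t pos,
              const follow_set *set)
{
    while (pos < n && !follow_contains(set, tokens[pos])) {
        pos++;
    }
    return pos;
}
```

With the token stream ID ID SEMI ID and the set above, resync starting
at index 0 stops at the SEMI, which is the first token that can follow
the hypothetical expr rule.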

There is only one change that may well affect your grammars as they
stand, and that is that I have changed all the macros defined for both
easier code generation and ease of use by C grammar programmers from
camel case to the more traditional UPPER case used by C programmers. My
original idea was that the Java examples would more easily translate to
C usage, but I now find that it is too annoying for C programmers (i.e.
me), so instead of getText() you now have GETTEXT(), which is more
obviously a macro. Anything like this that used to work is probably
fixed by upper-casing what you have.
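
As a rough illustration of why the upper case form reads better in C,
consider the sketch below. Everything in it is hypothetical (the
demo_token type, demo_get_text, demo_action_length, and in particular
the way GETTEXT() is defined); the macro definitions in the generated
code are different, so treat this purely as a picture of the naming
convention.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for an object with a function-pointer "method". */
typedef struct demo_token {
    const char *text;
    const char *(*getText)(struct demo_token *self);
} demo_token;

const char *demo_get_text(demo_token *self)
{
    return self->text;
}

/* Hypothetical convenience macro: it quietly relies on a variable named
 * 'tok' being in scope, which is exactly why it helps for it to look
 * like a macro rather than an ordinary function call. */
#define GETTEXT() (tok->getText(tok))

/* Stand-in for a grammar action that uses the macro. */
size_t demo_action_length(demo_token *tok)
{
    return strlen(GETTEXT());
}
```

In an action, GETTEXT() now stands out from a genuine function call such
as strlen(), so a reader knows to go looking for a macro definition
rather than a function.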

Things to do:

1)      Profile and memory check again with BoundsChecker;

2)      Verify codegen on more platforms - should not be an issue, but
thanks to those who have access to C compilers that I do not, such as
the Intel compiler;

3)      Update the wiki with extensive explanation on how to do things;

4)      Correct typos in comments - I have had more feedback on this
than anything else, and my explanation is that I type at the furious
rate of 100000000 characters a second, but most of what I type is <BS>
;-)

5)      Sell soul to devil;

Feedback with respect always appreciated,

Jim
