[antlr-interest] C generator is not generating @after actions
Gavin Lambert
antlr at mirality.co.nz
Thu Feb 5 14:58:15 PST 2009
At 05:31 6/02/2009, Jim Idle wrote:
>You are correct that it isn't exactly semantically equivalent,
>but I have never seen a case where people wanted to do anything
>different. The exception clause would generally be more useful.
>In a real @after clause, you would have to make sure you checked
>any references for NULL before trying to do anything anyway. So,
>at least for the more usual case, it makes more sense to have the
>action that you want to happen at the rule end, when it is
>successful, in an action at the end. Then, if there is something
>special you need to do upon failure, you want an exception clause
>like the Java target. I think most people are reading @after and
>just see "when this rule finishes successfully".
I think the theory is that if the @init does something that needs
cleaning up after -- such as allocating memory or opening a file
-- then the @after does the cleanup. In practice for C# / Java
targets it's not critical, as the GC will eventually get around to
tidying things up anyway (though not useless, if the rule is
re-entered faster than the GC can tidy up), but for C it'd be more
useful.
Having said that, there's always another way to write the code to
avoid that kind of dependency anyway, so in practice I've never
needed to do it that way.
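
To make the cleanup concern concrete, here is a minimal hand-written C
sketch of a rule function whose "@init" work is undone at a single exit
point, so the cleanup runs whether the rule matches or not. This is not
what the C target actually generates; parse_rule, should_fail and
cleanup_count are hypothetical names used only for illustration.

```c
#include <stdlib.h>

/* Hypothetical counter, present only so the cleanup is observable. */
static int cleanup_count = 0;

/* Hypothetical stand-in for a generated rule function.
 * Returns 1 if the rule "matched", 0 otherwise. */
static int parse_rule(int should_fail)
{
    int ok = 0;

    /* "@init": acquire a resource the rule body needs. */
    char *scratch = malloc(64);
    if (scratch == NULL) {
        return 0;            /* nothing acquired, nothing to clean up */
    }

    /* Rule body: a match failure jumps to the common exit. */
    if (should_fail) {
        goto rule_exit;
    }
    ok = 1;

rule_exit:
    /* "@after"-style cleanup: reached on success and on failure. */
    free(scratch);
    cleanup_count++;
    return ok;
}
```

Calling parse_rule(0) and then parse_rule(1) should leave cleanup_count
at 2, since both paths pass through rule_exit.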
>I still plan on working through all the possible combinations
>before 3.1.2 is released. The difficulty is not adding an @after
>section of course, but that using a static template, can I make
>sure that all possible code paths, given all possible rule
>element combinations, including backtracking, @after, exceptions
>and so on, thread their way correctly through the generated C
>code and are semantically equivalent to the Java code. The answer
>may well be yes, but I want to consider performance and complexity,
>as if there is a semantically equivalent way of expressing this
>in the grammar, then it might make more sense to just instruct
>people in the documentation.
I can't see how you could ever make the C target tolerate
exceptions being thrown mid-rule without turning it into a partial
C++ target :)
Besides, nobody ever reads documentation anyway ;)
>So, in the C target I have removed pretty much all the NULL
>guards as it is better to get a violation than mask
>grammar/coding errors. In the case of the return from a rule, the
>return is in fact a struct, which is declared as such in the
>calling rule. The struct in the calling rule will therefore never
>be NULL, and memsetting it to 0 does not solve that issue, though
>it could have a special field that says if it has been used yet
>and so on.
Memsetting it to 0 will clear the contents of the struct, though,
thereby ensuring that any embedded pointers etc will actually be
NULL and will fail quickly instead of being some random address
that happened to be on the heap (or worse, a valid address put
into a previous instance of the same structure that's being
reused), which will fail subtly rather than obviously and be just
as hard to track down.
I can understand your reasoning, though; once the grammar *is*
doing the right things then the memsets are just wasting
time. But during initial development and debugging they're
invaluable to prevent subtle bugs, as you yourself basically
admitted in the paragraph prior to the one I quoted.
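
A small sketch of what I mean by clearing the return struct before use.
The struct layout and function names here are made up for illustration
and are not the C runtime's real types. It also assumes a platform where
all-bits-zero is a null pointer, which holds on mainstream targets but
is not guaranteed by the C standard.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical rule return struct; field names are illustrative only. */
typedef struct rule_return {
    void *start;
    void *stop;
    void *tree;
} rule_return;

/* Clear the struct so pointers left over from an earlier use of the
 * same storage cannot survive into this invocation.
 * Assumption: all-bits-zero reads back as NULL on this platform. */
static void rule_return_reset(rule_return *r)
{
    memset(r, 0, sizeof *r);
}

/* Returns 1 if a tree pointer has been filled in, 0 otherwise. */
static int rule_return_has_tree(const rule_return *r)
{
    return r->tree != NULL;
}
```

With the reset in place, a stale tree pointer from a previous use shows
up as NULL and fails at the first dereference rather than somewhere
far away.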
Maybe grammars should have an additional option, telling ANTLR
whether to aim for robustness (thereby including extra sanity
checks, such as the memsets) or for performance (leaving them out,
once the author is happy that their grammar works properly, if
slowly). The default should be for robustness, so that newly
developed grammars get sanity checked. In fact, rather than
memsetting to 0, you could take a page from VC++ and memset to
0xCD when in robust mode (and do nothing in performance mode),
thereby basically guaranteeing a crash in robust mode if someone
tries to use something without initialising it first, since NULL
checks wouldn't work.
(A grammar compiled with -debug should probably also use robust
mode regardless of the option, but there should still be the
separate option for non -debug compiles, since not everyone uses
-debug at all.)
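
Roughly what I have in mind, as a sketch only: GRAMMAR_ROBUST is a
hypothetical compile-time flag (no such ANTLR option exists today), and
POISON_STRUCT and demo_return are likewise invented names. In robust
mode the struct is filled with 0xCD; in performance mode the macro
compiles to nothing.

```c
#include <string.h>

/* GRAMMAR_ROBUST is a hypothetical build flag, not an existing ANTLR
 * option. It defaults to robust mode so new grammars get the checks. */
#ifndef GRAMMAR_ROBUST
#define GRAMMAR_ROBUST 1
#endif

#if GRAMMAR_ROBUST
/* Robust mode: fill with 0xCD so an uninitialised pointer is neither
 * NULL nor likely to be a valid address. */
#define POISON_STRUCT(p) memset((p), 0xCD, sizeof *(p))
#else
/* Performance mode: no work at all. */
#define POISON_STRUCT(p) ((void)0)
#endif

/* Hypothetical return struct used only to demonstrate the macro. */
typedef struct demo_return {
    void *tree;
    int   used;
} demo_return;
```

A rule would call POISON_STRUCT(&retval) on entry; building with
-DGRAMMAR_ROBUST=0 removes the fill once the grammar is trusted.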