[antlr-interest] 3.0 multiple language support

Thomas Dudziak tomdz at first.gmd.de
Wed Aug 4 09:43:42 PDT 2004


Sebastian Kaliszewski wrote:

>>>I really don't like this. Actions (esp. predicates) are part of the grammar. 
>>>Separating them makes code much harder to read, analyse and maintain, as one 
>>>has to jump around the code text.
>>>Macros might be a nice idea, but they should be intermixable with the 
>>>parser/lexer definition in the same file.
>>
>>I find Michael's solution much easier to maintain. Imagine the Java 
>>grammar with embedded actions for all supported languages - it will 
>>surely not be readable anymore.
> 
> 
> I'm not seeing that as unreadable. But spreading code that does a particular 
> (simple) thing across many places in different files is.
> Besides, it's easy to filter out the actions for all but one particular 
> language (if the syntax is clear, like: {c++: /* rule here */}). There could 
> even be such an action filter tool accompanying the ANTLR distribution.

This is the "with the proper tool" argument that was given before.
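
For what it's worth, such a filter is small if the tagging syntax is as
simple as the {c++: ...} form suggested above. A minimal sketch in Python,
assuming that tag syntax and actions without nested braces; the name
filter_actions is made up for illustration and is not part of ANTLR:

```python
import re

# Hypothetical tagged-action syntax from this thread: {lang: code}.
# Assumption: actions contain no nested braces. The (?!:) keeps an
# untagged action such as { foo::bar(); } from being read as a tag.
TAGGED_ACTION = re.compile(r"\{\s*([A-Za-z+#]+)\s*:(?!:)\s*([^{}]*)\}")

def filter_actions(grammar: str, target: str) -> str:
    """Keep actions tagged for `target` (tag stripped) and drop all others."""
    def keep_or_drop(match):
        lang, code = match.group(1), match.group(2).strip()
        return "{ " + code + " }" if lang == target else ""
    return TAGGED_ACTION.sub(keep_or_drop, grammar)

grammar = 'NEW_LINE : "\\n" {java: newline();} {c++: newline();} ;'
java_view = filter_actions(grammar, "java")
print(java_view)
```

A real tool would need a proper scanner rather than a regular expression,
since actions routinely contain braces, strings and comments.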

>>Defining labels, e.g.
>>"process_some_rule" is far better IMO.
> 
> 
> If someone needs to do some more complicated processing which is logically 
> separate one should put that processing into additional function/procedure 
> -- this is a prime good coding rule in all general purpose languages. But 
> for simple stuff this makes no sense. Simple stuff should be done inline.
> And from my experience most of the in-grammar actions are such simple stuff.
> 
> The separate actions proposition would, for example, force me to do 
> something like this:
> 
> NEW_LINE
>    : "\n" { process_new_line() }
> 
> And then in separate file:
> process_new_line()
> {
>    my_line_count++;
>    nextLine();
> }
> 
> 
> No, thanks... I vote strongly against such nonsense.

Right, but this is precisely what I was talking about when I said that 
most of these "one-liners" are actually things that are far better 
handled by ANTLR itself. We still have to invoke newline when parsing 
line delimiters, which is not something I want to have to worry about. 
These are the typical one-liners (newline, set token type etc.) that 
should not have to be in the target language at all (and sometimes 
already aren't).
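
To illustrate what "handled by the tool itself" could look like, here is a
toy lexer in Python where a rule is merely declared to be a line
terminator and the runtime does the counting. The Rule class, the
is_newline flag and tokenize are all invented for this sketch; nothing
here is existing ANTLR syntax or API:

```python
import re
from dataclasses import dataclass

@dataclass
class Rule:
    name: str
    pattern: str
    is_newline: bool = False  # hypothetical flag replacing a { newline(); } action

def tokenize(text, rules):
    """Yield (rule name, lexeme, line). The runtime, not an action, counts lines."""
    compiled = [(rule, re.compile(rule.pattern)) for rule in rules]
    pos, line = 0, 1
    while pos < len(text):
        for rule, regex in compiled:
            match = regex.match(text, pos)
            if match and match.end() > pos:
                yield rule.name, match.group(), line
                if rule.is_newline:
                    line += 1  # bookkeeping done once, for every target language
                pos = match.end()
                break
        else:
            raise ValueError(f"no rule matches at offset {pos}")

rules = [Rule("NEW_LINE", r"\n", is_newline=True), Rule("WORD", r"[^\n]+")]
tokens = list(tokenize("a\nb", rules))
print(tokens)
```

The grammar author writes no target-language code at all for this case,
so the same rule definition works unchanged for every target.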

>>You can simply add all 
>>information as parameters that is required by the action. The net result 
>>is that the grammars themselves are target-language-independent 
> 
> 
> And in what percentage of uses do we need that?

Erm, often? That's the point of supporting multiple target languages, 
isn't it? Currently every grammar has to be adapted to each target 
language, which results in uneven quality: grammar patches for one 
target language have to be ported to the other versions, even though 
most of the time they have nothing to do with the embedded actions.

>>(remember the newbie asking about a Java version for the SQL grammar ?) 
> 
> 
> That newbie still needs actions for the grammar. And it's easier to look at 
> the code where (mostly simple one liners) actions are placed together with 
> production rules, not in separate files. Like this is one of the advantages 
> of most contemporary languages over C & C++ -- no stupid separate headers 
> to declare stuff which will be unrolled in another file.

Having to provide a separate file for the action implementation is about 
the only disadvantage of this approach, but I think it's providing

>>and you have the action in a concise form, perhaps in a single class. Or 
>>Antlr generates abstract method declarations (for C++/C#/Java) in the 
>>parser that the developer has to override in a concrete parser subclass 
>>(template and hook).
> 
> 
> Sorry, but one of the prime advantages of recursive descent parsing is that 
> actions can be precisely placed, and together with generated code combined 
> into whole methods.

And? How the weaving is implemented is another matter - it could be 
done via "macro expansion", or via method calls or whatnot.
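
As a rough illustration of the method-call variant (the "template and
hook" idea quoted above), here is a sketch in Python. GeneratedParser
stands in for what a generator might emit and MyParser for the
hand-written part; both class names and the token representation are
assumptions made for this example, not actual ANTLR output:

```python
from abc import ABC, abstractmethod

class GeneratedParser(ABC):
    """Stand-in for generator output: grammar logic only, no user actions."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def lines(self):
        # Rule sketch: lines : ( NEW_LINE | OTHER )* ; with a labelled action
        # attached to NEW_LINE.
        while self.pos < len(self.tokens):
            if self.tokens[self.pos] == "\n":
                self.process_new_line()  # hook call woven in at the label
            self.pos += 1

    @abstractmethod
    def process_new_line(self):
        """Hook for the action labelled process_new_line in the grammar."""

class MyParser(GeneratedParser):
    """Hand-written subclass holding all target-language action code."""

    def __init__(self, tokens):
        super().__init__(tokens)
        self.my_line_count = 0

    def process_new_line(self):
        self.my_line_count += 1

parser = MyParser(["a", "\n", "b", "\n"])
parser.lines()
print(parser.my_line_count)  # prints 2
```

The action still runs at exactly the point where the label sits in the
rule; only its body lives elsewhere.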

>>There are a few situations where this might seem like overkill
> 
> 
> This is overkill in the vast majority of situations, not just a few!
> 
> 
> 
>>(e.g. 
>>maintaining and using a counter for a ()* subrule that shall be 
>>traversed like 20 times), but there might be better ways to handle these 
>>situations (e.g. defining syntax for it - say, ()[20] - and letting 
>>Antlr work out the details).
> 
> 
> And have yet another incomprehensible language loaded with [*&%(|\`@#@$ and 
> stuff, i.e. "executable line noise". No, thanks...

Actually, this is used in quite a few EBNF grammars (and regexp 
stuff). And btw, in comparison to the tree generation stuff, this is 
quite easily understandable even to newbies.
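
To show what "letting the tool work out the details" amounts to for a
()[20]-style count, here is a small Python sketch of a repetition
combinator. The names repeat and digit and the matcher convention
(return the new position, or None on failure) are assumptions for this
example only:

```python
def repeat(subrule, count):
    """Return a matcher that applies `subrule` exactly `count` times in a row."""
    def matcher(text, pos):
        for _ in range(count):
            pos = subrule(text, pos)
            if pos is None:  # subrule failed before the count was reached
                return None
        return pos
    return matcher

def digit(text, pos):
    """Match one decimal digit; return the next position or None."""
    return pos + 1 if pos < len(text) and text[pos].isdigit() else None

four_digits = repeat(digit, 4)      # roughly what ( DIGIT )[4] would expand to
print(four_digits("2004-08", 0))    # prints 4
print(four_digits("200", 0))        # prints None
```

The counter exists only inside the generated matcher, so the grammar
needs neither a user variable nor a target-language action for it.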

IMO, it would be best to support both strategies?! And the grammars 
that come with ANTLR, which contain few actions anyway, should be 
language independent.

Tom


 