[antlr-interest] Re: ANTLR 3 codegen (was: Enhance ANTLR support for comments?)
micheal_jor
open.zone at virgin.net
Sat Jul 19 07:19:10 PDT 2003
> Terence Parr <parrt at c...> wrote:
>
> On Friday, July 18, 2003, at 01:27 PM, micheal_jor wrote:
> > One complicating issue is that depending on the constraints of
> > execution environment, I might want a compact or even more compact
> > representation of these. For instance, I might want to shoehorn both
> > the line and column numbers into a single 32-bit integer (16:16 or
> > 24:8 split) or leave them as two separate integers.
> >
> > Not sure how ANTLR 3 can support such scenarios easily.
>
> One of the thoughts we had in the cabal was that you would specify the
> token attributes in an ANTLR formalism and the code generator would be
> able to decide how to encode in the target language. For example, you
> might do
>
> token {
> // text and type predefined perhaps
> int start;
> int stop;
> String filename;
> }
In my mental model, there are perhaps four issues involved here:
1) Should the attributes listed above be supported as standard for
*all* ANTLR tokens and AST nodes?
==> "Yes" would be my answer on this. I can't think of a project
where I haven't needed this. "filename" might become
"resourceLocation" or similar if support for sources other than
files (e.g. zip archives, URLs, etc.) is added.
2) How can ANTLR [grammars] be extended to support declarative custom
AST-node-attributes specification in a language-neutral manner?
==> Have to think about this a bit. Support for both homogeneous and
heterogeneous trees makes this a little tricky.
3) Should ANTLR support custom token-attributes and how?
==> I haven't needed to do this except to add filename/line/col info.
What do people think?
4) How can we ensure that implementation decisions like "should I
store line/col info in two 32-bit ints or a single int?" are properly
left to ANTLR codegens?
==> This really needs heads banging together to thrash stuff out.
Would we end up with one set of interfaces for each language family
(OO, imperative, FP, etc.)?
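
To make (4) a bit more concrete, here is a minimal Java sketch of what
I have in mind. Every name in it (SourcePosition, WidePosition,
PackedPosition) is hypothetical and nothing here is existing or planned
ANTLR API. The point is only that generated code would talk to a small
interface, and the codegen would pick either two separate ints or a
single 32-bit int with a 24:8 line:column split.

```java
// Hypothetical names throughout; an illustration, not ANTLR API.
interface SourcePosition {
    int line();
    int column();
}

/** Two separate ints: the simplest form, full int range for both fields. */
final class WidePosition implements SourcePosition {
    private final int line;
    private final int column;

    WidePosition(int line, int column) {
        this.line = line;
        this.column = column;
    }

    public int line() { return line; }
    public int column() { return column; }
}

/** One 32-bit int: upper 24 bits hold the line, lower 8 bits the column. */
final class PackedPosition implements SourcePosition {
    static final int COLUMN_BITS = 8;
    static final int MAX_COLUMN = (1 << COLUMN_BITS) - 1;        // 255
    static final int MAX_LINE = (1 << (32 - COLUMN_BITS)) - 1;   // 16777215

    private final int packed;

    PackedPosition(int line, int column) {
        // Reject values that would not survive the 24:8 split.
        if (line < 0 || line > MAX_LINE) {
            throw new IllegalArgumentException("line out of range: " + line);
        }
        if (column < 0 || column > MAX_COLUMN) {
            throw new IllegalArgumentException("column out of range: " + column);
        }
        this.packed = (line << COLUMN_BITS) | column;
    }

    // Unsigned shift so that lines using the top bit still decode correctly.
    public int line() { return packed >>> COLUMN_BITS; }
    public int column() { return packed & MAX_COLUMN; }
}
```

The packed form halves the storage per position but caps columns at 255
and lines at 16777215, which is exactly the kind of trade-off I would
like the codegen, not the grammar author, to decide.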
> At NeXT I had to encode token type and line number into a 32 bit int,
> but in other cases it had to be an object. The code generator could
> generate either depending on options and what attributes you had in
> there.
Cool. Seems you already have a jump on (4) above.
> Just some thoughts we had. We're thinking about language independence
> pretty heavily since I expect to make building a code generator for
> ANTLR pretty easy.
Hopefully not so easy that the codegens aren't able to make often
drastic implementation decisions as above. Actually, we could have a
two-tier system:
TIER-1: The set of codegen interfaces that result from (4) above
would support the development of fully integrated ANTLR codegens that
require more work to build but in return produce the
fastest/smallest/tightest[/prettiest?] code.
TIER-2: The intermediate form that you describe below on the other
hand could allow anyone to build a very decent codegen in record time.
> Lots of back ends will appear I hope. I'm going to
> go so far as to have a text-based intermediate form (if wanted) so that
> you don't even have to build the code generator in ANTLR. You could
> build the python code generator in python for example as it's just
> reading a text file with all the "hard parts" filled in :)
>
> Terence
Cheers,
Micheal