[antlr-interest] Source positions for imaginary tokens

Wed Sep 12 13:11:49 PDT 2012

Mike,

I look forward to receiving your solution for this simple problem.

Jim

> -----Original Message-----
> From: Mike Lischke [mailto:mike at lischke-online.de]
> Sent: Wednesday, September 12, 2012 12:41 AM
> To: Jim Idle
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Source positions for imaginary tokens
>
>
> Jim,
>
> > It is not a bug with the C target, as I have explained on numerous
> > occasions. The other targets rely on method signatures to select the
> > correct re-write. This is not available in C.
>
> Sorry, I have never seen such an explanation in all the searches I have
> done on this list. You surely know the internals far better
> than I do, but what specifically is missing that you can't create a
> virtual token with info from another token? Making a construct like
> DUM[$lb] work doesn't sound very complicated.
>
> >
> > However, the information is erroneous anyway. Look at the generated
> > code and you will see that only root nodes are fixed up with
> > positional info.
> >
> > Finally, rewriting like that is very expensive. I don't recommend it
> > anyway.
>
> You are probably referring to the complete original example while I'm
> specifically after a simple way to change properties of a token
> (especially when it can be written target independently). A good
> example is the list of keywords, which must sometimes be interpreted as
> normal identifiers, so what would be simple is something like:
>
> keywords:
> 	(
> 	kw = KEYWORD1
> 	| kw = KEYWORD2
> 	...
> 	)
> 	-> ID[$kw]
> ;
>
> There's no separate info necessary I'd say, everything is there, but
> still, the C target produces incorrect code (using kw like a string
> IIRC).
>
> So what I do now (as I really need this) is:
>
> keywords:
> 	KEYWORD1
> 	| KEYWORD2
> 	...
> ;
> finally
> {
> 	retval.start->setType(retval.start, IDENTIFIER);
> }
>
> which is rather a hack IMO, but the simplest solution I could come up
> with. I'm all ears for better solutions, if there are any.
>
> Btw. when a feature really cannot be implemented in the C target,
> wouldn't it be better to write out some error message that the compiler
> complains about, so the grammar developer knows he cannot use this
> feature, instead of letting him believe all is fine? Otherwise he's
> condemned to debug the grammar until he finds out the produced code is
> wrong (which can take quite some time when working with big grammars
> where loading the parser into the editor can easily take 20-30 secs).
>
> Mike
> --
> www.soft-gems.net
>
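
For readers following the thread: the workaround quoted above retypes the rule's start token in place, so whatever line and column the lexer recorded on that token stay attached to it. Here is a minimal sketch of that idea in plain C. Everything in it is an assumption made for illustration: MockToken, mockSetType, makeToken, keywordToIdentifier and the numeric token type values are hypothetical stand-ins that only imitate the function-pointer calling style seen in the quoted `retval.start->setType(retval.start, IDENTIFIER)` line. It is not the real ANTLR 3 C runtime API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token type values; a real grammar would take these
 * from its generated header. */
enum { IDENTIFIER = 4, KEYWORD1 = 5, KEYWORD2 = 6 };

/* Mock token that imitates the "method as function pointer" style of
 * the quoted code. This is NOT the real runtime structure. */
typedef struct MockToken {
    int type;            /* token type, e.g. KEYWORD1 or IDENTIFIER */
    int line;            /* source line recorded when the token was created */
    int charPosition;    /* column within that line */
    const char *text;    /* matched text */
    void (*setType)(struct MockToken *self, int type);
} MockToken;

/* Changes only the type; position and text are left untouched. */
static void mockSetType(MockToken *self, int type)
{
    self->type = type;
}

/* Hypothetical helper that builds a token as a lexer might. */
static MockToken makeToken(int type, int line, int charPosition,
                           const char *text)
{
    MockToken t;
    t.type = type;
    t.line = line;
    t.charPosition = charPosition;
    t.text = text;
    t.setType = mockSetType;
    return t;
}

/* Mirrors what the quoted "finally" block does with retval.start:
 * the existing token is retyped, no new token is created. */
static void keywordToIdentifier(MockToken *start)
{
    assert(start != NULL);
    start->setType(start, IDENTIFIER);
}
```

Calling keywordToIdentifier on a token built with makeToken(KEYWORD1, 3, 7, "select") changes its type to IDENTIFIER and leaves line 3, column 7 and the text as they were. That is why no separate position fix-up is needed with this approach, in contrast to building a fresh imaginary token in a rewrite.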