[antlr-interest] [v3] not including text in token. Still possible?

Kay Roepke kroepke at dolphin-services.de
Sun Feb 5 20:26:18 PST 2006


On 6. Feb 2006, at 5:05 Uhr, Terence Parr wrote:

> Hi!  I haven't figured out how to make it do that for v3 yet.  I don't  
> create strings for a token (just indexes into the char buffer) so  
> it's hard to do modifications.  I definitely need it though.

Couldn't you fiddle with the indices into the buffer? Aah, no, you
can't. Blimey. This really is a problem of non-contiguous ranges, is
it not?
If the tokens/text to ignore is always at the start or end of the
current token it's easy, but once you get one of those in the middle
you are SOL. So I guess it's either storing ranges or coming up with
some really clever idea ;)
Of course, you could argue that the storing of indices is really an
implementation detail of the CommonToken class, and that it could copy
the token text if need be without having to tell anyone (i.e. the
user). One possibility would be to copy the text only if you use the
not-include-this-tokens-text feature.
OTOH, using ranges shouldn't be that much of a problem either. It just
makes the actual returning of the text a bit more complicated. That's
probably still a lot cheaper than copying it.
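
To make the range idea concrete, here is a minimal sketch. It assumes a
hypothetical RangeToken class (my own name, not ANTLR's CommonToken, and
not how v3 actually does it) that keeps a list of inclusive [start, stop]
index pairs into the shared char buffer and only builds a string when the
text is requested. The sample input, an SQL-style quoted string with a
doubled quote, is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical token that references its text as a list of index ranges
 * into a shared character buffer instead of copying the characters.
 * Illustrative only; this is not part of ANTLR.
 */
final class RangeToken {
    private final char[] buffer;                      // shared input buffer
    private final List<int[]> ranges = new ArrayList<>(); // inclusive {start, stop}

    RangeToken(char[] buffer) {
        this.buffer = buffer;
    }

    /** Adds one inclusive range of the buffer to this token's text. */
    RangeToken addRange(int start, int stop) {
        if (start < 0 || stop >= buffer.length || start > stop) {
            throw new IllegalArgumentException("bad range " + start + ".." + stop);
        }
        ranges.add(new int[] { start, stop });
        return this;
    }

    /** Builds the text on demand by concatenating the stored ranges. */
    String getText() {
        StringBuilder sb = new StringBuilder();
        for (int[] r : ranges) {
            sb.append(buffer, r[0], r[1] - r[0] + 1);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Input: 'it''s'   (indices 0..6)
        // Skip the outer quotes (0 and 6) and the second inner quote (4).
        char[] input = "'it''s'".toCharArray();
        RangeToken token = new RangeToken(input)
                .addRange(1, 3)   // it'
                .addRange(5, 5);  // s
        System.out.println(token.getText());
    }
}
```

A token with a single range behaves exactly like the current start/stop
index pair, so only tokens that skip text in the middle would pay for the
extra ranges and the concatenation in getText().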

>> Also, is there a way to get back the behavior of EA7 when it comes  
>> to printing the tokens of a CommonTokenStream? It used
>> to show a lot of extra information about the tokens. A first  
>> glance at CommonTokenStream.java didn't reveal the secret to me :(
>
> I think it's toDebugString or something...

Aah, now I see it. The new version calls .getText() on each token; EA7
was calling toString() on them. OK, I'll just have to do it manually
then. No trouble.

Thanks,
Kay

