[antlr-interest] [v3] not including text in token. Still possible?

Kay Roepke kroepke at dolphin-services.de
Tue Feb 7 13:51:39 PST 2006


Moin moin people!

On 7 Feb 2006, at 1:44, Terence Parr wrote:
> Some time ago, I fell on my keyboard, yielding:
>> Oh, I was talking about the case where I don't have any bangs in  
>> the rule. But from entering the rule I cannot look forward onto  
>> all the atoms within
>> the rule, can I? (Or rather, the tree walker isn't doing that.)
>
> If there are no bangs lexically present in the rule, don't generate  
> the special code. :)

In my infinite naiveté I actually succeeded ;)

>> So in the general case without any bangs, I don't want to create  
>> any string. I want to rely on the indices into the input buffer.
>
> Yep.  Only do new stuff if you see a !.  Even if it's in a (..)?  
> block.

Alright, the first version is now working.
What I do is the following:
Upon entering a rule (in codegen.g) I call #rule.findFirstType(BANG)
to determine whether the rule has a bang anywhere inside it.
If it does (i.e. the result is non-null), I turn on a flag that is
set as an attribute on the lexerRule template.
Then I create a new, empty StringBuffer.
Whenever I see a bang attached to an atom, I grab the text from start to
getCharIndex()-1 and append it to the StringBuffer. After the
match I set start=getCharIndex(). Repeat.
At the end of the rule I emit a token with the new emit method, which
takes a String as its parameter instead of start,stop.
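
To make the bookkeeping concrete, here is a rough sketch of what the
generated code for such a rule might boil down to. It is not the actual
codegen output: the class BangTextSketch, the rule
STRING : '"'! (~'"')* '"'! ; and the method matchString() are made up
for illustration, and the final flush of text after the last bang is my
assumption about what has to happen before the emit. Only the
start/getCharIndex() bookkeeping and the StringBuffer come from the
description above.

```java
// Hypothetical stand-in for a generated lexer; not ANTLR's real API.
class BangTextSketch {
    private final String input; // the whole input buffer
    private int p = 0;          // index of the next character to match

    BangTextSketch(String input) { this.input = input; }

    int getCharIndex() { return p; }

    // Match a single expected character or fail.
    void match(char c) {
        if (p >= input.length() || input.charAt(p) != c) {
            throw new IllegalStateException("expected '" + c + "' at index " + p);
        }
        p++;
    }

    // Sketch of: STRING : '"'! (~'"')* '"'! ;
    // Returns the token text with the banged quotes left out.
    String matchString() {
        StringBuffer text = new StringBuffer();
        int start = getCharIndex();

        // '"'! : save the text from start to getCharIndex()-1, then match
        text.append(input, start, getCharIndex());
        match('"');
        start = getCharIndex(); // restart collection after the banged atom

        // (~'"')* : ordinary atoms, nothing special to do
        while (p < input.length() && input.charAt(p) != '"') {
            p++;
        }

        // '"'! : same dance as for the opening quote
        text.append(input, start, getCharIndex());
        match('"');
        start = getCharIndex();

        // Assumed final step: flush whatever followed the last bang,
        // then hand the String to the new emit(String) method.
        text.append(input, start, getCharIndex());
        return text.toString();
    }

    public static void main(String[] args) {
        // Input is "hello" including the quotes.
        System.out.println(new BangTextSketch("\"hello\"").matchString());
    }
}
```

A rule without any bang would skip all of this and keep relying on the
start,stop indices into the input buffer.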

Done. Works beautifully. Does it sound the least bit sensible, or did I
do something horrible? ;)

-k 

