[antlr-interest] Recovering white space in V3.0

Matthew Ford matthew.ford at forward.com.au
Sat Jun 4 17:25:46 PDT 2005


Hi Ter,

> Hmm...well, having never actually tried it, you might simply walk
> backwards from w.getTokenIndex() instead of remembering where you
> were last time.  The edge case might need something like the token

I don't think walking backwards from the end will do it.
i) the WS tokens come out in the wrong order
ii) I need to display WORD ws ws ws WORD,
not WORD WORD ws ws ws

> starts with a MINUS, so you could just avoid going back past that.
How do you avoid going back past MINUS?
Do you use something like
for ( int i = ...  ;  i > $MINUS.getTokenIndex(); i--)
i.e. do I understand correctly that $MINUS is the MINUS token (assuming MINUS is
unique in the rule; otherwise is it just the last MINUS seen/parsed)?
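
To make the ordering concern concrete, here is a minimal sketch in plain Java rather than against the real ANTLR classes. It assumes the token stream can be modelled as a list of (text, channel) pairs. The names Tok, hiddenBefore, renderWords and sample are hypothetical and exist only for this illustration. The backward walk is used only to find where the run of hidden tokens begins; the text is then appended walking forwards, so the WS comes out in source order and the walk never goes back past the MINUS index.

```java
import java.util.ArrayList;
import java.util.List;

final class HiddenTokenWalk {
    static final int DEFAULT_CHANNEL = 0;
    static final int HIDDEN_CHANNEL = 99; // the channel WS is sent to in this thread

    /** Hypothetical stand-in for a token: just its text and its channel. */
    static final class Tok {
        final String text;
        final int channel;
        Tok(String text, int channel) { this.text = text; this.channel = channel; }
    }

    /**
     * Text of the off-channel tokens directly before tokens[index], in source order.
     * The scan never reaches floorIndex (for example, the index of MINUS).
     */
    static String hiddenBefore(List<Tok> tokens, int index, int floorIndex) {
        int start = index;
        // Walk backwards only to locate the first hidden token of the run.
        while (start - 1 > floorIndex
                && tokens.get(start - 1).channel != DEFAULT_CHANNEL) {
            start--;
        }
        // Emit forwards so the whitespace keeps its original order.
        StringBuilder sb = new StringBuilder();
        for (int i = start; i < index; i++) {
            sb.append(tokens.get(i).text);
        }
        return sb.toString();
    }

    /** Renders WORD ws ws WORD ..., skipping the WS between MINUS and the first WORD. */
    static String renderWords(List<Tok> tokens, int minusIndex) {
        StringBuilder out = new StringBuilder();
        boolean wordsStarted = false;
        for (int i = minusIndex + 1; i < tokens.size(); i++) {
            Tok t = tokens.get(i);
            if (t.channel != DEFAULT_CHANNEL) {
                continue; // hidden tokens are emitted together with the next WORD
            }
            if (wordsStarted) {
                out.append(hiddenBefore(tokens, i, minusIndex));
            }
            out.append(t.text);
            wordsStarted = true;
        }
        return out.toString();
    }

    /** Sample stream: MINUS ws WORD ws ws WORD. */
    static List<Tok> sample() {
        List<Tok> tokens = new ArrayList<>();
        tokens.add(new Tok("-", DEFAULT_CHANNEL));
        tokens.add(new Tok(" ", HIDDEN_CHANNEL));
        tokens.add(new Tok("foo", DEFAULT_CHANNEL));
        tokens.add(new Tok(" ", HIDDEN_CHANNEL));
        tokens.add(new Tok("\t", HIDDEN_CHANNEL));
        tokens.add(new Tok("bar", DEFAULT_CHANNEL));
        return tokens;
    }

    public static void main(String[] args) {
        // MINUS sits at index 0 in the sample stream.
        System.out.println(renderWords(sample(), 0));
    }
}
```

In a real action the floor would presumably come from $MINUS.getTokenIndex() and the tokens from input.get(i), as in the code quoted below, but I have not tried that against V3.0.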

matthew

----- Original Message ----- 
From: "Terence Parr" <parrt at cs.usfca.edu>
To: "ANTLR Interest" <antlr-interest at antlr.org>
Sent: Sunday, June 05, 2005 7:05 AM
Subject: Re: [antlr-interest] Recovering white space in V3.0


>
> On Jun 4, 2005, at 1:59 PM, Matthew Ford wrote:
>
> > This is what I have so far.
> > WS is ignored  => channel 99
> > but between WORDs I want to get it back
> > So I have used
> >     (
> >     w=WORD
> >       { if (wordsStarted) {
> >         // output all ignored tokens between lastIndex and this index
> >          for (int i=lastIndex+1; i<w.getTokenIndex(); i++) {
> >           System.out.print(input.get(i).getText());
> >          }
> >         } else {
> >           wordsStarted = true;
> >         }
> >         System.out.print(w.getText());
> >         lastIndex = w.getTokenIndex();
> >       }
> >   )*
> >
> >
> > Is there a better way?
>
> Hmm...well, having never actually tried it, you might simply walk
> backwards from w.getTokenIndex() instead of remembering where you
> were last time.  The edge case might need something like the token
> index when you start the rule so you don't go too far back, over WS
> not associated with the rule.  Actually looks like your list rule
> starts with a MINUS, so you could just avoid going back past that.
>
> Another way to handle this is to use the start/stop attributes of any
> rule reference to track the boundaries of a rule and then just print
> anything between it.  For example,
>
> ( list {print between $list.start and $list.stop;} )+
>
> :)
>
> Also note that $WORD will work as the attribute reference if it's
> unique in the alternative.
>
> Terence
>
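
For reference, the start/stop suggestion above might look roughly like this once the two boundary token indexes are known. This is only a sketch under assumptions: the token stream is modelled as a plain list of token texts that includes the hidden WS tokens, and RuleSpanText and textBetween are hypothetical names, not part of ANTLR.

```java
import java.util.Arrays;
import java.util.List;

final class RuleSpanText {
    /**
     * Concatenates the text of every token from startIndex to stopIndex inclusive,
     * on-channel and hidden alike, so whitespace inside the span is preserved.
     */
    static String textBetween(List<String> tokenTexts, int startIndex, int stopIndex) {
        StringBuilder sb = new StringBuilder();
        for (int i = startIndex; i <= stopIndex; i++) {
            sb.append(tokenTexts.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical stream: MINUS ws WORD ws WORD ws
        List<String> texts = Arrays.asList("-", " ", "foo", "  ", "bar", "\n");
        // Indexes 2 and 4 play the role of the rule's start and stop tokens.
        System.out.println(textBetween(texts, 2, 4));
    }
}
```

Because the span runs from the first to the last token matched by the rule, WS before the first WORD or after the last one is left out without any extra bookkeeping.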


