[antlr-interest] Recovering white space in V3.0

Terence Parr parrt at cs.usfca.edu
Sat Jun 4 14:05:16 PDT 2005


On Jun 4, 2005, at 1:59 PM, Matthew Ford wrote:

> This is what I have so far.
> WS is ignored  => channel 99
> but between WORDs I want to get it back
> So I have used
>     (
>       w=WORD
>       {
>         if (wordsStarted) {
>           // output all ignored tokens between lastIndex and this index
>           for (int i=lastIndex+1; i<w.getTokenIndex(); i++) {
>             System.out.print(input.get(i).getText());
>           }
>         } else {
>           wordsStarted = true;
>         }
>         System.out.print(w.getText());
>         lastIndex = w.getTokenIndex();
>       }
>     )*
>
>
> Is there a better way?

Hmm...well, having never actually tried it, you might simply walk
backwards from w.getTokenIndex() over the ignored tokens instead of
remembering where you were last time. For the edge case you might need
something like the token index at the start of the rule, so you don't go
too far back over WS not associated with the rule. Actually, it looks like
your list rule starts with a MINUS, so you could just avoid going back
past that.
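
To make the walk-backwards idea concrete, here is a minimal standalone
sketch in plain Java. It does not use the ANTLR runtime at all: Tok,
hiddenBefore, rebuild and sample are hypothetical names invented for the
illustration, and the token list is built by hand. Channel 99 is the one
from the quoted message; the "floor" parameter plays the role of the
index you must not walk back past.

```java
import java.util.Arrays;
import java.util.List;

class WalkBack {
    // Hypothetical stand-in for a lexer token: text plus channel number only.
    static final class Tok {
        final String text;
        final int channel;
        Tok(String text, int channel) { this.text = text; this.channel = channel; }
    }

    static final int DEFAULT_CHANNEL = 0;
    static final int HIDDEN_CHANNEL = 99; // where WS goes in the quoted grammar

    // Text of the off-channel tokens sitting directly before tokens.get(index).
    // Never returns text from position floor or earlier.
    static String hiddenBefore(List<Tok> tokens, int index, int floor) {
        int i = index - 1;
        // Walk backwards while we are still above the floor and on hidden tokens.
        while (i > floor && tokens.get(i).channel != DEFAULT_CHANNEL) {
            i--;
        }
        // i now rests on the floor or on the nearest earlier on-channel token
        // (or is already below the floor, in which case nothing is collected).
        StringBuilder sb = new StringBuilder();
        for (int j = i + 1; j < index; j++) {
            sb.append(tokens.get(j).text);
        }
        return sb.toString();
    }

    // Re-emits the on-channel tokens from firstWord onward, each preceded by
    // the hidden text recovered by walking back from it.
    static String rebuild(List<Tok> tokens, int firstWord) {
        StringBuilder out = new StringBuilder();
        for (int i = firstWord; i < tokens.size(); i++) {
            Tok t = tokens.get(i);
            if (t.channel == DEFAULT_CHANNEL) {
                out.append(hiddenBefore(tokens, i, firstWord)).append(t.text);
            }
        }
        return out.toString();
    }

    // Hand-built token list: MINUS, WS, WORD, WS, WS, WORD, WS.
    static List<Tok> sample() {
        return Arrays.asList(
            new Tok("-", DEFAULT_CHANNEL),
            new Tok(" ", HIDDEN_CHANNEL),
            new Tok("foo", DEFAULT_CHANNEL),
            new Tok("  ", HIDDEN_CHANNEL),
            new Tok("\t", HIDDEN_CHANNEL),
            new Tok("bar", DEFAULT_CHANNEL),
            new Tok("\n", HIDDEN_CHANNEL));
    }

    public static void main(String[] args) {
        // Use the first WORD (index 2) as the floor, so the WS between
        // MINUS and the first WORD is not printed.
        System.out.println("[" + rebuild(sample(), 2) + "]");
    }
}
```

Passing the index of the first WORD as the floor reproduces the
wordsStarted behaviour from the quoted action; passing the index of the
MINUS instead would also emit the WS between MINUS and the first WORD.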

Another way to handle this is to use the start/stop attributes of any
rule reference to track the boundaries of a rule and then just print
everything between them.  For example,

( list {print between $list.start and $list.stop;} )+

:)
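
As a rough illustration of "print between start and stop", here is a
standalone sketch in plain Java. It assumes the two boundaries are
available as token indexes into the full token buffer (hidden tokens
included); whether $list.start and $list.stop hand you indexes or
something else is not shown here, so they are plain ints in the sketch.
Tok, textBetween and sample are hypothetical names and no ANTLR runtime
classes are used.

```java
import java.util.Arrays;
import java.util.List;

class SpanText {
    // Hypothetical stand-in for a lexer token: text plus channel number only.
    static final class Tok {
        final String text;
        final int channel;
        Tok(String text, int channel) { this.text = text; this.channel = channel; }
    }

    // Concatenates the text of every token from start to stop inclusive,
    // regardless of channel, so ignored WS inside the span comes back too.
    static String textBetween(List<Tok> tokens, int start, int stop) {
        StringBuilder sb = new StringBuilder();
        for (int i = start; i <= stop; i++) {
            sb.append(tokens.get(i).text);
        }
        return sb.toString();
    }

    // Hand-built token list: MINUS, WS, WORD, WS, WORD, WS.
    static List<Tok> sample() {
        return Arrays.asList(
            new Tok("-", 0),
            new Tok(" ", 99),   // 99 is the WS channel from the quoted message
            new Tok("a", 0),
            new Tok(" ", 99),
            new Tok("b", 0),
            new Tok("\n", 99));
    }

    public static void main(String[] args) {
        // Pretend the rule matched tokens 0..4 (MINUS through the last WORD).
        System.out.println("[" + textBetween(sample(), 0, 4) + "]");
    }
}
```

Because the span ends at the rule's last on-channel token, WS trailing
the rule (the final newline in the sample) is left out.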

Also note that $WORD will work as the attribute reference if WORD is
unique in the alternative, so you don't need the w= label there.

Terence

