[antlr-interest] Antlr 3 and the newline token problem

Micheal J open.zone at virgin.net
Fri Nov 25 12:58:20 PST 2005


Please don't post HTML mail to the list. 
 
Terence and I were recently having a discussion about this.
It's about how to handle newlines in ANTLR 3.
  
As you probably know, ANTLR 2 currently can't handle all 3 types of
newlines, i.e. if we have a rule like this:

WS : '\r' '\n' {newline();}
   | '\r'      {newline();}
   | '\n'      {newline();}
   ;

we would get a nondeterminism warning.

The reason this problem arises is solely because currently we have chosen to
store 'lines & columns' in tokens instead of offsets.
 
No. The reason is that we have to count newlines in action code. It can
perhaps be done behind the scenes.
 
I mean, think about it this way: if we didn't have to put in that newline()
call, we could easily write this rule as:

WS : '\r' | '\n' ;

This would handle all 3 types of newlines. 
 
But we would lose the accurate line count ability.

So I propose that in ANTLR 3 you identify the position of tokens by
offset instead of by line/column.

This has the following advantages:
 
<SNIP>
 
On the other hand, Terence suggests that the call to newline() can be put
inside the CharBuffer class, where it is handled automatically, so people
who need to track line numbers can do so easily.
This would be nice, but then again it increases the complexity if we decide
to keep both offsets and rows/columns.
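
To make the "handled automatically in the buffer" idea concrete, here is a
rough sketch in Java. It is not ANTLR's CharBuffer: the class name
LineTrackingBuffer and all of its methods are made up for illustration. It
only shows that a buffer can count "\r\n", "\r" and "\n" each as one line
break while characters are consumed, so that no newline() action is needed
in the grammar.

```java
// Hypothetical sketch; none of these names come from ANTLR.
final class LineTrackingBuffer {
    private final String text;
    private int offset = 0;  // absolute character offset
    private int line = 1;    // 1-based line number
    private int column = 0;  // 0-based column within the current line

    LineTrackingBuffer(String text) {
        this.text = text;
    }

    boolean atEnd() {
        return offset >= text.length();
    }

    /** Consumes one character and updates line/column as a side effect. */
    char consume() {
        char c = text.charAt(offset++);
        // A '\r' directly followed by '\n' is the first half of "\r\n";
        // the line is only counted when the '\n' itself is consumed.
        boolean lfFollows = offset < text.length() && text.charAt(offset) == '\n';
        if (c == '\n' || (c == '\r' && !lfFollows)) {
            line++;
            column = 0;
        } else {
            column++;
        }
        return c;
    }

    int offset() { return offset; }
    int line()   { return line; }
    int column() { return column; }
}
```

With something along these lines, the lexer rule could shrink to the
action-free form, because the counting happens underneath it.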
 
Not really. Can't the newline calculation be abstracted? If a newline count
isn't required, no code should be run that calculates it.
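
One way this abstraction might look, sketched in Java under the assumption
that tokens store only a character offset: line and column are derived on
demand from a table of line-start offsets, and that table is built the
first time somebody asks for a line number. LineIndex and its methods are
hypothetical names, not part of any ANTLR release.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; none of these names come from ANTLR.
final class LineIndex {
    private final String text;
    private List<Integer> lineStarts;  // stays null until first requested

    LineIndex(String text) {
        this.text = text;
    }

    /** Scans the text once, treating "\r\n", "\r" and "\n" as line breaks. */
    private List<Integer> starts() {
        if (lineStarts == null) {
            List<Integer> s = new ArrayList<>();
            s.add(0);
            for (int i = 0; i < text.length(); i++) {
                char c = text.charAt(i);
                if (c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++;  // step over the '\n' of a "\r\n" pair
                }
                if (c == '\r' || c == '\n') {
                    s.add(i + 1);  // the next line starts after the break
                }
            }
            lineStarts = s;
        }
        return lineStarts;
    }

    /** 1-based line number of the character at the given offset. */
    int lineOf(int offset) {
        int pos = Collections.binarySearch(starts(), offset);
        int idx = pos >= 0 ? pos : -pos - 2;  // last line start <= offset
        return idx + 1;
    }

    /** 0-based column of the character at the given offset. */
    int columnOf(int offset) {
        return offset - starts().get(lineOf(offset) - 1);
    }

    boolean isBuilt() {
        return lineStarts != null;
    }
}
```

If nobody ever calls lineOf() or columnOf(), the scan never runs, so users
who only care about offsets pay nothing for line counting.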
 
Which approach do you think would be best? 
 
An approach where I can decide if I want one or the other. Perhaps by
setting a grammar property.

If you guys would like, we can put up a poll for this on the ANTLR Studio
forum.

That's what we have a list for... ;-)
 
Cheers,
 
Micheal