[antlr-interest] Re: feature request: Token.getOffset()

cj_daly cj_daly at yahoo.com
Sun Dec 7 02:32:13 PST 2003


I see how this could work by overriding tab() and newline(), but I 
think tracking the absolute offset would be generally useful, so why 
not put it into the codebase?  I doubt that adding an int and 
incrementing it in a couple of places (wherever column is currently 
changed in CharScanner) would affect performance or 
maintainability.
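
To make the comparison concrete, here is a rough, self-contained sketch 
of both ideas side by side.  It is a toy scanner, not ANTLR's 
CharScanner: the class OffsetTrackingScanner and all of its fields and 
methods are names I made up for illustration.  I'm assuming 1-based 
columns, tabs that jump to the next tab stop, and a newline() that is 
called right after the '\n' has been consumed (in the toy, consume() 
calls it directly).  Because columns are 1-based here, the workaround 
formula comes out as (line start offset + column - 1 - correction).

```java
// Toy stand-in for a lexer's position bookkeeping. This is NOT ANTLR's
// CharScanner; every name here is hypothetical and for illustration only.
class OffsetTrackingScanner {
    private final String input;
    private final int tabSize;

    private int pos = 0;              // index of the next char to read
    private int line = 1;             // 1-based line number
    private int column = 1;           // 1-based column, tabs expanded

    // Proposal: one extra int, bumped once per consumed char.
    private int offset = 0;

    // Workaround: state kept by the tab()/newline() overrides.
    private int lineStartOffset = 0;  // absolute offset where this line began
    private int correction = 0;       // extra columns added by tabs on this line

    OffsetTrackingScanner(String input, int tabSize) {
        this.input = input;
        this.tabSize = tabSize;
    }

    boolean hasNext() { return pos < input.length(); }

    void consume() {
        char c = input.charAt(pos++);
        offset++;                     // the only line the proposal adds here
        if (c == '\t') { tab(); } else { column++; }
        if (c == '\n') { newline(); }
    }

    void tab() {
        // Jump to the next tab stop (assumed tab behaviour).
        int next = ((column - 1) / tabSize + 1) * tabSize + 1;
        // A tab is one char but advanced (next - column) columns.
        correction += (next - column) - 1;
        column = next;
    }

    void newline() {
        // Fold the finished line's char count into the line start offset.
        lineStartOffset += (column - 1) - correction;
        line++;
        column = 1;
        correction = 0;
    }

    int getLine() { return line; }
    int getColumn() { return column; }
    int getOffset() { return offset; }

    // Workaround: derive the offset from line/column bookkeeping alone.
    int computedOffset() { return lineStartOffset + (column - 1) - correction; }

    // True if the derived offset matches the counter after every char.
    static boolean offsetsAgree(String text, int tabSize) {
        OffsetTrackingScanner s = new OffsetTrackingScanner(text, tabSize);
        while (s.hasNext()) {
            s.consume();
            if (s.getOffset() != s.computedOffset()) { return false; }
        }
        return true;
    }

    public static void main(String[] args) {
        OffsetTrackingScanner s = new OffsetTrackingScanner("a\tb\nc", 8);
        while (s.hasNext()) {
            s.consume();
            System.out.println("line=" + s.getLine() + " col=" + s.getColumn()
                + " offset=" + s.getOffset() + " computed=" + s.computedOffset());
        }
    }
}
```

In this toy both routes give the same number, but the workaround needs 
two extra fields and two overrides where the direct counter needs one 
int and one increment.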

Here's another angle: isn't offset a more fundamental measure than 
line/column to begin with?  I mean your input source could be bits, 
bytes, chars, nodes or whatever, and line/column may not have any
meaning in some of those cases, whereas offset is always your way of
tracking a token back to its place in the input source.

just my 2 bits...

--- In antlr-interest at yahoogroups.com, "lgcraymer" <lgc at m...> wrote:
> How about: override tab() to keep a correction value for column 
> information, and override newline() to track offset for the start of 
> the current line.  Then you can compute the character offset 
> yourself: (line start offset + column - correction) should work 
> using the token's column information since the correction only 
> changes at tabs.
> 
> Adding more state to the lexer is something that is better avoided.
> 
> --Loring
> 
> 
> 
> --- In antlr-interest at yahoogroups.com, "cj_daly" <cj_daly at y...> wrote:
> > Hi Antlr Maintainers,
> > 
> > For my purposes currently it's more important to have the absolute
> > offset into the input file for each token than to have the
> > line/column.  To get what I want I've been calling
> > 
> > lexer.setColumn(0);
> > lexer.setTabSize(1);
> > 
> > before the parse and then calls to getColumn() return the offset I
> > need.  But this means I never call newline() because that would
> > reset the column counter and thus I can't have line/column info if
> > I want it.
> > 
> > I think that it would be nice and easy to have it both ways.  We would
> > just need to add getOffset() and setOffset() to Token and then have
> > LexerSharedInputState manage an offset counter independently of the
> > line/column counters.
> > 
> > Does that make sense?  Am I totally missing something here (i.e. is
> > the offset info I need already available somewhere)?
> > 
> > 
> > Chris


 
