[antlr-interest] ANTLR 3.0.1: invalid character column in a mismatch character error message.

Francis ANDRE francis.andre.kampbell at orange.fr
Thu Aug 14 09:46:45 PDT 2008

Kay Röpke wrote:
> In fact, I strongly believe tabs to be supremely evil and they should 
> be first up against the wall when the revolution comes ;)
Sorry Kay, but I strongly vote in favour of tabs in source code, with 
the editor or other text processor handling them (which ANTLR is not, 
you're right).
Tabs let you adjust the layout of the source code to your own habits 
and usage, so that it is easier to read... (hmm, a heated debate about 
style/indentation is in prospect).
At least, if one person is used to tabs at 4 spaces, another at 2 
spaces and a third at 6, all of them will be happy when looking at 
their code in their favourite text editor, while the stored text 
will be *exactly* the same (even on the mainframe, the classic ISPF 
editor handles tabs properly!). So be happy with tabs ;)



> Seriously though, ANTLR correctly reports the _character_ position 
> (disregarding the 0 vs 1 debate for the moment), because a \t is _one_ 
> character. When you are dealing with text in any UI library I've seen, 
> tabs are represented as one character in the underlying text storage, 
> so that you do not have to work out what effect tabs have on the 
> screen. It's up to other layers to figure out the 
> actual layout. We should do likewise.
> I can already see the next guy writing a syntax highlighter coming along 
> and complaining about ANTLR expanding tabs to spaces so that for input 
> like "\tID" we report the start index of token ID as being 8 (or 9 if 
> someone insists on charPosInLine to be 1-based), assuming that 
> "standard tab width" is 8. If written in sloppy C that could easily 
> crash his application, and in any other language it would at least 
> cause an exception of some sort.
> That's the fundamental reason I'm so strongly against handling tabs in 
> any special way.
> The grammar author is of course free to generate special whitespace 
> tokens for different kinds of whitespace in case he needs to somehow 
> disambiguate them later on.
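
To make the point above concrete, here is a minimal sketch of how a display layer (an editor or an error reporter, not ANTLR itself) might turn the character position reported by the lexer into an on-screen column. It is written in Python purely for illustration; `visual_column` and `tab_width` are hypothetical names, not part of any ANTLR runtime, and positions are assumed to be 0-based.

```python
def visual_column(line: str, char_pos: int, tab_width: int = 8) -> int:
    """Return the 0-based display column of the character at index char_pos.

    char_pos counts characters (a tab is one character), which is what the
    quoted message says the lexer reports. tab_width is a display setting
    chosen by the viewer, not a property of the text.
    """
    col = 0
    for ch in line[:char_pos]:
        if ch == "\t":
            # A tab advances to the next multiple of tab_width.
            col += tab_width - (col % tab_width)
        else:
            col += 1
    return col


line = "\tID"
char_pos = 1  # character index of 'I': the tab before it counts as one character

print(visual_column(line, char_pos))      # display column with tab stops every 8
print(visual_column(line, char_pos, 4))   # same text, tab stops every 4
```

The character index stays at 1 whatever the tab setting, so code that indexes into the text (a syntax highlighter, for instance) keeps working, while each viewer computes its own column from its own tab width.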
