[antlr-interest] Having trouble with line numbers in ML_COMMENTS
Alex Shneyderman
a.shneyderman at gmail.com
Thu Mar 22 01:41:17 PDT 2007
On 3/22/07, Gavin Lambert <antlr at mirality.co.nz> wrote:
> At 19:39 22/03/2007, Alex Shneyderman wrote:
> >the problem I am seeing is that whenever my source has one of
> >those the line numbering is one less than it should be. If I have
> >two ML_COMMENTS the numbering is off by 2 and so on. I have a dirty
> >fix for it like so:
> [...]
> >but I can not understand why the original version is not working
> >correctly.
> >And of course my dirty fix will fail miserably when someone codes
> >like so:
> >
> >1. /* ml comment on one line */ int i = 0;
> >
> >my int i will be on the second line.
> >
> >So, I wonder if anyone can explain it and suggest what to do?
> Are you sure all your other rules containing newline characters
> call newline() similarly? In particular, how does your main
> newline/whitespace rule look? Possibly you've forgotten a set of
> brackets or something so it's not doing what you think it's doing.
Well, the rules are not mine :-) As I said, the .g file is taken from
the antlr site and I have not done much tinkering with it, except for
including things in the AST that I need for my project and that were
otherwise omitted with the ! notation.
I think I understand the particular problem with ML_COMMENT now.
The problematic bit of the rule:
ML_COMMENT
    :   "/*" ~('*')
        (   options {
                generateAmbigWarnings=false;
            }
        :
            { LA(2)!='/' }? '*'
        |   '\r' '\n'       {newline();}
        |   '\r'            {newline();}
        |   '\n'            {newline();}
        |   ~('*'|'\n'|'\r')
        )*
        "*/"
        {$setType(Token.SKIP);}
    ;
is this match on line two:
: "/*" ~('*')
so if one has a comment like this:
1. /*\n
2. *\n
3. */\n
4. int i = 0;
where \n is a newline. The line number reported for int i = 0; is 3.
What happens here is that ~('*') matches the \n right after "/*", so
the \n is swallowed without a call to newline(). To test my theory I
added an extra space like so:
1. /* \n
2. *\n
3. */\n
4. int i = 0;
Note the extra space on the first line. Now the line number of int
i = 0; is 4, because ~('*') matches the space and the subsequent part
of the rule matches '\n' and calls newline().
Anyway, I just took a look at the grammar that is published on the web
site, http://www.antlr.org/grammar/1090713067533/java15.g, instead of
the one that comes with the src distribution, and it differs. In this
particular rule ~('*') is removed :-) and it works.
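
To make the explanation above concrete, here is a small Java sketch that
mimics how the two variants of the rule count lines. It is not ANTLR code
and not taken from either grammar: the class MlCommentLines, the method
lineAt and the legacyRule flag are names I made up for illustration, and I
am assuming that newlines outside of comments are counted correctly by the
whitespace rule.

```java
final class MlCommentLines {

    // Returns the 1-based line number a lexer would report at offset `target`.
    // legacyRule == true mimics the rule with ~('*') after the opening
    // delimiter: that one character is consumed with no newline() call.
    // legacyRule == false mimics the rule with ~('*') removed.
    static int lineAt(String src, int target, boolean legacyRule) {
        int line = 1;
        int i = 0;
        boolean inComment = false;
        while (i < target) {
            char c = src.charAt(i);
            if (!inComment && src.startsWith("/*", i)) {
                inComment = true;
                i += 2;
                if (legacyRule && i < target) {
                    if (src.charAt(i) == '*') {
                        // ~('*') cannot match here; this sketch does not model it.
                        throw new IllegalStateException("legacy rule would not match");
                    }
                    i++; // swallowed, even if it is '\n' or '\r'
                }
            } else if (inComment && src.startsWith("*/", i)) {
                inComment = false;
                i += 2;
            } else if (c == '\r' || c == '\n') {
                // "\r\n" is one line break, like the '\r' '\n' alternative.
                if (c == '\r' && src.startsWith("\n", i + 1)) {
                    i++;
                }
                line++; // stands in for newline()
                i++;
            } else {
                i++;
            }
        }
        return line;
    }

    public static void main(String[] args) {
        String src = "/*\n*\n*/\nint i = 0;";
        int target = src.indexOf("int");
        System.out.println(lineAt(src, target, true));  // prints 3
        System.out.println(lineAt(src, target, false)); // prints 4
    }
}
```

With the extra space after the opening delimiter ("/* \n*\n*/\nint i = 0;")
the sketch gives 4 for both variants, which matches what I saw.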