[antlr-interest] Comment rule matches links
Jenny Balfer
ai06087 at Lehre.BA-Stuttgart.De
Tue Aug 26 01:24:10 PDT 2008
> >I am using the "standard" rule for single line comments:
> >
> >COMMENT : '//' (options {greedy=false;}: .)* ('\n'|'\r')
> > { skip(); }
> > ;
> >
> >This works pretty well until I have something like this in my code:
> >
> >aString = "http://someUrl.com";
> >
> >Because the URL contains two slashes, the lexer treats everything
> >from there on as a comment and skips the rest of the line; only
> >aString = "http: remains.
> >
> >I tried to work around this problem by adding a rule, placed before
> >the comment rule, that matches every string:
> >
> >STRING : '"' (options {greedy=false;}: .)* '"'
> > ;
> >
> >This temporarily solved the problem but brought up further issues,
> >so I would really prefer to get along without it. Does anyone have
> >a better solution to prevent my lexer from skipping URLs just
> >because they contain slashes?
>
> Using a STRING rule is probably the best way to do this. (And not
> just for this sort of problem -- generally you want strings to be
> recognised as single entities anyway, instead of random sequences
> of other tokens, and you need to preserve whitespace.)
>
> While it might be possible to ignore //s within quotes via other
> means (e.g. semantic predicates), it'd be quite painful and would
> still give you malformed strings in other cases.
>
> What kind of "further issues" is it causing?

My problem is regular expressions that contain quotes, like this one:
replace(/"/, "&quot;");
In this case, the STRING rule matches everything from the first quote to
the second, which is "/, ", and then matches everything from the last quote
to whatever quote comes next.

I already found the article about island grammars
(http://www.antlr.org/wiki/display/ANTLR3/Island+Grammars+Under+Parser+Control),
but I have no idea how to apply that solution to my problem, because the
workaround is for parser grammars and my STRING / COMMENT rules are part
of the lexer.
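
To make the two failure cases above easier to reproduce outside ANTLR, here is a rough Python sketch. It is only an approximation: the regular expressions stand in for the COMMENT and STRING rules, the names tokenize and matched are made up for this example, and the scanner simply tries the rules in order at each position, which is not how ANTLR actually predicts tokens. The second input assumes the replacement argument is written as "&quot;" and adds a hypothetical second line so that the stray quote has something to run into.

```python
import re

# Rough regex stand-ins for the two lexer rules (an approximation only).
# A non-greedy (.)* followed by '"' behaves like [^"]* followed by '"'.
STRING = ("STRING", re.compile(r'"[^"]*"'))
# The comment runs up to, but here does not consume, the line terminator.
COMMENT = ("COMMENT", re.compile(r'//[^\r\n]*'))


def tokenize(text, rules):
    """Scan left to right, trying each rule in order at every position."""
    tokens = []
    pos = 0
    while pos < len(text):
        for name, pattern in rules:
            match = pattern.match(text, pos)
            if match:
                tokens.append((name, match.group()))
                pos = match.end()
                break
        else:
            # No rule matched here: emit the single character and move on.
            tokens.append(("CHAR", text[pos]))
            pos += 1
    return tokens


def matched(text, rules, wanted):
    """Return the lexemes of all tokens of the given type."""
    return [lexeme for name, lexeme in tokenize(text, rules) if name == wanted]


url_line = 'aString = "http://someUrl.com";'
# Assumed input: a regex literal containing a quote, then a later string.
regex_lines = 'replace(/"/, "&quot;");\nx = "y";'

if __name__ == "__main__":
    print(matched(url_line, [COMMENT], "COMMENT"))
    print(matched(url_line, [STRING, COMMENT], "STRING"))
    print(matched(regex_lines, [STRING, COMMENT], "STRING"))
```

With only the COMMENT rule, the URL's slashes start a comment; with STRING tried first, the URL survives, but the quote inside the regular expression literal pairs with the opening quote of the next string, and the remaining quote pairs with one on a later line.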