[antlr-interest] Comment rule matches links

Gavin Lambert antlr at mirality.co.nz
Mon Aug 25 12:01:43 PDT 2008

At 04:15 26/08/2008, Jenny Balfer wrote:
 >I am using the "standard" rule for single line comments:
 >COMMENT : '//' (options {greedy=false;}: .)* ('\n'|'\r')
 >          { skip(); }
 >        ;
 >This works pretty well, until I have things like that in my 
 >aString = "http://someUrl.com";
 >Because the url contains two slashes, the lexer treats everything
 >from then on as a comment and skips the rest of the line; only
 >aString = "http: remains.
 >I tried to fight this problem by adding a rule that matches 
 >string before the comment rule:
 >STRING : '"' (options {greedy=false;}: .)* '"'
 >       ;
 >This temporarily solved the problem, but brought up further issues,
 >so I would really appreciate to get along without it. Does anyone
 >have a better solution to prevent my lexer from skipping urls
 >because they contain slashes?

Using a STRING rule is probably the best way to do this.  (And not 
just for this sort of problem -- generally you want strings to be 
recognised as single entities anyway, instead of random sequences 
of other tokens, and you need to preserve whitespace.)
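
For example, something along these lines (a rough sketch, not tested 
against your grammar; the ESC rule and the backslash escapes are my 
assumption about your string syntax, so adjust to suit):

    // A string is one token: anything between quotes except a bare
    // quote, backslash or line break.  (ESC is a hypothetical helper.)
    STRING  : '"' ( ESC | ~('\\' | '"' | '\r' | '\n') )* '"'
            ;

    fragment
    ESC     : '\\' .
            ;

    // A comment runs to the end of the line (or to end of input).
    COMMENT : '//' ~('\n' | '\r')* ('\r'? '\n')?
              { skip(); }
            ;

Once the lexer sees an opening quote, only STRING can match, and it 
consumes everything up to the closing quote as one token.  The // in 
a URL is then never at the start of a token, so COMMENT never gets a 
chance to fire on it.  Excluding the terminating characters from the 
loops also means neither rule has to rely on greedy=false.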

While it might be possible to ignore //s within quotes via other 
means (eg. semantic predicates), it'd be quite painful and would 
still give you malformed strings in other cases.

What kind of "further issues" is it causing?
