[antlr-interest] Comment rule matches links
Gavin Lambert
antlr at mirality.co.nz
Mon Aug 25 12:01:43 PDT 2008
At 04:15 26/08/2008, Jenny Balfer wrote:
>I am using the "standard" rule for single line comments:
>
>COMMENT : '//' (options {greedy=false;}: .)* ('\n'|'\r')
> { skip(); }
> ;
>
>This works pretty well, until I have things like that in my code:
>
>aString = "http://someUrl.com";
>
>Because the url contains two slashes, the lexer treats everything
>from then on as a comment and skips the rest of the line; only
>aString = "http: remains.
>
>I tried to fight this problem by adding a rule that matches every
>string before the comment rule:
>
>STRING : '"' (options {greedy=false;} : .)* '"'
> ;
>
>This temporarily solved the problem, but brought up further issues,
>so I would really appreciate to get along without it. Does anyone
>have a better solution to prevent my lexer from skipping urls just
>because they contain slashes?
Using a STRING rule is probably the best way to do this. (And not
just for this sort of problem -- generally you want strings to be
recognised as single entities anyway, instead of random sequences
of other tokens, and you need to preserve whitespace.)
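A minimal sketch of the kind of rules I mean, assuming ANTLR 3 with the Java target. The ESC fragment and the escape handling are illustrative assumptions, not something taken from your grammar. The lexer scans left to right, so once it reaches an opening quote the whole literal is consumed by STRING, and a // inside it is never a candidate for COMMENT:

```
// Sketch only: ANTLR 3 lexer rules, Java target assumed.

// A whole string literal becomes a single token, so a "//" inside
// the quotes is consumed here and never reaches COMMENT.
STRING
    : '"' ( ESC | ~('\\' | '"' | '\n' | '\r') )* '"'
    ;

// Hypothetical helper: a backslash followed by any one character,
// so that an escaped quote does not end the string early.
fragment
ESC : '\\' . ;

// Consume everything up to the end of the line, then the line
// ending itself, and discard the token.
COMMENT
    : '//' ~('\n' | '\r')* '\r'? '\n' { skip(); }
    ;
```

With rules along these lines, aString = "http://someUrl.com"; should lex with the URL inside one STRING token. Note that, as written, a comment on the last line of a file with no trailing newline would not match COMMENT.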
While it might be possible to ignore //s within quotes via other
means (e.g. semantic predicates), it'd be quite painful and would
still give you malformed strings in other cases.
What kind of "further issues" is it causing?