[antlr-interest] help for URL grammar

Kirby Bohling kirby.bohling at gmail.com
Tue Mar 22 01:52:18 PDT 2011


I'm pretty sure the tokens ANTLR generates for the '-' literal in toplabel
and domainlabel are conflicting.  Try defining an explicit '-' token (or
possibly a fragment).  If you use literal text, I'd only use it as part
of a lexer rule, never in a parser rule as you do in those two rules.
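
A rough sketch of what I mean, in ANTLR v3 syntax.  The grammar name and
the DASH, DOT, ALPHA and DIGIT token names are my own placeholders, not
taken from the linked grammar, and I have not run this through the
ANTLRWorks interpreter, so treat it as an illustration rather than a
confirmed fix:

```antlr
grammar UrlLabels;   // hypothetical grammar name, for illustration only

// Parser rules refer to named tokens only, with no quoted literals.
hostname    : domainlabel (DOT domainlabel)* ;
domainlabel : alphanum ((alphanum | DASH)* alphanum)? ;
alphanum    : ALPHA | DIGIT ;

// Each character class is declared exactly once, in the lexer.
DASH  : '-' ;
DOT   : '.' ;
ALPHA : 'a'..'z' | 'A'..'Z' ;
DIGIT : '0'..'9' ;
```

With every literal declared once as a named lexer rule, the parser rules
all share the same DASH token instead of relying on whatever implicit
tokens ANTLR creates for a quoted '-' in each rule.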

By the by, I don't think that the first portion of a domain name has
to start with a letter.  See 23andme.com as an example (it redirects
you to www.23andme.com).  The RFCs for URLs don't require starting with
a letter: from a quick read of Wikipedia, the only restriction I saw
was no dashes at the beginning, with no limit on digits being first.

(This ignores the question of whether creating an ANTLR grammar for URLs
is redundant, or a good use of the tool.)

Kirby


On Tue, Mar 22, 2011 at 2:49 AM, Christian CORMIER
<cormier at u-picardie.fr> wrote:
> Hi,
> I tested https://github.com/yaojingguo/antlr-url-grammar.
> The rule
> domainlabel: alphanum ((alphanum | '-')* alphanum)?;
> doesn't recognize something like www.u-picardie.fr in the interpreter of
> the latest version of ANTLRWorks (1.4.2),
> but www.uu-picardie.fr is OK.
> Regards
> Christian CORMIER

