[antlr-interest] How to ignore tokens during lexing

Laurent Caillette laurent.caillette at gmail.com
Tue Apr 28 01:20:40 PDT 2009


With your grammar, the lexer will eat everything that looks like a token.

When parsing URLs, I ran into the same problem. It's a big
grammar with many other concerns, but it finally worked by defining
every single letter as a token and defining words as parser rules (your
"metadata" would become a parser rule).
I couldn't make character ranges work, but maybe I did something wrong.
http://github.com/caillette/novelang/blob/67c453c11a8ab45263d004c22c8b2ab2713768a7/src/antlr/Novelang.g
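The letter-token approach could be sketched roughly like this (ANTLR 3 syntax; rule and token names are illustrative, not copied from the Novelang grammar):

```antlr
grammar LetterSketch;

// "metadata" is a parser rule spelling the word out letter by letter,
// so the lexer never reserves it and never commits to a partial match:
metadata : M E T A D A T A ;

// Any run of letters is a word; keywords are just ordinary words
// recognised by their own parser rules:
word : ( A | D | E | M | T /* ... one alternative per letter token */ )+ ;

// One lexer token per letter:
A : 'a' ;  D : 'd' ;  E : 'e' ;  M : 'm' ;  T : 't' ;
// ... and so on for the remaining letters.
```

The cost is a verbose lexer, but the benefit is that no word is ever a reserved token, so "myname" can never collide with "metadata".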


On Tue, Apr 28, 2009 at 7:35 AM, James Robson <james.robson at ymail.com> wrote:
> Hi,
>
> I've put together a grammar to parse a URN.
>
> In the grammar I have some tokens defined. However, while processing one of
> the parser rules for an arbitrary string, it exits the parse and attempts to
> match one of the tokens based on the first two letters. This happens regardless
> of the lookahead specified, or whether I split out the grammar and turn on
> filters.
>
> as an example one of the tokens defined is "metadata" - this is defined in
> the tokens section.
>
> While parsing the URN and parsing the parameters section, it runs into
> "myname=james", which should match name EQUALS value. Instead it attempts to
> match 'me=' against "metadata", fails, and gives me "mynajames" as the
> result of the name EQUALS value rule.
>
> Having the tokens defined as tokens or lexer rules makes no difference, not
> that I really expected it to.
>
> Maybe I'm taking a totally wrong approach here. Can anyone provide some
> advice?
>
> Thanks in advance
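For what it's worth, the behaviour described above comes from the lexer tokenising independently of the parser: once "me" matches the start of the "metadata" token, the ANTLR 3 lexer commits to it and cannot fall back to a shorter match. Besides the letter-token approach, another common workaround is to match a generic NAME token in the lexer and recognise the keyword in the parser with a gated semantic predicate. A minimal sketch, assuming ANTLR 3 with a Java target (rule names are illustrative):

```antlr
grammar UrnSketch;

param : name EQUALS value ;
name  : NAME ;
value : NAME ;

// Recognise the keyword in the parser by checking the token text,
// instead of reserving it as its own lexer token:
metadataKey : { input.LT(1).getText().equals("metadata") }? NAME ;

NAME   : ('a'..'z' | 'A'..'Z' | '0'..'9')+ ;
EQUALS : '=' ;
```

Because "metadata" is never a distinct token, "myname=james" tokenises cleanly as NAME EQUALS NAME.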
