[antlr-interest] antlr3 lexer: recognize single character token

Bart Kiers bkiers at gmail.com
Wed Jun 6 07:26:27 PDT 2012


Hi Verny,

Why wouldn't the "A" from "Azzz" be tokenized as an A-token? It matches it,
after all.
If "Azzz" should be matched differently, you must define a lexer rule that
matches it, like this:

    ID : ('a'..'z' | 'A'..'Z')+;

With that rule added, "Azzz" would be tokenized as an ID-token, regardless
of whether the A-rule is defined before the ID-rule. ANTLR's lexer gives
precedence to the rule that matches the most characters; only when two (or
more) rules match the same number of characters does the rule defined first
"win".
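
To make that concrete, here is a minimal sketch of a complete ANTLR 3
grammar. The grammar name T, the parse rule and the WS rule are assumptions
added only to make the example self-contained; they are not part of the
original question, and the action in WS assumes the Java target:

    grammar T;

    // Hypothetical parser rule, present only so the tokens are consumed.
    parse : (A | ID)* EOF;

    // Matches exactly one 'A'. Listed first, so it wins a tie with ID.
    A  : 'A';

    // Matches one or more letters, so it can match more than A does.
    ID : ('a'..'z' | 'A'..'Z')+;

    // Assumed whitespace rule: keeps spaces out of the parser's way.
    WS : (' ' | '\t' | '\r' | '\n')+ {$channel=HIDDEN;};

For the input "A Azzz AK", this should produce an A-token followed by two
ID-tokens ("Azzz" and "AK"): the lone "A" is a tie that the A-rule wins
because it is defined first, while ID matches more characters in the other
two cases.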

Regards,

Bart.


On Wed, Jun 6, 2012 at 4:14 PM, Verny Quartara <webny23 at gmail.com> wrote:

> Hi everybody,
> can someone explain why if I define a lexer rule like this:
>
> A : 'A';
>
> The lexer recognizes "A" as valid, but "A" followed by any character,
> for example "Azzz" or "AK", is also considered a valid token?
>
> This is driving me crazy. I read the documentation, but at the moment I just
> can't understand.
> Thanks
>
>
>
> --
> Verny Quartara
>

