[antlr-interest] ambiguous lexer tokens

Thu Jun 28 08:10:34 PDT 2007

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> 
> > I would like to write a grammar for the following output:
> >
> >  drwxr-xr-x   23 tcurdt  tcurdt    782 Jun 24 22:54 ..
> >  -rw-r--r--    1 tcurdt  tcurdt  18545 Nov  1  2006 ASMContentHandler.Rule.html

> I think you have a number of options:
>

While I agree with this:

> 1. Given that many of the tokens look the same, don't try to
> differentiate between them in the lexer. Instead handle everything in
> the parser.
>

And this (unless it is just a learning exercise):

> 3. Don't use ANTLR for this task. The input is so limited and regular
> that it may be quicker to just write something by hand.
>
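
As a rough illustration of option 3, here is a minimal hand-written
sketch in Python for the two sample lines quoted above. It assumes every
line has exactly nine whitespace-separated fields (mode, link count,
owner, group, size, month, day, time or year, name), which holds for the
samples but not for every kind of `ls -l` output (device files, for
instance). The names Entry and parse_ls_line are hypothetical and do not
come from the original thread.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One line of `ls -l` output (hypothetical record type)."""
    mode: str
    links: int
    owner: str
    group: str
    size: int
    month: str
    day: int
    time_or_year: str  # "22:54" for recent files, "2006" for older ones
    name: str


def parse_ls_line(line: str) -> Entry:
    """Split one `ls -l` line into fields purely by position."""
    # Split on runs of whitespace at most 8 times, so that a file name
    # containing spaces stays intact in the ninth field.
    fields = line.split(None, 8)
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}: {line!r}")
    mode, links, owner, group, size, month, day, time_or_year, name = fields
    return Entry(mode, int(links), owner, group, int(size),
                 month, int(day), time_or_year, name.rstrip("\r\n"))


if __name__ == "__main__":
    sample = " drwxr-xr-x   23 tcurdt  tcurdt    782 Jun 24 22:54 .."
    print(parse_ls_line(sample))
```

Nothing here tries to tell a year from a size or a link count by how it
looks. Each field gets its meaning only from its position, which is the
same idea as option 1, just without a generated parser.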

I cannot agree with this:

> ANTLR is a very complex tool and it can deviate from your expectations
> in incredibly subtle and hard-to-understand ways.

But you cannot just go barging in on it - first you have to learn what
the expectations are. The tool is a complicated thing to produce, but
once you have your head around a few concepts it is remarkably easy to
use.

Jim