[antlr-interest] frustrated with lexer

Sat Sep 6 15:21:53 PDT 2003

Hi Ter,
I think Antlr is great as is (although I would not object to the AST tree
parser syntax being cleaned up as per my suggestions).

If netminka finds the Antlr lexer frustrating, he should try writing one by
hand using a state machine or recursive descent.

Writing one by hand puts things in perspective and makes the mountains in
Antlr look like the molehills they really are.
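
To give a feel for the hand-written route, here is a minimal sketch in Python
of the "End" case from the quoted message below. It is only an illustration
under stated assumptions: the function name scan_token, the IDENT and EOF
token names, and the choice to leave the line terminator unconsumed are all
hypothetical, and this is not what Antlr generates.

```python
def scan_token(text, pos):
    """Return (token_type, lexeme, next_pos) for the token starting at pos.

    Hypothetical hand-written scanner; ENDEXIT and ENDCHECK are borrowed
    from the quoted grammar, everything else is assumed for illustration.
    """
    if pos >= len(text):
        return ("EOF", "", pos)

    if text.startswith("End", pos):
        after = pos + 3
        # Peek past "End" without committing to anything yet.
        if after < len(text) and text[after] in "\r\n":
            # Assumption: the line terminator is left for the next token.
            return ("ENDEXIT", "End", after)
        if after < len(text) and text[after] in " \t":
            # Swallow the whole run of blanks and tabs after "End".
            while after < len(text) and text[after] in " \t":
                after += 1
            return ("ENDCHECK", text[pos:after], after)
        # Neither context matched: fall through with pos unchanged, so
        # nothing (not even the 'E') has been consumed.

    # Plain identifier: letters, digits and underscores.
    end = pos
    while end < len(text) and (text[end].isalnum() or text[end] == "_"):
        end += 1
    if end == pos:
        raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
    return ("IDENT", text[pos:end], end)
```

Because the lookahead is done on an index that is only advanced once a branch
is chosen, an input such as 'EndTest' comes back as one identifier instead of
leaving 'ndTest' behind. Every such case has to be spelled out by hand,
though, which is the work Antlr saves.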

I really like Antlr because it gives me fine control over the error messages
generated and how they are handled.  I like the ease with which I can add
dead ends just to give a more precise error message.

matthew

----- Original Message ----- 
From: "Terence Parr" <parrt at cs.usfca.edu>
To: <antlr-interest at yahoogroups.com>
Cc: <netminka at netscape.net>
Sent: Sunday, September 07, 2003 5:09 AM
Subject: Re: [antlr-interest] frustrated with lexer

> On Tuesday, September 2, 2003, at 10:59 AM, netminka at netscape.net wrote:
> > The latest example:
> > I sometimes need to scan ahead through the input and once I've
> > determined the context or whatever, push back what I've scanned onto
> > the input stream. I DON'T NEED to push back everything but the first
> > character!
> > Which seems to be the consume() default.
> >
> > How is this consume default changed? Example please!
>
> override consume() ?
>
> > Here is the specific situation:
> > END
> >     : ("End" LINE_TERMINATOR) => ENDEXIT
> >     | ("End" (' ' | '\t')+) => ENDCHECK
> >     ;
> >
> > In the case that "End" followed by the above stuff is not recognized
> > (e.g. the string 'EndTest') the lexer consumes the 'E' and I'm left
> > with
> > 'ndTest'. Note my ENDCHECK and ENDEXIT are protected.
>
> I am having trouble parsing your English sentences, but I'll take a
> stab at this.  Please try without the second syn pred; it is redundant.
>   If the first fails, it will go to the second.
>
> You can also try good old left-factoring:
>
> END : "End" ( ENDEXIT | ENDCHECK ) ;
>
> no fuss no muss.
>
> > I also don't like
> > the hoisting of rules in nextToken based on left hand side semantic
> > predicates; the effects are unpredictable and overly complex.
>
> Really?  The rule is: "if there is a predicate on the left edge of a
> rule w/o an alternative, it uses that boolean test to turn the rule
> on/off."  You'll have to tell me what you don't understand so I can
> explain it better.
>
> > The ordering
> > of matching rules is overly complex and thus unpredictable as well.
>
> Well, technically there is no ordering that matters if I can remember
> correctly.  I do any sorting by lookahead depth that is required.
>
> Anyway, all that said, I agree that ANTLR's lexers are wacky.  I've got
> the solution (or the engine to the solution) built for ANTLR 3.  :)
>
> Ter
> --
> Professor Comp. Sci., University of San Francisco
> Creator, ANTLR Parser Generator, http://www.antlr.org
> Co-founder, http://www.jguru.com
> Co-founder, http://www.knowspam.net enjoy email again!
> Co-founder, http://www.peerscope.com pure link sharing
>
> Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/