[antlr-interest] frustrated with lexer
parrt at cs.usfca.edu
Sat Sep 6 15:28:28 PDT 2003
Thanks Matt. Power to the people! ;)
PS Loring/Monty/I have been discussing AST stuff quite a bit and
Loring's great ideas/implementation should appear in a 2.8 release so
we can experiment with it before ANTLR 3 design locks in.
On Saturday, September 6, 2003, at 03:21 PM, Matthew Ford wrote:
> Hi Ter,
> I think Antlr is great as is (although I would not object to the AST
> parser syntax being cleaned up as per my suggestions).
> If netminka finds the Antlr lexer frustrating he should try writing
> one by hand using a state machine or recursive descent.
> Writing one by hand puts things in perspective and makes the mountains
> in Antlr look like the mole hills they really are.
> I really like Antlr because it gives me fine control over the errors
> generated and how they are handled. I like the ease with which I can
> dead ends just to give a more precise error message.
> ----- Original Message -----
> From: "Terence Parr" <parrt at cs.usfca.edu>
> To: <antlr-interest at yahoogroups.com>
> Cc: <netminka at netscape.net>
> Sent: Sunday, September 07, 2003 5:09 AM
> Subject: Re: [antlr-interest] frustrated with lexer
>> On Tuesday, September 2, 2003, at 10:59 AM, netminka at netscape.net wrote:
>>> The latest example:
>>> I sometimes need to scan ahead through the input and once I've
>>> determined the context or whatever, push back what I've scanned onto
>>> the input stream. I DON'T NEED to push back everything but the first
>>> Which seems to be the consume() default.
>>> How is this consume default changed? Example please!
>> override consume() ?
>>> Here is the specific situation:
>>> : ("End" LINE_TERMINATOR) => ENDEXIT
>>> | ("End" (' ' | '\t')+) => ENDCHECK
>>> In the case that "End" followed by the above stuff is not recognized
>>> (e.g. the string 'EndTest') the lexer consumes the 'E' and I'm left
>>> with 'ndTest'. Note my ENDCHECK and ENDEXIT are protected.
>> I am having trouble parsing your English sentences, but I'll take a
>> stab at this. Please try without the second syn pred; it is
>> unnecessary. If the first fails, it will go to the second.
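The "if the first fails, it will go to the second" behaviour can be pictured as speculate-then-rewind. The Java below is only a minimal hand-written sketch of that idea, not code generated by ANTLR: the class Speculation and its methods mark(), rewind(), match(), matchBlanks(), lookahead() and chooseAlternative() are all hypothetical names, and LINE_TERMINATOR is simplified to a single "\n". The point it tries to show is that a failed guard gives back everything it scanned, so 'EndTest' is left untouched rather than losing its 'E'.

```java
// Hypothetical sketch of syntactic-predicate style speculation.
// This is NOT ANTLR's generated code; all names here are illustrative.
final class Speculation {
    private final String input;
    private int pos = 0;

    Speculation(String input) { this.input = input; }

    int mark() { return pos; }                 // remember the current position
    void rewind(int marker) { pos = marker; }  // give back everything scanned since mark()
    int position() { return pos; }

    /** Consumes the literal if it is next in the input; otherwise consumes nothing. */
    boolean match(String literal) {
        if (input.startsWith(literal, pos)) {
            pos += literal.length();
            return true;
        }
        return false;
    }

    /** Consumes one or more spaces or tabs; returns false if there were none. */
    boolean matchBlanks() {
        int start = pos;
        while (pos < input.length()
                && (input.charAt(pos) == ' ' || input.charAt(pos) == '\t')) {
            pos++;
        }
        return pos > start;
    }

    /** Runs the guard speculatively and always rewinds, in the spirit of "(guard) =>". */
    boolean lookahead(java.util.function.BooleanSupplier guard) {
        int marker = mark();
        boolean ok = guard.getAsBoolean();
        rewind(marker);
        return ok;
    }

    /** Picks an alternative by trying each guard in turn; consumes no input itself. */
    String chooseAlternative() {
        // Assumption: LINE_TERMINATOR is reduced to "\n" for brevity.
        if (lookahead(() -> match("End") && match("\n"))) return "ENDEXIT";
        if (lookahead(() -> match("End") && matchBlanks())) return "ENDCHECK";
        return "OTHER";
    }
}
```

Calling new Speculation("EndTest").chooseAlternative() falls through both guards, and position() is still 0 afterwards, which is the push-back behaviour the original question was asking for.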
>> You can also try good old left-factoring:
>> END : "End" ( ENDEXIT | ENDCHECK ) ;
>> no fuss no muss.
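For readers unfamiliar with left-factoring, here is a rough Java sketch of the same idea written by hand: the shared prefix "End" is matched exactly once, and only the next character decides between the two continuations. The class EndScanner, the Kind enum, the Result type and scanEnd() are hypothetical names invented for illustration. What ENDEXIT and ENDCHECK match after the prefix is an assumption taken from the two predicates quoted above (a line terminator, or one or more blanks). Returning NO_MATCH without consuming anything is also an assumption of this sketch; what a real ANTLR lexer does with 'EndTest' depends on the rest of the grammar.

```java
// Hypothetical hand-written sketch of left-factoring; not ANTLR output.
final class EndScanner {
    enum Kind { ENDEXIT, ENDCHECK, NO_MATCH }

    /** Outcome of one scan attempt: the token kind and how many chars it covers. */
    static final class Result {
        final Kind kind;
        final int consumed;
        Result(Kind kind, int consumed) {
            this.kind = kind;
            this.consumed = consumed;
        }
    }

    static Result scanEnd(String input, int start) {
        // The common prefix is tested once, not once per alternative.
        if (!input.startsWith("End", start)) {
            return new Result(Kind.NO_MATCH, 0);
        }
        int pos = start + 3;
        if (pos < input.length()) {
            char c = input.charAt(pos);
            if (c == '\r' || c == '\n') {
                // Assumed line terminators: "\r\n", '\r' or '\n'.
                boolean crlf = c == '\r'
                        && pos + 1 < input.length()
                        && input.charAt(pos + 1) == '\n';
                return new Result(Kind.ENDEXIT, pos + (crlf ? 2 : 1) - start);
            }
            if (c == ' ' || c == '\t') {
                int end = pos;
                while (end < input.length()
                        && (input.charAt(end) == ' ' || input.charAt(end) == '\t')) {
                    end++;
                }
                return new Result(Kind.ENDCHECK, end - start);
            }
        }
        // Neither continuation follows "End": report no match, consume nothing.
        return new Result(Kind.NO_MATCH, 0);
    }
}
```

With this shape no backtracking is needed at all: one character after the prefix is enough to pick the branch.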
>>> I also don't like
>>> the hoisting of rules in nextToken based on left hand side semantic
>>> predicates; the effects are unpredictable and overly complex.
>> Really? The rule is: "if there is a predicate on the left edge of a
>> rule w/o an alternative, it uses that boolean test to turn the rule
>> on/off." You'll have to tell me what you don't understand so I can
>> explain it better.
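The on/off rule described above can be sketched in a few lines of Java. This is an illustration of the idea only, not how ANTLR's generated nextToken is actually written: the class GatedRules, the Rule type, the inHeader flag and the rule names HEADER_WORD and NUMBER are all hypothetical. Each rule carries a boolean gate that is checked before the rule is even considered; the two rules have distinct prefixes so the loop order plays no role in the result.

```java
// Hypothetical sketch: a boolean test on a rule's left edge turns the rule on or off.
// This is NOT ANTLR's generated nextToken; all names are illustrative.
final class GatedRules {
    /** A lexer rule reduced to a name, a gate, and the literal prefix it matches. */
    static final class Rule {
        final String name;
        final java.util.function.BooleanSupplier gate;
        final String prefix;
        Rule(String name, java.util.function.BooleanSupplier gate, String prefix) {
            this.name = name;
            this.gate = gate;
            this.prefix = prefix;
        }
    }

    // Hypothetical lexer state that a predicate such as {inHeader}? would test.
    boolean inHeader = false;

    private final Rule[] rules = {
        new Rule("HEADER_WORD", () -> inHeader, "End"), // gated by the predicate
        new Rule("NUMBER", () -> true, "42"),           // always enabled
    };

    /** Returns the name of the enabled rule whose prefix matches, or NO_MATCH. */
    String nextTokenName(String input) {
        for (Rule r : rules) {
            if (r.gate.getAsBoolean() && input.startsWith(r.prefix)) {
                return r.name;
            }
        }
        return "NO_MATCH";
    }
}
```

While inHeader is false the HEADER_WORD rule is invisible to nextTokenName; setting it to true switches the rule back on without touching any other rule.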
>>> The ordering
>>> of matching rules is overly complex and thus unpredictable as well.
>> Well, technically there is no ordering that matters if I can remember
>> correctly. I do any sorting by lookahead depth that is required.
>> Anyway, all that said, I agree that ANTLR's lexers are wacky. I've got
>> the solution (or the engine to the solution) built for ANTLR 3. :)
>> Professor Comp. Sci., University of San Francisco
>> Creator, ANTLR Parser Generator, http://www.antlr.org
>> Co-founder, http://www.jguru.com
>> Co-founder, http://www.knowspam.net enjoy email again!
>> Co-founder, http://www.peerscope.com pure link sharing