[antlr-interest] frustrated with lexer

Sat Sep 6 15:28:28 PDT 2003

Thanks Matt.  Power to the people! ;)

Terence
PS  Loring/Monty/I have been discussing AST stuff quite a bit and 
Loring's great ideas/implementation should appear in a 2.8 release so 
we can experiment with it before ANTLR 3 design locks in.

On Saturday, September 6, 2003, at 03:21 PM, Matthew Ford wrote:

> Hi Ter,
> I think Antlr is great as is (although I would not object to the AST tree
> parser syntax being cleaned up as per my suggestions).
>
> If netminka finds the Antlr lexer frustrating, he should try writing one by
> hand using a state machine or recursive descent.
>
> Writing one by hand puts things in perspective and makes the mountains in
> Antlr look like the molehills they really are.
>
> I really like Antlr because it gives me fine control over the error messages
> generated and how they are handled.  I like the ease with which I can add
> dead ends just to give a more precise error message.
>
> matthew
>
> ----- Original Message -----
> From: "Terence Parr" <parrt at cs.usfca.edu>
> To: <antlr-interest at yahoogroups.com>
> Cc: <netminka at netscape.net>
> Sent: Sunday, September 07, 2003 5:09 AM
> Subject: Re: [antlr-interest] frustrated with lexer
>
>
>> On Tuesday, September 2, 2003, at 10:59 AM, netminka at netscape.net wrote:
>>> The latest example:
>>> I sometimes need to scan ahead through the input and once I've
>>> determined the context or whatever, push back what I've scanned onto
>>> the input stream. I DON'T NEED to push back everything but the first
>>> character!
>>> Which seems to be the consume() default.
>>>
>>> How is this consume default changed? Example please!
>>
>> override consume() ?
>>
>>> Here is the specific situation:
>>> END
>>>     : ("End" LINE_TERMINATOR) => ENDEXIT
>>>     | ("End" (' ' | '\t')+) => ENDCHECK
>>>     ;
>>>
>>> In the case that "End" followed by the above stuff is not recognized
>>> (e.g. the string 'EndTest') the lexer consumes the 'E' and I'm left with
>>> 'ndTest'. Note my ENDCHECK and ENDEXIT are protected.
>>
>> I am having trouble parsing your English sentences, but I'll take a
>> stab at this.  Please try without the second syn pred; it is redundant.
>> If the first fails, it will go to the second.
>>
>> You can also try good old left-factoring:
>>
>> END : "End" ( ENDEXIT | ENDCHECK ) ;
>>
>> no fuss no muss.
>>
>>> I also don't like
>>> the hoisting of rules in nextToken based on left hand side semantic
>>> predicates; the effects are unpredictable and overly complex.
>>
>> Really?  The rule is: "if there is a predicate on the left edge of a
>> rule w/o an alternative, it uses that boolean test to turn the rule
>> on/off."  You'll have to tell me what you don't understand so I can
>> explain it better.
>>
>>> The ordering
>>> of matching rules is overly complex and thus unpredictable as well.
>>
>> Well, technically there is no ordering that matters if I can remember
>> correctly.  I do any sorting by lookahead depth that is required.
>>
>> Anyway, all that said, I agree that ANTLR's lexers are wacky.  I've got
>> the solution (or the engine to the solution) built for ANTLR 3.  :)
>>
>> Ter
>> --
>> Professor Comp. Sci., University of San Francisco
>> Creator, ANTLR Parser Generator, http://www.antlr.org
>> Co-founder, http://www.jguru.com
>> Co-founder, http://www.knowspam.net enjoy email again!
>> Co-founder, http://www.peerscope.com pure link sharing
>>
>> Your use of Yahoo! Groups is subject to 
>> http://docs.yahoo.com/info/terms/
--
Professor Comp. Sci., University of San Francisco
Creator, ANTLR Parser Generator, http://www.antlr.org
Co-founder, http://www.jguru.com
Co-founder, http://www.knowspam.net enjoy email again!
Co-founder, http://www.peerscope.com pure link sharing

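As a rough illustration of the left-factoring suggestion quoted above (END : "End" ( ENDEXIT | ENDCHECK ) ;), here is a small hand-written Python sketch. It is not ANTLR output and not anyone's actual grammar. It matches the shared prefix "End" once, then picks a branch from the next character, and it leaves the input position unchanged when neither branch applies. The function name match_end and the returned tags are hypothetical. It also assumes that, once the prefix is factored out, ENDEXIT starts with a line terminator and ENDCHECK with spaces or tabs.

```python
# Hypothetical hand-written sketch of left-factoring; not ANTLR-generated code.
PREFIX = "End"


def match_end(text, pos=0):
    """Try to match "End" plus a line terminator or blanks starting at pos.

    Returns (tag, new_pos) on success, or (None, pos) with the position
    unchanged when the input does not match.
    """
    if not text.startswith(PREFIX, pos):
        return None, pos
    i = pos + len(PREFIX)            # shared prefix is matched exactly once

    # Assumed ENDEXIT branch: "End" followed by a line terminator.
    if text.startswith("\r\n", i):
        return "ENDEXIT", i + 2
    if i < len(text) and text[i] == "\n":
        return "ENDEXIT", i + 1

    # Assumed ENDCHECK branch: "End" followed by one or more spaces or tabs.
    j = i
    while j < len(text) and text[j] in " \t":
        j += 1
    if j > i:
        return "ENDCHECK", j

    # Neither branch applies (e.g. "EndTest"): report failure, consume nothing.
    return None, pos
```

With this shape, match_end("EndTest") reports no match and leaves the position at 0, so a different rule (an identifier rule, say) can try the same characters from the start.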