[antlr-interest] frustrated with lexer
Terence Parr
parrt at cs.usfca.edu
Sat Sep 6 15:28:28 PDT 2003
Thanks Matt. Power to the people! ;)
Terence
PS: Loring, Monty, and I have been discussing AST stuff quite a bit, and
Loring's great ideas/implementation should appear in a 2.8 release so
we can experiment with it before the ANTLR 3 design locks in.
On Saturday, September 6, 2003, at 03:21 PM, Matthew Ford wrote:
> Hi Ter,
> I think Antlr is great as is (although I would not object to the AST
> tree parser syntax being cleaned up as per my suggestions).
>
> If netminka finds the Antlr lexer frustrating he should try writing one
> by hand using a state machine or recursive descent.
>
> Writing one by hand puts things in perspective and makes the mountains
> in Antlr look like the molehills they really are.
>
> I really like Antlr because it gives me fine control over the error
> messages generated and how they are handled. I like the ease with which
> I can add dead ends just to give a more precise error message.
>
> matthew
>
> ----- Original Message -----
> From: "Terence Parr" <parrt at cs.usfca.edu>
> To: <antlr-interest at yahoogroups.com>
> Cc: <netminka at netscape.net>
> Sent: Sunday, September 07, 2003 5:09 AM
> Subject: Re: [antlr-interest] frustrated with lexer
>
>
>> On Tuesday, September 2, 2003, at 10:59 AM, netminka at netscape.net wrote:
>>> The latest example:
>>> I sometimes need to scan ahead through the input and once I've
>>> determined the context or whatever, push back what I've scanned onto
>>> the input stream. I DON'T NEED to push back everything but the first
>>> character!
>>> Which seems to be the consume() default.
>>>
>>> How is this consume default changed? Example please!
>>
>> override consume() ?
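
The "scan ahead, then push everything back" behaviour asked about above can be illustrated outside ANTLR with a small mark/rewind buffer. This is only a sketch in Python: the `CharBuffer` class and the `try_match` helper are hypothetical names invented for illustration, not ANTLR's generated code or its runtime API.

```python
class CharBuffer:
    """Hypothetical input buffer with mark/rewind (not ANTLR's actual API)."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def la(self, i=1):
        """Peek at the i-th character ahead without consuming; '' at end of input."""
        idx = self.pos + i - 1
        return self.text[idx] if idx < len(self.text) else ""

    def consume(self):
        """Advance past one character."""
        self.pos += 1

    def mark(self):
        """Remember the current position so it can be restored later."""
        return self.pos

    def rewind(self, marker):
        """Restore a position previously returned by mark()."""
        self.pos = marker


def try_match(buf, literal):
    """Speculatively match `literal`; on failure restore the buffer completely."""
    start = buf.mark()
    for ch in literal:
        if buf.la() != ch:
            buf.rewind(start)  # push back everything, including the first char
            return False
        buf.consume()
    return True


buf = CharBuffer("EndTest")
matched = try_match(buf, "End ")
remaining = buf.text[buf.pos:]
```

With the input 'EndTest', the speculative match fails and the whole string, including the leading 'E', is still available to whichever rule is tried next.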
>>
>>> Here is the specific situation:
>>> END
>>> : ("End" LINE_TERMINATOR) => ENDEXIT
>>> | ("End" (' ' | '\t')+) => ENDCHECK
>>> ;
>>>
>>> In the case that "End" followed by the above stuff is not recognized
>>> (e.g. the string 'EndTest') the lexer consumes the 'E' and I'm left
>>> with 'ndTest'. Note my ENDCHECK and ENDEXIT are protected.
>>
>> I am having trouble parsing your English sentences, but I'll take a
>> stab at this. Please try without the second syn pred; it is redundant.
>> If the first fails, it will go to the second.
>>
>> You can also try good old left-factoring:
>>
>> END : "End" ( ENDEXIT | ENDCHECK ) ;
>>
>> no fuss no muss.
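
As a rough illustration of the left-factoring idea, here is a hand-written Python analogue: match the common prefix "End" once, then choose between the two tails. It assumes ENDEXIT and ENDCHECK match only what follows "End" (a line terminator, or a run of spaces/tabs), which the thread does not spell out. The function `match_end` is a hypothetical name, and returning "no match" instead of raising a recognition error is a choice made for this sketch, not a description of what ANTLR generates.

```python
def match_end(text, pos=0):
    """Return (token_type, new_pos), or (None, pos) if no END token starts at pos."""
    # Common prefix, matched exactly once (the left-factored part).
    if not text.startswith("End", pos):
        return None, pos
    i = pos + 3

    # ENDEXIT tail (assumed): a line terminator follows the prefix.
    if text.startswith("\r\n", i):
        return "ENDEXIT", i + 2
    if i < len(text) and text[i] in "\r\n":
        return "ENDEXIT", i + 1

    # ENDCHECK tail (assumed): one or more spaces or tabs follow the prefix.
    j = i
    while j < len(text) and text[j] in " \t":
        j += 1
    if j > i:
        return "ENDCHECK", j

    # Neither tail applies: report no match and leave the position untouched,
    # so a caller could try another rule (for example an identifier rule).
    return None, pos
```

For 'EndTest' this returns `(None, 0)`, so nothing has been consumed and the 'E' is not lost.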
>>
>>> I also don't like
>>> the hoisting of rules in nextToken based on left hand side semantic
>>> predicates; the effects are unpredictable and overly complex.
>>
>> Really? The rule is: "if there is a predicate on the left edge of a
>> rule w/o an alternative, it uses that boolean test to turn the rule
>> on/off." You'll have to tell me what you don't understand so I can
>> explain it better.
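
The on/off rule described above can be pictured with a toy lexer in which each rule carries an optional guard that is checked before the rule is even attempted. This Python sketch is an assumption-laden analogue: `GatedLexer`, the `in_block` flag, and the three rules are all invented for illustration, and trying rules in list order is a simplification that does not reflect how ANTLR builds nextToken.

```python
import re


class GatedLexer:
    """Toy lexer whose rules can be switched on/off by a boolean predicate."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.in_block = False  # hypothetical state consulted by a predicate
        # (name, guard, pattern): a guard of None means the rule is always on.
        self.rules = [
            ("END", lambda: self.in_block, re.compile(r"End\b")),
            ("ID", None, re.compile(r"[A-Za-z]+")),
            ("WS", None, re.compile(r"\s+")),
        ]

    def next_token(self):
        """Return the next (type, text) pair, skipping rules whose guard is false."""
        if self.pos >= len(self.text):
            return ("EOF", "")
        for name, guard, pattern in self.rules:
            if guard is not None and not guard():
                continue  # predicate is false, so the rule is switched off
            m = pattern.match(self.text, self.pos)
            if m:
                self.pos = m.end()
                return (name, m.group())
        raise ValueError(f"no rule matches at position {self.pos}")
```

With `in_block` left at False, the input 'End' comes back as an ID token; setting `in_block = True` before calling `next_token()` turns the END rule on and the same input is reported as END.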
>>
>>> The ordering
>>> of matching rules is overly complex and thus unpredictable as well.
>>
>> Well, technically there is no ordering that matters, if I remember
>> correctly. I do any sorting by lookahead depth that is required.
>>
>> Anyway, all that said, I agree that ANTLR's lexers are wacky. I've got
>> the solution (or the engine to the solution) built for ANTLR 3. :)
>>
>> Ter
>> --
>> Professor Comp. Sci., University of San Francisco
>> Creator, ANTLR Parser Generator, http://www.antlr.org
>> Co-founder, http://www.jguru.com
>> Co-founder, http://www.knowspam.net enjoy email again!
>> Co-founder, http://www.peerscope.com pure link sharing
>>
>> Your use of Yahoo! Groups is subject to
>> http://docs.yahoo.com/info/terms/