[antlr-interest] Rob Pike on writing a lexer in Go for a template language

Wed Aug 31 21:33:59 PDT 2011

Terence,

Can you point me to some more details about the "traditional lex-like
system including lexical modes in v4"?

I don't know enough about scannerless parsing. Are there details
on how it supports island grammars?

Thanks for the response.
Regards
Gary
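
For readers following the thread, here is a minimal sketch of why
user-specified delimiters are easy in a hand-written lexer: the delimiters
are ordinary runtime values rather than patterns compiled into an automaton.
It is written in Go, loosely in the state-function style, and is not the
StringTemplate lexer or the code from the talk. All names (Lex, Token,
lexText, lexAction) are hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// TokenKind distinguishes raw text from the contents of a delimited action.
type TokenKind int

const (
	Text TokenKind = iota
	Action
	Error
)

// Token is one lexed unit. All names here are illustrative assumptions.
type Token struct {
	Kind  TokenKind
	Value string
}

// lexer holds the input and the delimiters, which are plain runtime strings.
type lexer struct {
	input       string
	left, right string
	pos         int
	tokens      []Token
}

// stateFn is one lexer state; it returns the next state, or nil when done.
type stateFn func(*lexer) stateFn

// lexText emits everything up to the next left delimiter as one Text token,
// so large chunks of plain text are never tokenized character by character.
func lexText(l *lexer) stateFn {
	i := strings.Index(l.input[l.pos:], l.left)
	if i < 0 {
		if l.pos < len(l.input) {
			l.tokens = append(l.tokens, Token{Text, l.input[l.pos:]})
		}
		return nil
	}
	if i > 0 {
		l.tokens = append(l.tokens, Token{Text, l.input[l.pos : l.pos+i]})
	}
	l.pos += i + len(l.left)
	return lexAction
}

// lexAction emits everything up to the right delimiter as one Action token.
func lexAction(l *lexer) stateFn {
	i := strings.Index(l.input[l.pos:], l.right)
	if i < 0 {
		l.tokens = append(l.tokens, Token{Error, "unclosed action"})
		return nil
	}
	l.tokens = append(l.tokens, Token{Action, l.input[l.pos : l.pos+i]})
	l.pos += i + len(l.right)
	return lexText
}

// Lex runs the state machine with caller-supplied delimiters.
func Lex(input, left, right string) []Token {
	if left == "" || right == "" {
		return []Token{{Error, "empty delimiter"}}
	}
	l := &lexer{input: input, left: left, right: right}
	for state := stateFn(lexText); state != nil; {
		state = state(l)
	}
	return l.tokens
}

func main() {
	// The same lexer handles two different delimiter pairs chosen at run time.
	for _, t := range Lex("Hello <name>!", "<", ">") {
		fmt.Printf("%d %q\n", t.Kind, t.Value)
	}
	for _, t := range Lex("Hello $name$!", "$", "$") {
		fmt.Printf("%d %q\n", t.Kind, t.Value)
	}
}
```

The current state function is the only lexer state, which is also one way
to get mode-like behaviour (text mode versus action mode) without a
generated automaton. A real template lexer would tokenize the inside of
each action further; this sketch stops at the island boundary.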

> ... traditional lex-like system including lexical modes in v4.
>
> To deal with the island grammars and context-sensitive lexing issues, I'm supporting scannerless parsers in v4.
>
> The reason I wrote the StringTemplate lexer by hand was partially due to the island grammar issue, but primarily because the delimiters can be dynamically specified by the user. That tends to defeat any automata-based systems for lexing.
>
> Ter
>
> On Aug 30, 2011, at 7:40 PM, Gary Miller wrote:
>
>> Hey All,
>>
>> Slightly off-topic post, but I thought there might be some interest.
>>
>> Last night I went to a talk by Rob Pike of Google. You can watch the talk at
>>   https://www.youtube.com/watch?v=HxaD_trXwRE&feature=player_embedded
>>
>> Before I went, my thinking was that this could probably be knocked up
>> in ANTLR in a few minutes, but then ....
>> All the discomfort I have with ANTLR lexing came back to me.
>> So I thought I'd go to the source and have a look at the lexer for ST,
>> and lo and behold, ST's lexer is written by hand.
>> Now I'm feeling quite uncomfortable about ANTLR's lexing.
>>
>> I think it basically comes down to the stateless nature of ANTLR lexing.
>> This is not the first time context-sensitive scanning has been mentioned on
>> the list (*).
>> Yes, I know that it can be made stateful (*) and/or I can push more
>> onto the parser, but both of these have issues.
>> I generally find stateful ANTLR lexing code more confusing and harder
>> to write than functionally equivalent code in a target language.
>> Pushing more into the parser is inefficient in this particular case, as
>> there are large chunks of text that don't need to be tokenized, and
>> there is the issue that whitespace tokens might need to behave
>> differently in different places (hidden versus not).
>>
>>
>> Started off as an off-topic post, ended as a rant about lexing.
>> Regards
>> Gary
>> P.S. I've started on the ANTLR target for Go, still very immature.
>> https://github.com/millergarym/antlr/tree/,
>>
>> * Scott Stanchfield's context-sensitive scanning
>> http://javadude.com/articles/antlr-context-sensitive-scanner.html
>>
>> * a good example of this is Jim's numerical lexing for JavaFX
>> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs