[antlr-interest] Rob Pike on writing a lexer in Go for a template language

Terence Parr parrt at cs.usfca.edu
Wed Aug 31 09:24:36 PDT 2011


Hi Gary,

Thanks for the thoughts. I agree that ANTLR lexers have been annoying, which is why I've gone back to a traditional lex-like system, including lexical modes, in v4.

To deal with island grammars and context-sensitive lexing issues, I'm supporting scannerless parsers in v4.

The reason I wrote the StringTemplate lexer by hand was partially due to the island grammar issue, but primarily because the delimiters can be dynamically specified by the user. That tends to defeat any automata-based systems for lexing.
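
As a rough illustration of the dynamic-delimiter point, here is a minimal sketch in Go of a hand-written scanner that takes its delimiters as ordinary run-time arguments. This is not the actual StringTemplate lexer; Lex, Token, TokenKind, and the start/stop parameters are hypothetical names chosen for the example, and escapes and nested delimiters are ignored.

```go
package main

import (
	"fmt"
	"strings"
)

// TokenKind distinguishes raw text from content found between delimiters.
// (Hypothetical type for this sketch.)
type TokenKind int

const (
	Text TokenKind = iota // literal text outside the delimiters
	Expr                  // content between the start and stop delimiters
)

// Token is one lexed chunk of the template.
type Token struct {
	Kind  TokenKind
	Value string
}

// Lex splits input into Text and Expr tokens. The delimiters are plain
// strings supplied by the caller at run time, not patterns fixed when
// the lexer was generated. An unterminated expression is an error.
func Lex(input, start, stop string) ([]Token, error) {
	if start == "" || stop == "" {
		return nil, fmt.Errorf("delimiters must be non-empty")
	}
	var tokens []Token
	rest := input
	for len(rest) > 0 {
		i := strings.Index(rest, start)
		if i < 0 {
			// No more expressions: the remainder is plain text.
			tokens = append(tokens, Token{Kind: Text, Value: rest})
			break
		}
		if i > 0 {
			tokens = append(tokens, Token{Kind: Text, Value: rest[:i]})
		}
		rest = rest[i+len(start):]
		j := strings.Index(rest, stop)
		if j < 0 {
			return nil, fmt.Errorf("unterminated expression: missing %q", stop)
		}
		tokens = append(tokens, Token{Kind: Expr, Value: rest[:j]})
		rest = rest[j+len(stop):]
	}
	return tokens, nil
}

func main() {
	// The same scanner handles two different delimiter pairs.
	for _, d := range [][2]string{{"<", ">"}, {"$", "$"}} {
		src := "Hello " + d[0] + "name" + d[1] + "!"
		toks, err := Lex(src, d[0], d[1])
		fmt.Println(toks, err)
	}
}
```

Because the delimiters are just data here, switching from <...> to $...$ needs no regeneration step, whereas a lexer compiled from fixed token patterns has its delimiters baked in when it is generated.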

Ter

On Aug 30, 2011, at 7:40 PM, Gary Miller wrote:

> Hey All,
> 
> Slightly off-topic post; I thought there might be some interest.
> 
> Last night I went to a talk by Rob Pike of Google. You can watch the talk at
>   https://www.youtube.com/watch?v=HxaD_trXwRE&feature=player_embedded
> 
> Before I went, my thinking was that this could probably be knocked up
> in ANTLR in a few minutes, but then ...
> All the discomfort I have with ANTLR lexing came back to me.
> So I thought I'd go to the source and have a look at the lexer for ST,
> and lo and behold, ST's lexer is written by hand.
> Now I'm feeling quite uncomfortable about ANTLR's lexing.
> 
> I think it basically comes down to the stateless nature of ANTLR lexing.
> This is not the first time context-sensitive scanning has been mentioned on
> the list (*).
> Yes, I know that it can be made stateful (*) and/or I can push more
> onto the parser, but both of these have issues.
> I generally find stateful ANTLR lexing code more confusing and harder
> to write than functionally equivalent code in a target language.
> Pushing more into the parser in this particular case is inefficient, as
> there are large chunks of text that don't need to be tokenized, and
> there is the issue that whitespace tokens might need to behave
> differently in different places (hidden versus not).
> 
> 
> Started off as an off-topic post, ended as a rant about lexing. Regards,
> Gary
> P.S. I've started on the ANTLR target for Go, still very immature.
> https://github.com/millergarym/antlr/tree/,
> 
> * Scott Stanchfield's context-sensitive scanning
> http://javadude.com/articles/antlr-context-sensitive-scanner.html
> 
> * a good example of this is Jim's numerical lexing for JavaFX
> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs
> 


