[antlr-interest] Fun with ANTLR3: mystery of the huge lexer

Sat Jun 30 17:55:54 PDT 2007

On Saturday 30 June 2007 17:30, David Piepgrass wrote:
> ...
> > User operators eh, how do they specify precedence?
>
> With numbers. In fact, they can even have a different precedence on
> the left and right sides of a binary operator, as well as making
> ternary and other complex operators. I'm still developing the
> algorithm but I think it will work.
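
For anyone following along, here is one way numeric precedences that 
differ on the left and right sides of a binary operator could drive a 
parser. This is only a minimal Python sketch of a generic 
precedence-climbing loop, not David's algorithm (which he hasn't 
posted); the BINARY_OPS table, its numbers, and the function names are 
all made up for illustration.

```python
import re

# Hypothetical operator table: symbol -> (left power, right power).
# A right power above the left one gives left associativity;
# a right power below the left one gives right associativity.
BINARY_OPS = {
    "+": (10, 11),
    "-": (10, 11),
    "*": (20, 21),
    "^": (31, 30),
}

def tokenize(text):
    """Split the input into integer literals, operators and parentheses."""
    return re.findall(r"\d+|[-+*^()]", text)

def parse(tokens, min_power=0):
    """Parse tokens (consumed in place) into a nested (op, left, right) tuple."""
    tok = tokens.pop(0)
    if tok == "(":
        left = parse(tokens, 0)
        tokens.pop(0)  # consume the closing ")"
    else:
        left = int(tok)
    while tokens and tokens[0] in BINARY_OPS:
        left_power, right_power = BINARY_OPS[tokens[0]]
        if left_power < min_power:
            break  # the operator binds too loosely; let the caller take it
        op = tokens.pop(0)
        right = parse(tokens, right_power)
        left = (op, left, right)
    return left
```

With this table, parse(tokenize("1 - 2 - 3")) groups to the left and 
parse(tokenize("2 ^ 3 ^ 2")) groups to the right. A user-defined binary 
operator would amount to adding a row to the table at run time; ternary 
and other complex operators are not covered by this sketch.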

Two things:

1) I'm intrigued by user-defined or -extensible grammars. Over on the 
Groovy list we touched on dynamic / user-specified grammars recently in 
the context of adding rule- or logic-based programming capabilities to 
the language (and / or its runtime / libraries), and I'm somewhat 
familiar with Prolog's operator definition and user-specified grammar 
facilities.

If this isn't something proprietary, could you point me to information 
on the language you're dealing with?

2) I don't know why you want to avoid lexer fragments. The way I think 
about them (and I hope someone will correct me if I'm wrong about 
this), they're like macros: they don't stand alone, but are "expanded" 
in-line in the other lexer rules that reference them.

Seen this way, they're just a way to avoid replicating regular 
expressions. And, again if I'm not mistaken, any number of references 
to ANTLR lexer fragments are indistinguishable from simply cutting and 
pasting the right-hand-sides of those fragments in place of the 
references to them.

It seems like a real win. You don't introduce or eliminate any 
ambiguities and there is no effect on lexer complexity or efficiency. 
You just avoid replication of REs. Given the "write-only" nature of 
REs, that seems like a very good thing.
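
To illustrate the macro view, here is a small Python analogy. It is 
not ANTLR and makes no claim about the code ANTLR actually generates; 
the FRAGMENTS and TOKEN_RULES tables and the {NAME} placeholder syntax 
are invented for this sketch. Fragments are named sub-patterns that are 
substituted textually into the token rules that mention them, so the 
result is the same as if the sub-patterns had been pasted in by hand.

```python
import re

# Hypothetical "fragments": named sub-patterns that never match on their own.
FRAGMENTS = {
    "DIGIT": r"[0-9]",
    "LETTER": r"[A-Za-z_]",
}

# Hypothetical token rules that reference fragments as {NAME} placeholders.
TOKEN_RULES = {
    "INT": r"{DIGIT}+",
    "FLOAT": r"{DIGIT}+\.{DIGIT}+",
    "ID": r"{LETTER}({LETTER}|{DIGIT})*",
}

def expand(rule):
    """Replace each {NAME} reference with the text of that fragment."""
    return re.sub(r"\{(\w+)\}", lambda m: FRAGMENTS[m.group(1)], rule)

# Only token rules survive expansion; the fragments themselves disappear.
EXPANDED = {name: expand(rule) for name, rule in TOKEN_RULES.items()}
```

After expansion, EXPANDED["FLOAT"] is the same string you would get by 
writing the digit class out twice, which is the sense in which the 
references are indistinguishable from cutting and pasting.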

Randall Schulz