[antlr-interest] Why don't parsers support character ranges?

Peter Nann peter.nann at vecommerce.com.au
Wed Apr 23 18:06:42 PDT 2008


 
Randall R Schulz wrote:
> If you got that stuff in CS101,

OK, Maybe it was CS201... But you know what I mean.

> Show the ANTLR principals wrong by besting them at their own game. If 
> you drop the sour-puss act, they'll probably wish you well, even help 
> you, and certainly congratulate you if you succeed.

I think the problem was that my task was quite simple.
I think ANTLR makes hard things easier (has many cool features for
that), but in my case my simple task didn't turn out to be a simple
solution in ANTLR.

No tool is perfect, and (almost?) to tool can maintain a linear
relationship between problem complexity and solution complexity, I get
that.

So, let's just put it down to bad luck for my specific requirements.
ANTLR does look like an awesome tool for a very broad range of more
complex problems, and I'll leave it at that!


Thanks for your well put response(s).


-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Randall R Schulz
Sent: Thursday, 24 April 2008 10:56 AM
To: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Why don't parsers support character
ranges?

On Wednesday 23 April 2008 17:01, Peter Nann wrote:
> Hmmm, I was hoping for more than the 'efficiency' argument...
> I am wondering if that argument is about 10 years past its use-by 
> date...
> We are not in the days of single-digit-Megahertz and RAM measured in k

> anymore... when lexx and yacc were written...

Well, ANTLR goes well beyond lex and yacc. However, if you believe that
lexer / parser stratification is no longer justified, you could set out
to prove that thesis by writing a unified lexer / parser generator tool.
(That does everything current tools do!) Many good current parser
generators are open source (including ANTLR, of course), so you can
exploit the techniques they use and that you like and replace or improve
the ones you don't.

Personally, I'm not sure stratifying the lexical and syntactic analysis
is a bad thing. I've certainly never found it to be a problem, and I've
written my share of parsers, using lex & yacc (or flex and bison, I
guess), JavaCC, ANTLR 2.x and 3.x. The only thing I don't care for is
the use of alphabetic case to distinguish lexical from syntactical
rules.


> It would depend on the scale of parsing you need to do of course, but 
> for small-scale parsing I would question whether CPU and RAM matters 
> any more on that task...

You know, there's a reason we don't call them "little languages" any
more. They are never little and they never were little! And while it's
legitimate to make a considered choice about trading off, say, developer
time and execution time, it's not really OK to do something slowly when
you don't get something in turn for it.


> I will have to take your word about 'combinatorial explosion' for some

> problems, but I thought simple RDP's could pretty much break down to 
> one branch (as in: switch statement) per character (or token if you 
> tokenize it), which doesn't seem excessive, or combinatorial.

You may still want to produce a DFA, and that can in general yield and
exponential increase in the number of states. Not stratifying the
lexical and syntactic layers will exacerbate that problem (I think).

And I don't have any idea about the consequences of unifying lexical
analysis with syntax analysis in the face of arbitrary or variable
look-ahead or backtracking.

Lastly, I still think lexical states (as they exist in JavaCC, e.g.)
would be a good thing. It seems that would be harder to do when the
lexer is not separated from the parser.


>  - But, yes, that was just my CS101 project!

Interesting. If you got that stuff in CS101, you must have gotten one
hell of a CS education.


> ...
>
> Sorry to be a sour-puss, but I was quite excited about ANTLR at first
> look, but then got disappointed very quickly, so I'm a bit like a
> child who just broke his favourite toy...  ;-)

Show the ANTLR principals wrong by besting them at their own game. If 
you drop the sour-puss act, they'll probably wish you well, even help 
you, and certainly congratulate you if you succeed.


Randall Schulz


More information about the antlr-interest mailing list