[antlr-interest] Why don't parsers support character ranges?

Eamon Nerbonne eamon at nerbonne.org
Thu Apr 24 01:20:01 PDT 2008


The suggestion has been made that parsers and lexers need to be separated,
or that combining them would be a subject for a thesis.  This isn't new
work, however: scannerless parsers already exist, for example...

DParser - Scannerless GLR parser
With juicy bits like:
"[...]The grammar can be ambiguous, right or left recursive, have any number
of null productions, and because there is no seperate tokenizer, can include
whitespace in terminals and have terminals which are prefixes of other
terminals.[...]"
http://dparser.sourceforge.net/

GLR techniques can actually parse ambiguous constructs, which is a boon when
it comes to things like C's dangling else.  I'm not an expert, but IIRC
these parsers are all descendants of Tomita's GLR parser, which was (almost)
capable of parsing ambiguous constructs but contained a few errors (with
nullable items and hidden left recursion) and wasn't very efficient on
ambiguous grammars.  Newer algorithms have fixed both issues.
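
To make the idea concrete, here is a rough sketch of a scannerless parser
that returns every parse of an ambiguous grammar.  It is NOT GLR and has
nothing to do with how DParser is implemented: it is a naive
try-every-alternative recursive descent, so it is exponential in the worst
case and cannot handle left recursion.  The toy grammar and all the names
(lit, char_range, seq, alt, stmt, parse_all) are made up for illustration.
It only shows character ranges used directly in parser rules, a terminal
that is a prefix of another (the letter "i" versus the keyword "if"), and
the dangling else yielding two parse trees.

```python
from typing import Callable, List, Tuple

# A parser maps (text, position) to ALL possible (tree, end_position) results.
Result = List[Tuple[object, int]]
Parser = Callable[[str, int], Result]


def lit(s: str) -> Parser:
    """Match a literal string, whitespace included (there is no tokenizer)."""
    def p(text: str, pos: int) -> Result:
        return [(s, pos + len(s))] if text.startswith(s, pos) else []
    return p


def char_range(lo: str, hi: str) -> Parser:
    """Match one character in the inclusive range lo..hi."""
    def p(text: str, pos: int) -> Result:
        if pos < len(text) and lo <= text[pos] <= hi:
            return [(text[pos], pos + 1)]
        return []
    return p


def seq(*parsers: Parser) -> Parser:
    """Match the parsers one after another, keeping every combination."""
    def p(text: str, pos: int) -> Result:
        partial = [([], pos)]
        for parser in parsers:
            partial = [(trees + [tree], end)
                       for trees, mid in partial
                       for tree, end in parser(text, mid)]
        return [(tuple(trees), end) for trees, end in partial]
    return p


def alt(*parsers: Parser) -> Parser:
    """Try every alternative and keep all results (no committed choice)."""
    def p(text: str, pos: int) -> Result:
        return [r for parser in parsers for r in parser(text, pos)]
    return p


cond = char_range("0", "9")    # hypothetical: a condition is one digit
simple = char_range("a", "z")  # hypothetical: a simple statement is one letter


def stmt(text: str, pos: int) -> Result:
    # stmt : "if " cond " then " stmt " else " stmt
    #      | "if " cond " then " stmt
    #      | [a-z]
    return alt(
        seq(lit("if "), cond, lit(" then "), stmt, lit(" else "), stmt),
        seq(lit("if "), cond, lit(" then "), stmt),
        simple,
    )(text, pos)


def parse_all(text: str) -> List[object]:
    """Return every parse tree that consumes the whole input."""
    return [tree for tree, end in stmt(text, 0) if end == len(text)]


if __name__ == "__main__":
    for tree in parse_all("if 1 then if 2 then a else b"):
        print(tree)
```

The nested-if input produces two trees, one attaching the else to the inner
if and one to the outer if.  A real GLR parser gets the same set of parses
without the exponential blowup by sharing work between alternatives.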

I believe scannerless parsing is also possible with the somewhat better-known
Elkhound, and a bit of web-searching also turned up the Meta-Environment
http://www.cwi.nl/htbin/sen1/twiki/bin/view/Meta-Environment, and there's
lots more out there.