[antlr-interest] Another parsing question
Foust
javafoust at gmail.com
Wed Aug 6 14:17:36 PDT 2008
Could you implement FOLLOW using a hybrid approach?
(BitSets for non-UNICODE ranges + algorithmic tests for data
outside that range? Many DSLs are still non-UNICODE, other
than strings, which use ASCII delimiters.)
B
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Gavin Lambert
> Sent: Wednesday, August 06, 2008 12:59 PM
> To: Loring Craymer; Terence Parr; antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Another parsing question
>
> At 02:10 7/08/2008, Loring Craymer wrote:
> >I think that much of this discussion would be moot if ANTLR 3
> >lexers had the capabilities of ANTLR 2 lexers; unfortunately,
> >that requires an efficient way of doing FOLLOW sets for unicode
> >ranges--and no solution has yet presented itself for that.
>
> Can't you just use an algorithmic test (similar to how sempreds
> work)? Obviously the standard table/bitset-based solution won't
> work for Unicode (at least not without generating very large
> bitsets [and by "very large" I mean that to represent the full
> UTF-32 range would require a bitset taking 512MB... and that's
> just a single follow set]) Whereas expressing the same thing in
> code should be much more compact, since there are likely to be
> large contiguous ranges. And it'd have the added bonus of being
> more readable, too.
>
> (Of course, ANTLR might still need to hold that 512MB bitset in
> memory while compiling the grammar, depending on how it works the
> set out -- and possibly more than one. But this should be less of
> a burden than trying to do it at runtime for every rule.)
More information about the antlr-interest
mailing list