[antlr-interest] Another parsing question
Gavin Lambert
antlr at mirality.co.nz
Wed Aug 6 12:58:42 PDT 2008
At 02:10 7/08/2008, Loring Craymer wrote:
>I think that much of this discussion would be moot if ANTLR 3
>lexers had the capabilities of ANTLR 2 lexers; unfortunately,
>that requires an efficient way of doing FOLLOW sets for unicode
>ranges--and no solution has yet presented itself for that.
Can't you just use an algorithmic test (similar to how sempreds
work)? Obviously the standard table/bitset-based solution won't
work for Unicode, at least not without generating very large
bitsets. (By "very large" I mean that representing the full 32-bit
UTF-32 range, at one bit per value, would require a bitset taking
512MB, and that's just a single follow set.) Expressing the same
thing in code should be much more compact, since there are likely
to be large contiguous ranges. And it'd have the added bonus of
being more readable, too.
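To make that concrete, here is a rough sketch in Java of what I mean. It is only an illustration: the class name, the method names and the particular ranges are all made up for this example, and none of it is what ANTLR actually generates. It shows a follow-set membership test written as a handful of range comparisons, plus the arithmetic behind the 512MB figure.

```java
// Hypothetical sketch only: names and ranges are invented for illustration
// and do not correspond to anything ANTLR generates.
final class FollowSetSketch {

    // Membership test expressed as range comparisons instead of a bitset lookup.
    // The ranges are arbitrary examples of large contiguous blocks.
    static boolean inFollow(int c) {
        return (c >= 0x0041 && c <= 0x005A)      // A-Z
            || (c >= 0x0061 && c <= 0x007A)      // a-z
            || (c >= 0x4E00 && c <= 0x9FFF)      // a large CJK block
            || (c >= 0x10000 && c <= 0x10FFFF);  // everything above the BMP
    }

    // Bytes needed for a bitset with one bit per possible value.
    static long bitsetBytes(long valueCount) {
        return valueCount / 8;
    }

    public static void main(String[] args) {
        System.out.println("'a' in follow set:     " + inFollow('a'));
        System.out.println("'0' in follow set:     " + inFollow('0'));
        System.out.println("U+4E2D in follow set:  " + inFollow(0x4E2D));

        // One bit for each of the 2^32 possible 32-bit values.
        long bytes = bitsetBytes(1L << 32);
        System.out.println("Full 32-bit bitset: " + (bytes >> 20) + " MB");
    }
}
```

Four comparisons cover over a million code points here, which is the sort of compactness I'd expect whenever the set is made of a few large ranges. A set with many small scattered ranges would obviously not come out as nicely, and might be better served by a sorted range table with a binary search.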
(Of course, ANTLR might still need to hold that 512MB bitset in
memory while compiling the grammar, depending on how it computes
the set -- and possibly more than one of them. But this should be
less of a burden than trying to do it at runtime for every rule.)