[antlr-interest] How to set imaginary token text?
Randall R Schulz
rschulz at sonic.net
Tue Jul 17 06:08:26 PDT 2007
On Monday 16 July 2007 22:13, Vaclav Barta wrote:
> On Monday 16 July 2007 21:20, Randall R Schulz wrote:
> > On Monday 16 July 2007 12:04, Vaclav Barta wrote:
> > > Experimenting some more, maybe I'd like to parse (some of) these
> > > characters individually but consolidate them into one AST node -
> > > something like
> >
> > Let me clarify that it is at the lexical level that a
> > token-per-character approach incurs potentially excessive overhead.
> > For example, a whitespace rule that matched single white-space
> > characters vs. one that collected them together could make a large
> > difference in
>
> Well, I'm not tokenizing whitespace characters individually. String
> characters may well run into thousands, but what's a few thousand
> objects between friends?
You had a non-fragment lexer rule whose right-hand side was a single dot
(the any-character wildcard). Such a rule does indeed create one token
for each character it matches. That was what prompted my original
statement.
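As a rough sketch of the difference, in ANTLR 3 syntax (the rule names
below are made up for illustration and are not taken from your grammar;
the two variants are alternatives, not meant to sit in one lexer):

```
// Variant 1: non-fragment rule. The lexer emits one token
// for every single character this rule matches.
ANY_CHAR : . ;

// Variant 2: the per-character rule is a fragment, so it can only be
// called from other lexer rules and never becomes a token by itself.
fragment STRING_CHAR : ~('"' | '\\') ;

// The enclosing rule yields one token for the whole string,
// however many characters it contains.
STRING : '"' STRING_CHAR* '"' ;
```

With the second variant, a string of a few thousand characters still
costs only one token object.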
Randall Schulz