[antlr-interest] How to set imaginary token text?

Randall R Schulz rschulz at sonic.net
Tue Jul 17 06:08:26 PDT 2007


On Monday 16 July 2007 22:13, Vaclav Barta wrote:
> On Monday 16 July 2007 21:20, Randall R Schulz wrote:
> > On Monday 16 July 2007 12:04, Vaclav Barta wrote:
> > > Experimenting some more, maybe I'd like to parse (some of) these
> > > characters individually but consolidate them into one AST node -
> > > something like
> >
> > Let me clarify that it is at the lexical level that a
> > token-per-character approach incurs potentially excessive overhead.
> > For example, a whitespace rule that matched single white-space
> > characters vs. one that collected them together could make a large
> > difference in
>
> Well, I'm not tokenizing whitespace characters individually. String
> characters may well run into thousands, but what's a few thousand
> objects between friends?

You had a non-fragment lexer rule whose right-hand side was a single dot
(the any-character wildcard). Such a rule does indeed emit a separate
token for each character it matches. That was what prompted my original
statement.
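
To make the difference concrete, here is a rough sketch in Python rather than
ANTLR. It is only an analogy, not how ANTLR's generated lexers work: the
`lex` helper, the rule names and the regexes are all made up for
illustration. It counts the tokens produced by a lone-wildcard rule and
by rules that collect a run of characters into one token.

```python
import re


def lex(text, rules):
    """Toy first-match lexer. `rules` is an ordered list of (name, regex).

    Hypothetical helper for illustration only; it does not model ANTLR.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            if m and m.end() > pos:
                tokens.append((name, m.group()))
                pos = m.end()
                break
        else:
            raise ValueError(f"no rule matches at offset {pos}")
    return tokens


# Analogue of a non-fragment rule whose body is a lone wildcard:
# every character becomes its own token object.
PER_CHAR = [("ANY", r"(?s).")]

# Analogue of rules that gather a whole run into a single token,
# with the wildcard kept only as a fallback.
COLLECTED = [("STRING", r'"[^"]*"'), ("WS", r"\s+"), ("ANY", r"(?s).")]

# A quoted string of 1000 characters followed by three spaces.
sample = '"' + "x" * 1000 + '"   '

per_char_tokens = lex(sample, PER_CHAR)
collected_tokens = lex(sample, COLLECTED)

print(len(per_char_tokens), "tokens with the wildcard-only rule")
print(len(collected_tokens), "tokens with the collecting rules")
```

The first count grows with the length of the input, while the second
depends only on how many strings and whitespace runs the input contains.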


Randall Schulz

