[antlr-interest] Recovering white space in V3.0
Terence Parr
parrt at cs.usfca.edu
Sat Jun 4 17:07:01 PDT 2005
On Jun 4, 2005, at 4:12 PM, Bryan Ewbank wrote:
> Ter,
>
> Can you define "common" and "extreme" in this context?
Sure. Common: buffer up all tokens (note that PCCTS did this for
syntactic predicates back in the early '90s), which makes it easy to
tweak the input stream and spit it back out mostly verbatim. Extreme:
parsing something bigger than the 2G of RAM I have in my box ;)
Some of the stuff is more heavyweight than you'd want in a really
speed-critical app. For example, my common tokens store the token
index because it's damn useful. They also track indexes into the
char buffer (start/stop of the token string) rather than building
strings, which requires that the chars be buffered too. The tokens store the
char position in the line (column) as well as the line. All this
takes memory to store and time in the lexer to set.
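
[Editor's sketch, not code from ANTLR itself.] To make the cost
described above concrete, here is a minimal Java illustration of a
token that records its token index, start/stop offsets into a shared
char buffer, and line/column, building its text only on demand. The
names SketchToken and SketchLexer and the WORD/WS token types are
hypothetical, and the lexer only splits words from whitespace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical token: keeps offsets into a shared buffer instead of a String.
final class SketchToken {
    final int type;        // token type (WORD or WS in this sketch)
    final int tokenIndex;  // position of this token in the token buffer
    final int start;       // index of the first char in the buffer
    final int stop;        // index of the last char, inclusive
    final int line;        // 1-based line where the token starts
    final int column;      // 0-based char position within that line
    private final char[] input; // shared buffer; must stay alive with the token

    SketchToken(int type, int tokenIndex, int start, int stop,
                int line, int column, char[] input) {
        this.type = type;
        this.tokenIndex = tokenIndex;
        this.start = start;
        this.stop = stop;
        this.line = line;
        this.column = column;
        this.input = input;
    }

    // The string is built only when someone asks for it.
    String getText() {
        return new String(input, start, stop - start + 1);
    }
}

// Hypothetical lexer: alternating runs of whitespace and non-whitespace.
final class SketchLexer {
    static final int WORD = 1;
    static final int WS = 2;

    static List<SketchToken> tokenize(char[] input) {
        List<SketchToken> tokens = new ArrayList<>();
        int line = 1, col = 0, i = 0;
        while (i < input.length) {
            int start = i, startLine = line, startCol = col;
            boolean ws = Character.isWhitespace(input[i]);
            while (i < input.length && Character.isWhitespace(input[i]) == ws) {
                // Line/column bookkeeping is part of the per-char lexer cost.
                if (input[i] == '\n') { line++; col = 0; } else { col++; }
                i++;
            }
            tokens.add(new SketchToken(ws ? WS : WORD, tokens.size(),
                                       start, i - 1, startLine, startCol, input));
        }
        return tokens;
    }
}
```

Because whitespace tokens are kept in the buffer alongside everything
else, concatenating getText() over all tokens gives back the input
verbatim in this sketch.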
I experimented with returning the same exact token object for all
whitespace and comments just to see if it saved much in speed.
I didn't notice much difference, but it's hard to measure, as you
know. Point is,
you can do anything you want. I'm just making it really easy to whip
together some cool translators. If you need to handle extremely
large files or need extreme speed, you can do it--you just have to do
a wee bit of work for it.
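
[Editor's sketch, not code from ANTLR itself.] The "same exact token
object for all whitespace" experiment could look roughly like the
following in Java. Tok, SharedWsLexer, and SHARED_WS are hypothetical
names, and comments are left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal token for this sketch.
final class Tok {
    final int type;
    final String text; // null for the shared whitespace token

    Tok(int type, String text) {
        this.type = type;
        this.text = text;
    }
}

final class SharedWsLexer {
    static final int WORD = 1;
    static final int WS = 2;

    // One immutable instance stands in for every run of whitespace,
    // so no allocation happens for those tokens.
    static final Tok SHARED_WS = new Tok(WS, null);

    static List<Tok> tokenize(String s) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            int start = i;
            boolean ws = Character.isWhitespace(s.charAt(i));
            while (i < s.length() && Character.isWhitespace(s.charAt(i)) == ws) {
                i++;
            }
            out.add(ws ? SHARED_WS : new Tok(WORD, s.substring(start, i)));
        }
        return out;
    }
}
```

The catch in this sketch is that the shared token carries no
per-occurrence text or position, so the original whitespace cannot be
reproduced from the token list alone.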
For example, you can copy the Java.stg template file and tweak it for
speed (very easily done) and then just keep that around forever so
you can use it. Say language=MyJava in the grammar options and boom--
it uses your faster code generator :)
Does that help? More details?
Ter
>
> On 6/4/05, Terence Parr <parrt at cs.usfca.edu> wrote:
>
>> I am building stuff in general to work for the common
>> case not the extremes, leaving the ability to handle
>> the extremes.
>