[antlr-interest] (follow up) setting, altering text in lexer rules

Loring Craymer craymer at warpiv.com
Mon Jun 12 14:11:17 PDT 2006


 

Ter--

 

I wouldn't go back to not being able to edit (via !) in the lexer--that
would be a step back from ANTLR 2.  Think back to the early examples that
showed how powerful the LL(k) lexers were, especially with the editing
support.

 

What can probably be done is to make the support machinery conditional--if a
lexer rule has a ! in it, then do the StringBuffer thing (or keep character
arrays for altered tokens), else just track the ends of the token.
StringTemplate can handle this, and it may not even be that messy.
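
To make the idea concrete, here is a minimal hand-written sketch in Java of
what the generated code might amount to. It is not ANTLR output or ANTLR API:
the names ConditionalTextSketch, Tok, and lex are hypothetical, and I am
assuming the code generator knows at generation time which rules contain a !.
Only the "editing" rule (a quoted string) allocates a buffer; the other rule
just records where the token starts and stops.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not ANTLR-generated code.
class ConditionalTextSketch {

    /** Hypothetical token: offsets into the input plus optional edited text. */
    static final class Tok {
        final char[] input;
        final int start;      // inclusive offset into input
        final int stop;       // exclusive offset into input
        final String edited;  // null unless the rule altered its text

        Tok(char[] input, int start, int stop, String edited) {
            this.input = input;
            this.start = start;
            this.stop = stop;
            this.edited = edited;
        }

        /** Builds the String lazily for unedited tokens. */
        String getText() {
            return edited != null ? edited : new String(input, start, stop - start);
        }
    }

    /** Lexes whitespace-separated words and double-quoted strings. */
    static List<Tok> lex(String src) {
        char[] in = src.toCharArray();
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < in.length) {
            if (Character.isWhitespace(in[i])) { i++; continue; }
            int start = i;
            if (in[i] == '"') {
                // Editing rule (both quotes dropped, as with '"'! in ANTLR 2):
                // this is the only path that pays for a buffer.
                StringBuilder buf = new StringBuilder();
                i++; // skip opening quote
                while (i < in.length && in[i] != '"') buf.append(in[i++]);
                if (i < in.length) i++; // skip closing quote
                out.add(new Tok(in, start, i, buf.toString()));
            } else {
                // Non-editing rule: just track the ends of the token.
                while (i < in.length && !Character.isWhitespace(in[i]) && in[i] != '"') i++;
                out.add(new Tok(in, start, i, null));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (Tok t : lex("abc \"hi\" x")) {
            System.out.println(t.start + ".." + t.stop + " -> " + t.getText());
        }
    }
}
```

The point of the sketch is only that the two paths can coexist: rules without
a ! never touch a buffer, so they should not pay for the editing support.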

 

It should also be remembered that "getText()" in the parser will require
String construction.  For an application that does a lot of text processing,
editing in the lexer is not additional overhead.  For typical applications,
it is added overhead.  However, the typical lexer editing is to remove
quotes from STRINGs; that is usually an optimization from the application's
standpoint.
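
A rough illustration of the quote case, again as a hypothetical Java sketch
(QuoteStripSketch and its methods are made-up names, not ANTLR API). If the
lexer leaves the quotes on, the application builds the token text and then
builds a second String without the quotes. If the lexer drops them, the
token's text can be built once from a narrower range of the input.

```java
// Hypothetical sketch, not ANTLR API.
class QuoteStripSketch {

    /** Token text as lexed, quotes included: in[start..stop). */
    static String textWithQuotes(char[] in, int start, int stop) {
        return new String(in, start, stop - start);
    }

    /** Token text with the surrounding quotes dropped in the lexer. */
    static String textStripped(char[] in, int start, int stop) {
        // Assumes the token is at least two characters: opening and closing quote.
        return new String(in, start + 1, stop - start - 2);
    }

    /** What the application ends up doing if the lexer keeps the quotes. */
    static String stripInApplication(String tokenText) {
        return tokenText.substring(1, tokenText.length() - 1);
    }

    public static void main(String[] args) {
        char[] in = "say \"hi\" now".toCharArray();
        // The quoted string occupies offsets 4 (inclusive) to 8 (exclusive).
        String viaApplication = stripInApplication(textWithQuotes(in, 4, 8));
        String viaLexer = textStripped(in, 4, 8);
        System.out.println(viaApplication.equals(viaLexer));
    }
}
```

Both routes give the same text; the second just skips the intermediate String.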

 

--Loring

 

> -----Original Message-----

> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-

> bounces at antlr.org] On Behalf Of Terence Parr

> Sent: Monday, June 12, 2006 12:31 PM

> To: ANTLR Interest

> Subject: Re: [antlr-interest] (follow up) setting, altering text in lexer

> rules

> 

> ok, so lexing time went from 1340ms to 2095ms when I added all this

> crap in there.  That is more than 1.5x the cost in time.  I can't

> justify that 50% increase in lexer time.  Also the parser is taking

> more time...weird...a GC issue?

> 

> Wow. the

> 

> text.setLength(0);

> 

> in the nextToken() method costs 200ms out of that 2095ms.  When I

> remove all this machinery it goes back to what it was in my notes

> time-wise so it's not a fluke.  Rats!

> 

> Ok, I propose that we take a big step back and say "you can set the

> text for the token manually".  You get a setText() method and the

> auto mechanism will see your altered text if nonnull.  If you want to

> build up a token piecemeal you must do so manually.  So you'd do this:

> 

> ESC : '\\' 'n' {setText("\n");} ;

> 

> I still need to spend time inc/dec the rule level though so I know

> when to emit a token.  It seems to cost a wee bit but that is ok I

> guess.

> 

> Ter

 
