[antlr-interest] v2->v3 Skip chars in Lexer? For C-target [SOLVED 2.5]
Ruslan Zasukhin
ruslan_zasukhin at valentina-db.com
Sun Apr 17 05:37:47 PDT 2011
Hi All,
After Jim pointed out a more efficient way to skip the wrapper quotes,
and after some more time spent on it, here is a working solution for the archive:
//--------------------------------------------------------------------
IDENT
: ( LETTER | '_' ) ( LETTER | '_' | DIGIT )*
;
// RZ 04/17/11: in ANTLR v3 there is no way to skip chars in the lexer. Oops.
// Instead we use the trick suggested by Jim Idle on the ANTLR list:
// skip the first/last chars of the token at the parser level.
//
DELIMITED // delimited_identifier
:
( DQUOTE ( ~(DQUOTE) | DQUOTE DQUOTE )+ DQUOTE
| BQUOTE ( ~(BQUOTE) | BQUOTE BQUOTE )+ BQUOTE
| LBRACK ( ~(']') )+ RBRACK
)
;
At the parser level, we take the Token and apply ++ / -- to its start/stop pointers.
The type of the Token is also changed to IDENT with the help of a rewrite rule.
//--------------------------------------------------------------------
identifier
: IDENT // regular_identifier
| d=DELIMITED // delimited_identifier
{
++$d->start;
--$d->stop;
}
-> ^( IDENT[$d.text->chars] )
;
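
For readers of the archive, the pointer trick used in the action above can be sketched in plain C, outside of ANTLR. This is only an illustration: span_t and trim_wrappers are hypothetical names, not part of the ANTLR3 C runtime, and the sketch assumes that start and stop point at the first and last character of the token text, inclusive on both ends.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a lexer token (NOT the ANTLR3 token struct):
 * 'start' and 'stop' point at the first and last character of the token
 * text, inclusive on both ends. */
typedef struct {
    const char *start;
    const char *stop;
} span_t;

/* Drop one wrapper character from each end, in place, without copying.
 * Returns the length of the remaining text. */
static size_t trim_wrappers(span_t *s)
{
    /* A delimited identifier is at least: opener, one char, closer. */
    assert(s->stop - s->start >= 2);
    ++s->start;   /* step over the opening quote or bracket */
    --s->stop;    /* step back over the closing quote or bracket */
    return (size_t)(s->stop - s->start) + 1;
}
```

Moving each pointer by one assumes that the wrapper character occupies exactly one code unit of the input stream. That holds for '"', '`', '[' and ']' in single-byte encodings and in UTF-8, but it would need checking for wider encodings.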
================
Works... But ...
I am far from sure that this solution is really more efficient, Jim.
Yes, at the lexer level I had to use ->chars, and you say that is slower ...
But at the parser level, apart from the fast ++ / -- operations, we still need to create
a second IDENT token and copy all the values from the first one ...
sizeof(ANTLR3_COMMON_TOKEN_struct) is about 160-200 bytes.
So allocating a new token and copying about 150 bytes just to skip TWO chars
does not look like such a cheap operation. Also note that IDENTs are usually only 5-20 chars long,
much less than the 200 bytes of that structure.
So maybe my first solution at the lexer level was not so bad?
And I still have a TODO: skip chars inside a LITERAL at the parser level ...
there we cannot just do ++ / --
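
To illustrate why ++ / -- is not enough there: if the characters to skip are the doubled quotes inside the token body (an assumption, matching the DQUOTE DQUOTE alternative in the DELIMITED rule), the text has to be compacted into a buffer, because the unwanted characters sit in the middle rather than at the ends. A minimal sketch in plain C follows; undouble_quotes is a hypothetical helper, not an ANTLR3 runtime function.

```c
#include <stddef.h>

/* Copy the body of a quoted token into 'out', collapsing each doubled
 * quote character to a single one. 'body' is the text between the wrapper
 * quotes and 'len' is its length; 'out' must hold at least len + 1 bytes.
 * Returns the number of bytes written, excluding the terminator. */
static size_t undouble_quotes(const char *body, size_t len, char quote, char *out)
{
    size_t i;
    size_t n = 0;

    for (i = 0; i < len; ++i) {
        out[n++] = body[i];
        /* If this is the first quote of a doubled pair, skip the second. */
        if (body[i] == quote && i + 1 < len && body[i + 1] == quote)
            ++i;
    }
    out[n] = '\0';
    return n;
}
```

The result is never longer than the input, so a buffer of len + 1 bytes is always enough. The cost is one pass over the token text plus the copy.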
================
I do not yet see the whole picture of how the lexer works at the low level in C.
I also have not yet found any clear information about UTF encodings in the C target.
I am going to ask about this in future letters.
--
Best regards,
Ruslan Zasukhin
VP Engineering and New Technology
Paradigma Software, Inc
Valentina - Joining Worlds of Information
http://www.paradigmasoft.com
[I feel the need: the need for speed]