[antlr-interest] pipelined lexing

Rolf Schumacher mailinglist at august.de
Sun Apr 18 19:17:28 PDT 2004


I would like to scan strings like:

aabaacaaaac

In such a string I would like to find, in this order:
1. all ac's
2. all aa's, once the ac's have been consumed
3. single a's and b's, once all two-character strings have been consumed

In tokens: AA  B  A  AC  AA  A  AC
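
To make the intended semantics concrete, here is a minimal sketch of the three stages in plain Java, independent of ANTLR. The class name PipelinedLexer and its methods are made up for illustration, and the sketch assumes the input contains only the strings listed above plus blanks and newlines.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only: the three stages applied one after another. */
final class PipelinedLexer {

    /** Claims every still-unclaimed occurrence of a two-character pattern, left to right. */
    private static void claimPairs(String in, String[] owner, String pattern, String type) {
        for (int i = 0; i + 1 < in.length(); i++) {
            if (owner[i] == null && owner[i + 1] == null && in.startsWith(pattern, i)) {
                owner[i] = type;    // the first character carries the token type
                owner[i + 1] = "";  // the second character is marked as consumed
                i++;                // do not start a new match inside this one
            }
        }
    }

    /** Returns the token type names in input order. */
    static List<String> tokenize(String in) {
        String[] owner = new String[in.length()];
        claimPairs(in, owner, "ac", "AC");  // stage 1: all ac's
        claimPairs(in, owner, "aa", "AA");  // stage 2: aa's among what is left

        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < in.length(); i++) {
            if (owner[i] != null) {
                if (!owner[i].isEmpty()) {
                    tokens.add(owner[i]);   // start of a two-character token
                }
                continue;
            }
            char c = in.charAt(i);          // stage 3: leftover single characters
            if (c == 'a') {
                tokens.add("A");
            } else if (c == 'b') {
                tokens.add("B");
            } else if (c != ' ' && c != '\n') {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("aabaacaaaac"));
    }
}
```

Each stage only looks at characters that the earlier stages left unclaimed, which is the ordering described in the list above. The cost is several passes over the input instead of one.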

My solution works, but it looks ugly to me (close to unmaintainable).
Is there a better idea?

Rolf

-------------solution:

class TParser extends Parser;
all: ( AC | AA | A | B )* ;

class TLexer extends Lexer;
options { k = 3; }

WS: ( ' ' | '\n' {newline();}) {_ttype = Token.SKIP;};
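// "ac" always wins.
AC: "ac";
// "aa" only if its second 'a' does not start an "ac".
AA: { LA(3)!='c' }? "aa";
// A single 'a': either no pair starts here, or the next 'a' belongs to an "ac".
A:  { ( LA(2)!='a' && LA(2)!='c' ) || ( LA(2)=='a' && LA(3)=='c' ) }? 'a';
B: 'b';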


--------------

P.S.
I briefly looked into TokenStreams.
Might they be an option when maintainability matters more than performance?
Or, to put it another way:
can I define a grammar that produces such filters?
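
What I have in mind is roughly the following, sketched in plain Java so that it stands alone. SimpleTokenStream is a hypothetical stand-in for ANTLR's antlr.TokenStream (whose nextToken() returns Token objects and can throw TokenStreamException), and FirstStage, PairFilter and TokenPipeline are made-up names. The first stage knows only AC, A and B, and a separate filter then merges two adjacent A tokens into an AA.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a token source; returns null at end of input. */
interface SimpleTokenStream {
    String nextToken();
}

/** Stage 1: emits AC for "ac", otherwise one token (A or B) per character. */
final class FirstStage implements SimpleTokenStream {
    private final String in;
    private int pos = 0;

    FirstStage(String in) { this.in = in; }

    public String nextToken() {
        if (pos >= in.length()) return null;
        if (in.startsWith("ac", pos)) { pos += 2; return "AC"; }
        char c = in.charAt(pos++);
        if (c == 'a') return "A";
        if (c == 'b') return "B";
        throw new IllegalArgumentException("unexpected character: " + c);
    }
}

/** Stage 2: a filter that merges two adjacent A tokens into one AA token. */
final class PairFilter implements SimpleTokenStream {
    private final SimpleTokenStream source;
    private String pending;             // token read ahead but not yet returned
    private boolean hasPending = false;

    PairFilter(SimpleTokenStream source) { this.source = source; }

    public String nextToken() {
        String t = hasPending ? pending : source.nextToken();
        hasPending = false;
        if (!"A".equals(t)) return t;   // only A tokens can be merged
        String next = source.nextToken();
        if ("A".equals(next)) return "AA";
        pending = next;                 // keep the lookahead for the next call
        hasPending = true;
        return "A";
    }
}

final class TokenPipeline {
    /** Reads a stream to its end and collects the token names. */
    static List<String> drain(SimpleTokenStream s) {
        List<String> out = new ArrayList<>();
        for (String t = s.nextToken(); t != null; t = s.nextToken()) out.add(t);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(new PairFilter(new FirstStage("aabaacaaaac"))));
    }
}
```

Each stage stays trivial this way, and no lookahead predicates are needed. Whether such a filter can itself be generated from a grammar rather than written by hand is exactly my question.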