[antlr-interest] dfa-based lexers versus top-down antlr lexers

Pete Forman pete.forman at westerngeco.com
Thu May 1 01:55:50 PDT 2003


At 2003-04-29 09:48 -0700, Terence Parr wrote:
>Concerning the interface between parser / lexer, yes: it's simple:
>
>interface TokenStream {
>         public Token nextToken();
>}

Would it be worth developing a lexer with a second method?

     public void unget(int numTokens);

This would allow lexer switching from within the parser.  The first
lexer could push back the input it has consumed allowing the second
lexer to make a clean start.

The semantics of the method are closest to unget() in C++ istream.
There is little point in doing putback(Token) as in C++ istream or
unread(Token) as in Java PushbackReader as the Token will not in
itself contain enough information to reconstruct the input to the
lexer.

Implementation would probably require extra buffering at the input
to the lexer to support backtracking.  Most of the lexers that I've
deployed are working on string inputs which would suit this proposal.
Rather more work would be needed for non random access file inputs.
The amount of buffering probably ought to be limited explicitly with
unget() throwing an exception if too much backtracking is attempted.

Another approach to implementation would be to make Token a bit
heavier.  It might carry the array of characters that formed it,
including those of adjacent skipped tokens.  Then unread(Token) would
contain enough info to back up the input to the lexer using putback()
or unread().  An intermediate approach would have a buffer in the
lexer or its input and store a pair of indexes in the token.

One benefit mentioned is allowing lexer switching in the parser.
Another might be to allow switching to a different lexer when a parse
error is detected to allow more powerful recovery from that error.

-- 
Pete Forman                -./\.-  Disclaimer: This post is originated
WesternGeco                  -./\.-   by myself and does not represent
pete.forman at westerngeco.com    -./\.-   opinion of Schlumberger, Baker
http://petef.port5.com           -./\.-   Hughes or their divisions.


 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 




More information about the antlr-interest mailing list