[antlr-interest] Lexer too quick to grab a token?
Bart Kiers
bkiers at gmail.com
Mon May 2 05:50:11 PDT 2011
On Mon, May 2, 2011 at 1:19 AM, Todd O'Bryan <toddobryan at gmail.com> wrote:
> ...
>
>
> Does this make any sense? Is there some way to deal with it?
> ...
You could let the 'R_TAG' rule match '/]]' as well, and have that alternative
emit two tokens ('/' and ']]') instead of one, as per the instructions here:
http://www.antlr.org/wiki/pages/viewpage.action?pageId=3604497
A demo:
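lexer grammar TLexer;

@members {
  // Queue of tokens produced since the last call to nextToken().
  List<Token> tokens = new ArrayList<Token>();

  // Convenience method: create a token with the given text and type and queue it.
  private void emit(String text, int type) {
    emit(new CommonToken(type, text));
  }

  // Queue every emitted token instead of keeping only the last one.
  @Override
  public void emit(Token token) {
    state.token = token;
    tokens.add(token);
  }

  // Hand out queued tokens one at a time; an empty queue means end of input.
  @Override
  public Token nextToken() {
    super.nextToken();
    if(tokens.size() == 0) {
      return Token.EOF_TOKEN;
    }
    return tokens.remove(0);
  }
}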

L_TAG
  : '[/'
  ;

R_TAG
  : '/]]' {emit("/", ANY); emit("]]", R_BRACKET);}
  | '/]'
  ;

L_BRACKET
  : '[['
  ;

R_BRACKET
  : ']]'
  ;

SPACE
  : (' ' | '\t' | '\r' | '\n') {skip();}
  ;

ANY
  : .
  ;

which can be tested with the class:
import org.antlr.runtime.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String source = "[/ foo /] [[/ bar /]]";
    ANTLRStringStream in = new ANTLRStringStream(source);
    TLexer lexer = new TLexer(in);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    // Note: with ANTLR 3.3 and later the stream is buffered lazily, so you may
    // need to call tokens.fill() here before getTokens() returns anything.
    for(Object o : tokens.getTokens()) {
      Token t = (Token)o;
      System.out.println("text=" + t.getText() + ", type=" + t.getType());
    }
  }
}
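
If it helps to see the queueing idea on its own, here is a rough plain-Java
sketch that does not use ANTLR at all. Everything in it is made up for
illustration: the class name TokenQueueSketch, the hand-written matcher that
stands in for the generated lexer, and the "TYPE:text" strings that stand in
for Token objects. It only shows how one match of '/]]' can put two tokens in
a queue that nextToken() then drains one at a time.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for the generated lexer: tokens are plain "TYPE:text" strings.
class TokenQueueSketch {
    // Tokens that have been matched but not yet handed out, oldest first.
    private final Deque<String> pending = new ArrayDeque<>();
    private final String input;
    private int pos = 0;

    TokenQueueSketch(String input) {
        this.input = input;
    }

    // If the input at the current position starts with the literal, consume it.
    private boolean eat(String literal) {
        if (!input.startsWith(literal, pos)) {
            return false;
        }
        pos += literal.length();
        return true;
    }

    // Match one lexeme and queue zero, one or two tokens for it.
    private void matchOne() {
        if (eat("/]]")) {
            // One match, two tokens: the same split the R_TAG action performs.
            pending.add("ANY:/");
            pending.add("R_BRACKET:]]");
        } else if (eat("/]")) {
            pending.add("R_TAG:/]");
        } else if (eat("[/")) {
            pending.add("L_TAG:[/");
        } else if (eat("[[")) {
            pending.add("L_BRACKET:[[");
        } else if (eat("]]")) {
            pending.add("R_BRACKET:]]");
        } else {
            char c = input.charAt(pos++);
            // Whitespace is skipped (queues nothing); any other character is ANY.
            if (!Character.isWhitespace(c)) {
                pending.add("ANY:" + c);
            }
        }
    }

    // Keep matching until something is queued, then hand out the oldest token.
    String nextToken() {
        while (pending.isEmpty() && pos < input.length()) {
            matchOne();
        }
        return pending.isEmpty() ? "EOF" : pending.remove();
    }

    // Lex the whole input and return every token before EOF.
    static List<String> lex(String source) {
        TokenQueueSketch lexer = new TokenQueueSketch(source);
        List<String> result = new ArrayList<>();
        for (String t = lexer.nextToken(); !t.equals("EOF"); t = lexer.nextToken()) {
            result.add(t);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String token : lex("[/ foo /] [[/ bar /]]")) {
            System.out.println(token);
        }
    }
}
```

The point to take from the sketch is that the caller never sees the queue: it
keeps asking for the next token, and a '/]]' in the input simply comes back as
two consecutive answers, first the '/' and then the ']]'.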
Regards,
Bart.