[antlr-interest] Semantic Predicates in a Lexer

"Paul Bouché (NSN)" paul.bouche at nsn.com
Fri Mar 20 03:23:21 PDT 2009


Hi,

I am asking the list because I am nearly out of options: the one option 
I still have is ugly.

My problem is that, depending on context, I want to lex the same sequence 
of characters differently. I have a NAME token which may contain a 
colon. But if a flag has been set earlier, I want what would otherwise be 
a single NAME token to be split into SIMPLENAME COLON SIMPLENAME. 
Therefore I introduced a SIMPLENAME token, a COLON token and a gated 
semantic predicate for both. As I understand it, the usual behaviour of 
the lexer is that if one token definition matches a subset of another 
and the subset definition comes first, this is not a problem: the subset 
rule will always win, i.e. the lookahead will predict the subset 
alternative. If it does not come first, an error saying that a token can 
never be reached is emitted during grammar generation, right?

The book says that a gated semantic predicate effectively disables an 
alternative so that the lookahead will not see it, but this is not what 
I observe. In my case the DFA still predicts the SIMPLENAME token even 
though its semantic predicate evaluates to false, and, even worse, I get 
a failed semantic predicate exception. OK, I thought, predicated token 
definitions seem to be handled separately, so I need to guard the NAME 
definition with the negation of the predicate. I did so and it worked 
fine. But I also have other token definitions, like NUMBER, which match 
a subset of NAME. Now NAME will always win because of the predicate. 
This is problematic because it disables the nice feature I described 
above. Of course I could now predicate all token definitions that are 
subsets of NAME, but this is really ugly. Are there any other solutions?

Here is a lexer excerpt:
NUMBER : DIGIT_+;
SIMPLENAME: {noColonInNames}?=> LETTER_+;
COLON: {noColonInNames}?=> COLON_;
NAME: {!noColonInNames}?=> (LETTER_ | COLON_)+;
fragment DIGIT_: '0'..'9';
fragment LETTER_: 'a'..'z' | 'A'..'Z';
// COLON_ is not shown in the excerpt; presumably it is defined as:
fragment COLON_: ':';
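
To make the intended behaviour concrete, here is a small Python sketch of 
the token streams I would like to get. It is not ANTLR and not part of the 
grammar: the function name tokenize, the no_colon_in_names argument (standing 
in for the noColonInNames flag) and the regular expressions are assumptions 
made for illustration only. It follows the excerpt literally and, like the 
rule order described above, lets the first matching rule win.

```python
import re


def tokenize(text, no_colon_in_names):
    """Return a list of (token_type, lexeme) pairs.

    Hypothetical stand-in for the lexer excerpt; the flag is passed in
    as an argument here, whereas the real lexer sets it while lexing.
    """
    if no_colon_in_names:
        # Flag set: names and colons are separate tokens.
        rules = [
            ("NUMBER", re.compile(r"[0-9]+")),
            ("SIMPLENAME", re.compile(r"[a-zA-Z]+")),
            ("COLON", re.compile(r":")),
        ]
    else:
        # Flag not set: a name may contain colons.
        rules = [
            ("NUMBER", re.compile(r"[0-9]+")),
            ("NAME", re.compile(r"[a-zA-Z:]+")),
        ]

    tokens = []
    pos = 0
    while pos < len(text):
        for token_type, pattern in rules:
            match = pattern.match(text, pos)
            if match:
                # First rule in the list that matches wins.
                tokens.append((token_type, match.group()))
                pos = match.end()
                break
        else:
            raise ValueError(f"no token matches at position {pos}")
    return tokens


print(tokenize("abc:def", False))
print(tokenize("abc:def", True))
```

With the flag unset the input abc:def should come out as one NAME token; 
with the flag set it should come out as SIMPLENAME, COLON, SIMPLENAME. A 
run of digits should be a NUMBER in both cases.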

Thanks,
Paul

-- 
Paul Bouché
Voice: +49 30 590080-1284


