[antlr-interest] Lexer problem - previous token semantic predicate
Silvester Pozarnik
silvester.pozarnik at tracetracker.com
Fri Apr 4 03:29:02 PDT 2008
Hi,
I have a problem forcing the lexer to emit the right token type
based on the type of the previous token.
The string I want to parse is, for example, "TRD.2ads". This currently
generates the token sequence (SYS_TRD, REAL_LITERAL), while I need it to
generate the sequence (SYS_TRD, DOT, IDENTIFIER) (see the grammar
fragments below). The logic should be: if the lexer has just emitted a
SYS_TRD token, do not interpret the '.' as the start of a REAL_LITERAL,
but as a DOT token.
I tried the semantic predicate
"{ input.index()>0 && input.LT(-1) != 'D' }?"
at the marker below (in place of /* code here? */). The result is that
recognition of REAL_LITERAL is aborted, but the lexer still fails to
generate a DOT token.
I would really appreciate it if someone in the ANTLR community has an
idea how to solve this.
Thanks
Silvester Pozarnik
// ... parser part
tokens {...
SYS_TRD='TRD';
//...
}
//...
trd_property returns [String value]
: SYS_TRD! DOT! {input.LT(1).setType(IDENTIFIER);} property { $value = $property.text; }
;
//..
// lexer parts
DOT : '.' ;
//...
IDENTIFIER
: { testLiterals=true; } ('a'..'z' | 'A'..'Z' | '0'..'9' | '\u0080'..'\ufffe') ( Letter | Digit)*
;
//...
fragment
REAL_LITERAL
: ('0'..'9')+ '.' ('0'..'9')* Exponent?
| { /* code here? */ }? '.' ('0'..'9')+ Exponent?
| ('0'..'9')+ Exponent
;
//...
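
For illustration only, the behaviour asked for above (treat '.' as DOT whenever the previous token was SYS_TRD, otherwise let it start a REAL_LITERAL) can be sketched as a small hand-written tokenizer in plain Java. This is not ANTLR-generated code and not a fix for the grammar itself: the class name PrevTokenLexerSketch and the method tokenize are hypothetical, exponents are left out, and an all-digit run without a '.' is simply reported as IDENTIFIER because the IDENTIFIER rule above accepts digits. It only shows the decision being made on the previous token type rather than on the previous character.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the generated lexer; none of this is ANTLR API.
class PrevTokenLexerSketch {

    // Returns the token type names for the input, in order.
    static List<String> tokenize(String input) {
        List<String> types = new ArrayList<>();
        String last = null; // type of the most recently emitted token
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            String type;
            if (c == '.') {
                boolean digitFollows = i + 1 < input.length()
                        && Character.isDigit(input.charAt(i + 1));
                i++; // consume '.'
                if (digitFollows && !"SYS_TRD".equals(last)) {
                    // '.' starts a real literal such as ".25"
                    while (i < input.length() && Character.isDigit(input.charAt(i))) {
                        i++;
                    }
                    type = "REAL_LITERAL";
                } else {
                    // directly after SYS_TRD (or with no digit following) it is a plain DOT
                    type = "DOT";
                }
            } else if (Character.isLetterOrDigit(c)) {
                int start = i;
                while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) {
                    i++;
                }
                String text = input.substring(start, i);
                if (text.equals("TRD")) {
                    type = "SYS_TRD";
                } else if (text.chars().allMatch(Character::isDigit)
                        && i < input.length() && input.charAt(i) == '.') {
                    // digits '.' digits* (first REAL_LITERAL alternative, no exponent)
                    i++;
                    while (i < input.length() && Character.isDigit(input.charAt(i))) {
                        i++;
                    }
                    type = "REAL_LITERAL";
                } else {
                    type = "IDENTIFIER";
                }
            } else {
                throw new IllegalArgumentException("unexpected character at index " + i);
            }
            types.add(type);
            last = type; // remember the token type for the next decision
        }
        return types;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("TRD.2ads")); // [SYS_TRD, DOT, IDENTIFIER]
        System.out.println(tokenize(".25"));      // [REAL_LITERAL]
    }
}
```

In a generated lexer the same idea would presumably need the lexer to record the type of each token as it is emitted and to consult that record in the predicate guarding the '.' alternative; how best to wire that into the grammar above is exactly the open question of this post.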