[antlr-interest] Problem with lexical nondeterminism - ANTLR 2.7.7
Harald Mueller
harald_m_mueller at gmx.de
Fri Jan 4 02:32:40 PST 2008
> Harold your suggestion would work if it matches NUMBER
> first but it was actually an APAC_NUMERIC_TICKER. The match actually
> happens the other way around.
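What? Maybe I misunderstand you, but the following code lexes all strings like 123,aa (digits, a comma, then two more characters) as APAC_N_T, whereas plain digit strings like 1234 are lexed as NUMBER. No predicates are needed, just standard left-factoring: the digit prefix common to both tokens is matched once, and the token type is decided by what follows.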
Isn't that what you wanted??
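To spell out what I mean by left-factoring, here is a small before/after sketch. I am guessing at the shape of your original rules: the name APAC_NUMERIC_TICKER is from your mail, but the rule bodies in the "before" part are my assumption. Both rules there start with an arbitrarily long run of digits, so the lexer cannot choose between them with any fixed lookahead. The "after" part merges the shared prefix into one rule and switches the token type only when a comma follows.

```
// Before (hypothetical reconstruction, not your actual grammar):
// both rules begin with ('0'..'9')+, so ANTLR cannot tell them apart.
NUMBER
    : ('0'..'9')+
    ;
APAC_NUMERIC_TICKER
    : ('0'..'9')+ ',' . .
    ;

// After (replaces both rules above): match the shared digits once,
// then pick the token type from what follows.
// APAC_N_T must be declared in the tokens { } section.
NUMBER
    : ('0'..'9')+
      ( ',' . . { _ttype = APAC_N_T; }   // comma plus two characters
      |                                  // nothing more: stays NUMBER
      )
    ;
```

The two parts are alternatives, not one grammar: only the "after" rule goes into the lexer.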
Regards
Harald
------------------------------
header {
    using System.IO;
}
options {
    language = "CSharp";
}
class WordLexer extends Lexer;
tokens {
    APAC_N_T;
}
{
    public static void Main() {
        using (TextReader tr = new StringReader("1 12 123 1234 12345 123456 1234567 1,aa 12,aa 123,aa 1234,aa 12345,aa 123456,aa 1234567,aa 123 aa")) {
            WordLexer wl = new WordLexer(tr);
            for (;;) {
                IToken t = wl.nextToken();
                if (t.Type == Token.EOF_TYPE) break;
                switch (t.Type) {
                    case NUMBER: Console.Out.WriteLine(t.getText() + " -> NUMBER"); break;
                    case APAC_N_T: Console.Out.WriteLine(t.getText() + " -> APAC_N_T"); break;
                    case ID: Console.Out.WriteLine(t.getText() + " -> ID"); break;
                    default: Console.Out.WriteLine(t.getText() + " -> other"); break;
                }
            }
        }
        Console.In.Read();
    }
}
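// Left-factored rule: NUMBER and APAC_N_T share the digit prefix, so one
// rule matches the digits and then decides which token type to report.
NUMBER
    : ('0'..'9')+
      ( ',' . . { _ttype = APAC_N_T; }   // comma plus any two characters
      |                                  // nothing follows: plain NUMBER
      )
    ;
ID  : ('a'..'z')+
    ;
// Whitespace is matched and then skipped.
WS  : (' ' | '\n' | '\r')+ { _ttype = Token.SKIP; }
    ;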
/* Result:
1 -> NUMBER
12 -> NUMBER
123 -> NUMBER
1234 -> NUMBER
12345 -> NUMBER
123456 -> NUMBER
1234567 -> NUMBER
1,aa -> APAC_N_T
12,aa -> APAC_N_T
123,aa -> APAC_N_T
1234,aa -> APAC_N_T
12345,aa -> APAC_N_T
123456,aa -> APAC_N_T
1234567,aa -> APAC_N_T
123 -> NUMBER
aa -> ID
*/