[antlr-interest] Understanding priorities in lexing (newbie)

mail.acc at freenet.de mail.acc at freenet.de
Wed Jul 11 22:59:00 PDT 2007


Hi,

I am trying to write a standalone lexer
that can cope with arbitrary input without
reporting a "mismatched token" error.

It should, however, recognize certain character
sequences as tokens.

This is my first approach:
--------------------------------------------
lexer grammar LexerJava;
KEYWORDA : 'int'|'float';
KEYWORDB : 'public'|'static'|'void';
COMMENT  : '/*' ( options {greedy=false;} : . )* '*/'
         | '//' ~('\n'|'\r')* '\r'? '\n'
         ;
// fallback rule
ELSE	 :.;
--------------------------------------------

On an input like the following it reports
several errors:
--------------------------------------------
01: public class Test {
02:     private int varclassTmp = 3;
03:     [...]
04:     /* Comment */
05:     public static void main(String[] av) {
06:          float i=0;
07:          float[] sum; // comment
08:          int tmp;
09:          [...]
10:          float internationalization = 4.;
11:          /* int float */
12:     }
13: }
14: /* Comment */
--------------------------------------------
line 1:17 mismatched character ' ' expecting 'a'
line 5:24 mismatched character '(' expecting 't'
line 5:30 mismatched character 'g' expecting 't'

To some extent I can relate these errors to the input,
because each time the start of a keyword seems to match
(Test->static; main->int; Strin->int), but I
cannot figure out why rule ELSE doesn't match
in these cases.


In addition to these errors, the KEYWORDA alternative
'int' matches the first three characters of
internationalization in line 10, which is not intended.

I thought I could cope with the latter problem by
augmenting the ELSE rule.
Intermezzo: Lex/JFlex follow a principle
called "maximal munch", which
basically tells the lexer that the longest match
takes priority, and if several matches have the
same length, the rule order decides. I know that
in ANTLR the order also matters, but I have read
nothing about other techniques yet.
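
To make the maximal-munch idea concrete, here is a minimal Java
sketch of how I understand it. This is not ANTLR, and it is not how
Lex/JFlex work internally: the class name MaximalMunchDemo, the
tokenize helper and the extra WORD rule are made up for illustration,
and java.util.regex stands in for a real scanner. At each position
every rule is tried, the longest match wins, and on a tie the rule
listed first wins:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MaximalMunchDemo {
    // Rules in priority order (LinkedHashMap keeps insertion order).
    // KEYWORDA, KEYWORDB and ELSE mirror the grammar above;
    // WORD is a hypothetical extra rule added for this sketch.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("KEYWORDA", Pattern.compile("int|float"));
        RULES.put("KEYWORDB", Pattern.compile("public|static|void"));
        RULES.put("WORD", Pattern.compile("[A-Za-z_][A-Za-z_0-9]*"));
        RULES.put("ELSE", Pattern.compile(".", Pattern.DOTALL));
    }

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestRule = null;
            int bestLen = 0;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // lookingAt() anchors the match at pos. The strict '>'
                // keeps the earlier rule when two matches are equally long.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestLen = m.end() - pos;
                    bestRule = rule.getKey();
                }
            }
            // ELSE matches any single character, so bestLen is always >= 1.
            tokens.add(bestRule + "(" + input.substring(pos, pos + bestLen) + ")");
            pos += bestLen;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [KEYWORDA(int), ELSE( ), WORD(internationalization)]
        System.out.println(tokenize("int internationalization"));
    }
}
```

With the hypothetical WORD rule present, 'int' on its own is still a
KEYWORDA (same length, earlier rule wins), while internationalization
becomes a single WORD because that match is longer.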

Anyway, I tried to extend rule ELSE with a star.
But even in greedy=false mode I always get a
Java exception
(java.lang.OutOfMemoryError: Java heap space).


I would be grateful if anyone could give me a hint
as to what I am doing wrong.

Best wishes from Germany
Andreas





