[antlr-interest] Newbie - How to count indentation level?

Juho Jussila juho.jussila at iki.fi
Fri Jun 9 02:25:21 PDT 2006


Hi

I'm trying to parse the following simple text and build an AST.

------------------------------
E01
H01
	H04
	H05
		H06
		H06
	H07
	H02
		H05
		H03
H08
H81
H09
	H22
------------------------------

The AST should look like this:

          Root
          / \
       E01  H01  ...
           / | \ 
          /  |  \
        H04 H05 H07 ...
            / \
          H06 H06 
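
To make the intended nesting precise, here is a minimal sketch in plain
Java (not ANTLR) of the tree I am after. It assumes indentation is done
with tabs only and that a line is never indented more than one level
deeper than the line before it. The names IndentTree, Node, indentLevel,
build and dump are made up for this illustration; they are not ANTLR
classes or methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IndentTree {
    /** Hypothetical tree node for this sketch; not an ANTLR AST class. */
    static class Node {
        final String text;
        final List<Node> children = new ArrayList<Node>();
        Node(String text) { this.text = text; }
    }

    /** Indentation level = number of leading tab characters. */
    static int indentLevel(String line) {
        int n = 0;
        while (n < line.length() && line.charAt(n) == '\t') {
            n++;
        }
        return n;
    }

    /** Builds the tree; a node at level N hangs under the latest node at level N-1. */
    static Node build(List<String> lines) {
        Node root = new Node("Root");
        // path.get(0) is the root; path.get(d + 1) is the latest node seen at level d.
        List<Node> path = new ArrayList<Node>();
        path.add(root);
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            int level = indentLevel(line);
            // Assumption of this sketch: indentation grows by at most one level per line.
            if (level >= path.size()) {
                throw new IllegalArgumentException("indented too deep: " + line.trim());
            }
            // Forget nodes deeper than this line's parent.
            while (path.size() > level + 1) {
                path.remove(path.size() - 1);
            }
            Node node = new Node(line.trim());
            path.get(level).children.add(node);
            path.add(node);
        }
        return root;
    }

    /** Prints the tree, two spaces per nesting level. */
    static void dump(Node node, String pad) {
        System.out.println(pad + node.text);
        for (Node child : node.children) {
            dump(child, pad + "  ");
        }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("E01", "H01", "\tH04", "\tH05", "\t\tH06", "H08");
        dump(build(lines), "");
    }
}
```

The list of "latest node per level" is what removes the fixed maximum
depth: there is no level1/level2/level3, only the current level number.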


I managed to create a grammar, but the problem is that the maximum
indentation level is hard-coded. Is there a way to make this more generic
and allow an unlimited indentation level?

------------------------------
class P extends Parser;
options { 
    buildAST=true; 
    k=4;
}
tab1 : TAB;
tab2 : tab1 TAB;

start : (level1)* { ## = #([ROOT,"Root"], ##); }
     ;
level1 : 
        TUNNUS^ newline! (level2)*
        ;
level2:
        tab1! TUNNUS^ newline! (level3)*
        ;
level3:
        tab2! TUNNUS newline!
        ;
newline:
        NEWLINE | EOF
        ;


class L extends Lexer;
options {
    caseSensitive = false;
}
protected LETTER: ('a'..'ö');
protected NUMBER: ('0'..'9');
TUNNUS:     LETTER (LETTER|NUMBER)*;
NEWLINE
    :   (   '\r' '\n'   // DOS
        |   '\n'        // UNIX
        )
        { newline(); }  // runs for both alternatives
    ;
WS  :   (' ') { $setType(Token.SKIP); }; 
TAB : '\t';
------------------------------


Another attempt:
------------------------------
...
start : (level1)* { ## = #([ROOT,"Root"], ##); }
     ;

level[int i]
{ int count = 0; }
    :
        TUNNUS^ newline!
        ( { count < (i+1) }? 
            TAB
            { count++; }     
        )* 
        ({ count == (i+1) }? (level[i+1]))*
    ;
...
------------------------------

But it doesn't work: every node after the first H05 ends up as its child.
The result in XML format:

<Root>
  <E01/>
  <H01>
    <H04/>
    <H05>
      <H06/>
      <H06/>
      <H07/>
      <H02/>
      <H05/>
      <H03/>
      <H08/>
      <H81/>
      <H09/>
      <H22/>
    </H05>
  </H01>
</Root>


-- 
Thanks in advance

Juho Jussila


