[antlr-interest] Newbie - How to count indentation level?
Juho Jussila
juho.jussila at iki.fi
Fri Jun 9 02:25:21 PDT 2006
Hi,
I'm trying to parse the following simple text and build an AST.
------------------------------
E01
H01
H04
H05
H06
H06
H07
H02
H05
H03
H08
H81
H09
H22
------------------------------
The AST should look like this:
        Root
       /    \
    E01      H01 ...
            / | \
           /  |  \
        H04  H05  H07 ...
            /   \
         H06     H06
I managed to create a grammar, but the problem is that the maximum
indentation level is hard-coded. Is there a way to make this more generic
and allow an unlimited indentation level?
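One common way to avoid a hard-coded depth is to count the leading tabs of each line and keep a stack holding the most recent node at every level. The following is only a minimal Python sketch of that idea, not ANTLR code; `build_tree` is a hypothetical helper name, and the indentation of the sample input is assumed from the AST diagram above.

```python
def build_tree(text):
    """Build a nested (name, children) tree from tab-indented lines."""
    root = ("Root", [])
    # stack[d] is the most recent node at depth d. The root sits at depth 0,
    # so a line with n leading tabs becomes a child of stack[n].
    stack = [root]
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        stripped = line.lstrip("\t")
        level = len(line) - len(stripped)  # number of leading tabs
        if level > len(stack) - 1:
            raise ValueError(f"line {line!r} is indented too deeply")
        node = (stripped.strip(), [])
        del stack[level + 1:]          # forget nodes deeper than the new parent
        stack[level][1].append(node)   # attach to the parent
        stack.append(node)             # this node may get children next
    return root


if __name__ == "__main__":
    # Indentation assumed from the AST diagram in this post.
    sample = "E01\nH01\n\tH04\n\tH05\n\t\tH06\n\t\tH06\n\tH07\n"
    print(build_tree(sample))
```

Because the depth is only a number compared against the stack size, nothing in this sketch limits how deep the nesting can go.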
------------------------------
class P extends Parser;
options {
    buildAST=true;
    k=4;
}
tab1 : TAB;
tab2 : tab1 TAB;
start : (level1)* { ## = #([ROOT,"Root"], ##); }
    ;
level1 :
    TUNNUS^ newline! (level2)*
    ;
level2:
    tab1! TUNNUS^ newline! (level3)*
    ;
level3:
    tab2! TUNNUS newline!
    ;
newline:
    NEWLINE | EOF
    ;
class L extends Lexer;
options {
    caseSensitive = false;
}
protected LETTER: ('a'..'ö');
protected NUMBER: ('0'..'9');
TUNNUS: LETTER (LETTER|NUMBER)*;
// The alternatives are grouped so that newline() runs for both line endings.
NEWLINE
    : ( '\r' '\n' // DOS
      | '\n'      // UNIX
      )
      { newline(); };
WS : (' ') { $setType(Token.SKIP); };
TAB : '\t';
------------------------------
Another attempt:
------------------------------
...
start : (level1)* { ## = #([ROOT,"Root"], ##); }
    ;
level[int i]
{ int count = 0; }
    :
    TUNNUS^ newline!
    (   { count < (i+1) }?
        TAB
        { count++; }
    )*
    ({ count == (i+1) }? (level[i+1]))*
    ;
...
------------------------------
But it doesn't work. The resulting AST, in XML format:
<Root>
  <E01/>
  <H01>
    <H04/>
    <H05>
      <H06/>
      <H06/>
      <H07/>
      <H02/>
      <H05/>
      <H03/>
      <H08/>
      <H81/>
      <H09/>
      <H22/>
    </H05>
  </H01>
</Root>
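An approach often used for indentation-sensitive input (Python's own tokenizer works this way) is to have the lexer side compare each line's tab count with the previous line's and emit INDENT/DEDENT tokens, so that a single recursive rule can handle any depth. Below is a rough Python sketch of that idea, not an ANTLR implementation; the token names `INDENT` and `DEDENT` and the helpers `tokenize` and `parse_nodes` are assumptions made for illustration.

```python
def tokenize(text):
    """Turn tab-indented lines into a flat list of (type, text) tokens."""
    tokens = []
    depth = 0  # indentation level of the previous non-blank line
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        name = line.lstrip("\t")
        level = len(line) - len(name)  # number of leading tabs
        if level > depth + 1:
            raise ValueError(f"line {line!r} skips an indentation level")
        if level == depth + 1:
            tokens.append(("INDENT", ""))
        for _ in range(depth - level):  # runs zero times unless we moved left
            tokens.append(("DEDENT", ""))
        tokens.append(("TUNNUS", name.strip()))
        tokens.append(("NEWLINE", ""))
        depth = level
    tokens.extend([("DEDENT", "")] * depth)  # close any blocks still open
    return tokens


def parse_nodes(tokens, pos=0):
    """nodes : (TUNNUS NEWLINE (INDENT nodes DEDENT)?)* ; returns (nodes, pos)."""
    nodes = []
    while pos < len(tokens) and tokens[pos][0] == "TUNNUS":
        name = tokens[pos][1]
        pos += 2  # skip TUNNUS and NEWLINE
        children = []
        if pos < len(tokens) and tokens[pos][0] == "INDENT":
            children, pos = parse_nodes(tokens, pos + 1)
            pos += 1  # skip the matching DEDENT
        nodes.append((name, children))
    return nodes, pos


if __name__ == "__main__":
    # Indentation assumed for illustration; tabs mark the nesting level.
    sample = "E01\nH01\n\tH04\n\tH05\n\t\tH06\n\t\tH06\n\tH07\n"
    tree, _ = parse_nodes(tokenize(sample))
    print(("Root", tree))
```

With such a token stream, the indentation level no longer has to be counted in the parser: the recursion in the single `nodes` rule mirrors the nesting, whatever its depth.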
--
Thanks in advance
Juho Jussila