[antlr-interest] Problems with spurious EOF tree nodes while working with a tree parser (update)
Stefan Mätje
Stefan.Maetje at esd-electronics.com
Fri Apr 20 00:38:39 PDT 2012
Hi ANTLR community,
perhaps I should ask my question from another point of view.
Is it possible in a filtering tree parser to match simple sequences of
two or more tokens? How is this done, or is it impossible?
In my parser I have the following "statement" rule that parses common
statements and generates the AST:
statement
    : (ID ':')* unlabeledStatement
      -> (LBL_DCL ID)* unlabeledStatement
    ;
A statement like "lbl1: lbl2: GOTO somewhere;" would generate an AST
like this:
"LBL_DCL lbl1 LBL_DCL lbl2 'GOTO' somewhere"
with 'lbl1', 'lbl2' and 'somewhere' being ID nodes. Is it possible to
match these sequences, as I tried with the filter-mode tree grammar
rules below, without having to account for the unexpected EOF tree node?
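
To make the goal concrete, here is a minimal plain-Java sketch of the
matching I am after. It does not use the ANTLR runtime: the Node class,
idsAfter and sampleBody are hypothetical names made up for illustration,
and the flat child list is assumed to look like the AST shown above.

```java
import java.util.ArrayList;
import java.util.List;

public class SiblingPairScan {
    // Hypothetical stand-in for an AST node: a token type name plus its text.
    public static final class Node {
        final String type;
        final String text;
        public Node(String type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // Collect the text of every ID node that directly follows a sibling
    // whose type equals `first` (e.g. "LBL_DCL" or "GOTO").
    public static List<String> idsAfter(List<Node> siblings, String first) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i + 1 < siblings.size(); i++) {
            Node a = siblings.get(i);
            Node b = siblings.get(i + 1);
            if (a.type.equals(first) && b.type.equals("ID")) {
                hits.add(b.text);
            }
        }
        return hits;
    }

    // Flat sibling list assumed for "lbl1: lbl2: GOTO somewhere;".
    public static List<Node> sampleBody() {
        List<Node> body = new ArrayList<>();
        body.add(new Node("LBL_DCL", "LBL_DCL"));
        body.add(new Node("ID", "lbl1"));
        body.add(new Node("LBL_DCL", "LBL_DCL"));
        body.add(new Node("ID", "lbl2"));
        body.add(new Node("GOTO", "GOTO"));
        body.add(new Node("ID", "somewhere"));
        return body;
    }

    public static void main(String[] args) {
        List<Node> body = sampleBody();
        System.out.println("labels declared: " + idsAfter(body, "LBL_DCL"));
        System.out.println("labels used:     " + idsAfter(body, "GOTO"));
    }
}
```

The scan looks only at adjacent siblings, which is the behaviour I would
like to get from a two-token alternative such as "LBL_DCL ID" in the
tree grammar.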
Thanks for any help,
Stefan
On 19.04.2012 20:26, Stefan Mätje wrote:
> Hi,
>
> I generate an AST with a combined lexer/parser grammar. Then I feed the
> generated AST via a CommonTreeNodeStream into a tree grammar to build up
> a symbol table. The tree grammar is in filter mode and I use my
> own tree nodes called Pearl90Tree. Therefore I created a custom
> Pearl90TreeAdaptor class.
>
> In my tree grammar I have two very simple rules quoted below:
>
> label_dcl
>     // : LBL_DCL ID          // Won't match
>     : LBL_DCL EOF ID         // Will match
>       {
>         dbgOut.println("-> Label at _", $ID.line, $ID.pos);
>       }
>     ;
>
> label_resolve
>     // : gt='GOTO' id=ID            // Won't match
>     : gt='GOTO' eof=EOF myId=ID     // Will match
>       {
>         dbgOut.println("GOTO_Label "+$myId.toString());
>         dbgOut.println("EOF #"+$eof.serial);
>       }
>     ;
>
> The parser simply generates "LBL_DCL ID" for each label definition and
> "'GOTO' ID" for each goto statement. I verified that the AST is
> correct. I also dumped the CommonTreeNodeStream and confirmed that it
> does not contain any EOF tree node after the 'GOTO' or LBL_DCL tree
> node. My test source input is this:
>
> MODULE goto;
> PROBLEM;
> P: PROC;
> label: GOTO label;
> END;
> MODEND;
>
> The dump of the CommonTreeNodeStream follows:
>
> +++++++++++++++ Tree +++++++++++++++++++++
> Pearl90Tree node #-1, c:2; token type: 92 'MODULE', value: 'MODULE'
> Pearl90Tree node #-1, c:0; token type: 2 '<DOWN>', value: 'DOWN'
> Pearl90Tree node #0, c:0; token type: 73 'ID', value: 'goto'
> Pearl90Tree node #1, c:1; token type: 182 ''PROBLEM'', value: 'PROBLEM'
> Pearl90Tree node #-1, c:0; token type: 2 '<DOWN>', value: 'DOWN'
> Pearl90Tree node #0, c:3; token type: 115 'PROC_DCL', value: 'PROC_DCL'
> Pearl90Tree node #-1, c:0; token type: 2 '<DOWN>', value: 'DOWN'
> Pearl90Tree node #0, c:0; token type: 73 'ID', value: 'P'
> Pearl90Tree node #1, c:0; token type: 93 'MOD_LIST', value: 'MOD_LIST'
> Pearl90Tree node #2, c:4; token type: 27 'BODY', value: 'BODY'
> Pearl90Tree node #-1, c:0; token type: 2 '<DOWN>', value: 'DOWN'
> Pearl90Tree node #0, c:0; token type: 85 'LBL_DCL', value: 'LBL_DCL'
> Pearl90Tree node #1, c:0; token type: 73 'ID', value: 'label'
> Pearl90Tree node #2, c:0; token type: 77 'KW_GOTO', value: 'GOTO'
> Pearl90Tree node #3, c:0; token type: 73 'ID', value: 'label'
> Pearl90Tree node #-1, c:0; token type: 3 '<UP>', value: 'UP'
> Pearl90Tree node #-1, c:0; token type: 3 '<UP>', value: 'UP'
> Pearl90Tree node #-1, c:0; token type: 3 '<UP>', value: 'UP'
> Pearl90Tree node #-1, c:0; token type: 3 '<UP>', value: 'UP'
> Pearl90Tree node #-1, c:0; token type: -1 '>EOF<', value: 'EOF'
>
> What I can see is that the tree parser in the filter mode generates lots
> of UP, DOWN and EOF tree nodes. Apparently the tree parser stuffs some
> of these EOF nodes between the others. (I know this because the tree
> parser calls my Pearl90TreeAdaptor to generate these nodes.)
>
> Because of this added EOF, the rule only matches "'GOTO' EOF ID", which
> I find very strange. And why doesn't the plain "LBL_DCL ID" sequence
> match either?
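
A minimal plain-Java sketch of one possible reading of this, offered
as an assumption rather than something verified in this thread: if the
filter builds a fresh node stream rooted at each visited node, a
childless LBL_DCL node would be serialized on its own and followed
directly by EOF, so "LBL_DCL ID" could never be seen as a sequence. The
Node class and flatten method are hypothetical illustration names, not
ANTLR API.

```java
import java.util.ArrayList;
import java.util.List;

public class SubtreeStreamSketch {
    // Hypothetical minimal tree node; not the ANTLR Tree interface.
    public static final class Node {
        final String type;
        final List<Node> children = new ArrayList<>();
        public Node(String type, Node... kids) {
            this.type = type;
            for (Node k : kids) {
                children.add(k);
            }
        }
    }

    // Flatten only the subtree rooted at `root` into token type names,
    // wrapping each child list in DOWN/UP and ending with EOF.
    public static List<String> flatten(Node root) {
        List<String> out = new ArrayList<>();
        walk(root, out);
        out.add("EOF");
        return out;
    }

    private static void walk(Node n, List<String> out) {
        out.add(n.type);
        if (n.children.isEmpty()) {
            return; // leaves contribute no DOWN/UP pair
        }
        out.add("DOWN");
        for (Node c : n.children) {
            walk(c, out);
        }
        out.add("UP");
    }

    public static void main(String[] args) {
        Node lblDcl = new Node("LBL_DCL");
        Node body = new Node("BODY",
                lblDcl, new Node("ID"), new Node("KW_GOTO"), new Node("ID"));
        // Stream rooted at BODY: the siblings appear side by side.
        System.out.println(flatten(body));
        // Stream rooted at the LBL_DCL leaf: only the leaf, then EOF.
        System.out.println(flatten(lblDcl));
    }
}
```

Under that assumption the second stream is just "LBL_DCL EOF", which
would at least be consistent with the EOF showing up right behind the
LBL_DCL and 'GOTO' nodes in my rules.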
>
> What am I doing wrong? Any suggestions to get this running without
> putting this "magic" EOF in between?
>
> Thanks in advance,
> Stefan
>