[antlr-interest] ANTLR3C 3.4 Usage of Parser

Paxi coder at paxi.at
Fri Jul 13 01:12:32 PDT 2012


Im currently working on a on project where my goal is to normalize an
arbitrary JS code.
For short explonantation: I want to parse an arbitrary JS code and do some
things with it:
• Rename every variable (f.e rename /var wadbwdb/ into /var1/ and rename it
again with the right name if it occurs again)
• Evaluate assignment expressions at any given time: f.e if there a
statement like /asdf = wadwdb+”0x18”;/ I want to add the value from wadwdb +
0x18 and save it to asdf in a struct that stores information about every
variable
• Check for functions 
• Some more…
Afterwards fill the manipulated JS code in a streambuffer and move on.
 
Now to my actual question and problem:
Im using the following JS grammar
http://www.antlr.org/grammar/1206736738015/JavaScript.g (with options
/language=C; and ASTLabelType = pANTLR3_BASE_TREE;/  added)
/ANTLRWorks 1.4.3/ for building the C files and 
/libantlr3c-3.4/ for developement
 
Im working on this for about a week now and I just don’t get how to use the
parser correctly.
First of all when generating the .c files with ANTLR3Works there was already
one problem in the parser:
In file JavaScriptParser.h there is following statement generated  /#define
LT 18/
Than in JavaScriptParser.c following is generated   
/#undef LT 
#define LT(n) INPUT->_LT(INPUT, n)/
So LT is undefined and redefined as makro LT(n) but not as constant value
18. This may be a bug but I just replaced every LT with constant value 18
and left the makro in the .c file and got everything to compile than. So
thats not the problem anymore.

My real problem anyways is that I don’t know how to use the parser correctly
or there is just something wrong with it here.
What I wanted to do for now is using the parser to get all statement within
the program that always end with a ; of course
The code I got so far looks like this:
/typedef JavaScriptParser_statement_return *PStatement;
PStatement stat;
while(stat = &(m_pParser->statement(m_pParser))) {
       if(stat->start->type == ANTLR3_TOKEN_EOF || !stat->tree->children) 
             break;
 
       std::cout << stat->start->index << " " << stat->stop->index << " " <<
stat->tree->children->count << "\n" << std::endl;
}
 /
Only the first stat struct i is valid and even there im not sure about
because my first statement f.e is 
/var dnivg9=this; and the second like 
var
ktatf="ne%00~a`e0uc8uddu8cudbu4du7eufeu61u12u00u00u00u53u46u64u52u10u20u10ua1u9au2aua1uf5u12u00u00…..”;/
and there are about 6 more statements coming after it
 
The output is following: /0 5 4/ (children count = 4 is already false I
guess)
                                /6 5 0/
And that’s it. So first there should be some more statement and after the
first struct the tree is always invalid and has no childnodes. I assumed
that the JS grammar im using can also be used for building an AST because
someone else used the same grammar in C# and parsed the AST. But since the
tokens values are also wrong when using the parser there has to be something
wrong.
 
So what im looking for is a general guide how to parse through the whole
stream.
 
I can go through every single token just within the tokenstream and
everything is fine like this :
/while(pToken =
m_pParser->pParser->tstream->_LT(m_pParser->pParser->tstream,1)) {
       if(pToken->type!=ANTLR3_TOKEN_EOF) {
             // do smth
}
      
m_pParser->pParser->tstream->istream->consume(m_pParser->pParser->tstream->istream);
                    
}// end loop/
 
But than I have to do lots of testing and checking types myself.
My goal or hope was that I can just tell ANTLR to start parsing at the
“program” token, and than parse down the tree and get single expressions or
assignement etc within.
I would really need some help on this. Thanks in advance.



Regards Christian


--
View this message in context: http://antlr.1301665.n2.nabble.com/ANTLR3C-3-4-Usage-of-Parser-tp7578377.html
Sent from the ANTLR mailing list archive at Nabble.com.


More information about the antlr-interest mailing list