[antlr-interest] Re: Skipping grammar

pwolleba pwolleba at yahoo.no
Wed Oct 8 03:36:02 PDT 2003


Hello!

I am starting to dominate this newsgroup with my problem, so I guess 
I should stop after this post!
Anyway, I will paste some of the code from my parser below. If you can 
see where my thinking goes wrong, I would appreciate your comments!



PARSER
 
//---------------------------------------------- METHODE -------------
methodeNode         : (METHOD^) declarationName methodeDecleration methodBody;

methodeDecleration  : (LPAREN!) (methodArguments)? (RPAREN!)
                      {#methodeDecleration=#([ARGUMENTS,"Arguments"],#methodeDecleration);};

methodArguments     : (methodArgument (COMMA! methodArguments)?);

methodArgument      : declarationName;

methodBody          : (METHOD_BODY)
                      {#methodBody=#([EXPRESSION,"Expression"],#methodBody);};


LEX

METHOD_BODY : '{'! (BracedExpr | ~'}')* "};"!;

protected
BracedExpr : '{' (BracedExpr | ~'}')* "}";
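
To make the behaviour of these two lexer rules concrete, here is a minimal Python sketch (not ANTLR-generated code) of the recursive matching they describe. The function names match_braced and match_method_body are hypothetical, chosen only to mirror BracedExpr and METHOD_BODY. The sketch assumes braces are the only structure that matters and ignores string literals and comments.

```python
def match_braced(text, pos):
    """Rough mirror of BracedExpr: '{' (BracedExpr | other chars)* '}'.

    Returns the index just past the matching '}'.
    """
    if pos >= len(text) or text[pos] != "{":
        raise ValueError("expected '{' at index %d" % pos)
    pos += 1
    while pos < len(text) and text[pos] != "}":
        if text[pos] == "{":
            pos = match_braced(text, pos)  # nested block: recurse
        else:
            pos += 1                       # any other character
    if pos >= len(text):
        raise ValueError("unbalanced braces")
    return pos + 1


def match_method_body(text, pos):
    """Rough mirror of METHOD_BODY: '{' ... "};" with the delimiters dropped.

    Returns (body_text, index just past the closing "};").
    """
    if pos >= len(text) or text[pos] != "{":
        raise ValueError("expected '{' at index %d" % pos)
    start = pos + 1
    pos = start
    while pos < len(text) and text[pos] != "}":
        if text[pos] == "{":
            pos = match_braced(text, pos)  # skip a nested {...} as a unit
        else:
            pos += 1
    if not text.startswith("};", pos):
        raise ValueError('expected "};" to close the body')
    return text[start:pos], pos + 2
```

Nothing in this matcher knows about the Method keyword. Started at the first '{' of the sample file, it consumes the whole Packet block, including the Model and Method text, which is the unwanted behaviour described in this post.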



FILE TO PARSE

Packet name{
Model name {
Method{
	Expressiontext;
	If/else and so on
};
};
};

As you can see, the method is built up much like a method in C++ or 
Java. What makes it difficult is that I don't want to parse the 
method body text; I just want to consume it.

As you can see, my lexer rule won't work, since it will trigger on the 
Packet bracket and the Model bracket as well. If I could somehow make 
it start only when it is inside a method, I would be really happy.
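
One way to picture a rule that fires only for methods is a scanner that remembers it has just seen the Method keyword and switches to raw brace counting at the next '{'. The Python sketch below illustrates that idea only. It is not ANTLR code; the names tokenize, skip_braces and in_method_header are hypothetical, and the token set is an assumption based on the sample file above.

```python
import re

# Assumed token shapes: identifiers/keywords, "};" and single punctuation.
TOKEN = re.compile(r"\s*(?:(?P<word>\w+)|(?P<punct>\};|[{}();,]))")


def skip_braces(text, pos):
    """pos is just past an opening '{'; return the index of its matching '}'."""
    depth = 1
    while pos < len(text):
        if text[pos] == "{":
            depth += 1
        elif text[pos] == "}":
            depth -= 1
            if depth == 0:
                return pos
        pos += 1
    raise ValueError("unbalanced braces")


def tokenize(text):
    """Return (kind, text) pairs; method bodies come back as one raw token."""
    tokens = []
    pos = 0
    in_method_header = False  # True between "Method" and its opening '{'
    while True:
        m = TOKEN.match(text, pos)
        if m is None:
            break
        pos = m.end()
        word = m.group("word")
        if word is not None:
            if word == "Method":
                in_method_header = True
                tokens.append(("METHOD", word))
            else:
                tokens.append(("ID", word))
            continue
        punct = m.group("punct")
        if punct == "{" and in_method_header:
            end = skip_braces(text, pos)       # count braces, do not tokenize
            tokens.append(("METHOD_BODY", text[pos:end]))
            pos = end                          # the closing "};" is lexed next
            in_method_header = False
        else:
            tokens.append((punct, punct))
    if text[pos:].strip():
        raise ValueError("unexpected input at index %d" % pos)
    return tokens
```

Here the Packet and Model braces come out as ordinary '{' tokens, while the text between a method's braces is returned as a single METHOD_BODY token, so the name and arguments remain available as separate tokens for building the tree.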

Best regards,
Per




--- In antlr-interest at yahoogroups.com, "Anthony W Youngman" 
<Anthony.Youngman at E...> wrote:
> Hmmm ...
> 
> You should be able to declare that in the lexer.
> 
> method: lcurly method_body rcurly ;
> 
> protected method_body: name arguments expression ;
> 
> Do the curly brackets always indicate a method? If not, how do you tell
> whether it's the start of a method or the start of something else? If
> you can unambiguously identify the start of a method (e.g. it's flagged
> by an lcurly, which is the only use of an lcurly) then what you appear
> to want is pretty simple to achieve.
> 
> Solve the problem of how to identify "this is a method", and the rest
> of it should just fall into place. If the lexer can recognise "this is
> a method" then the lexer can handle methods for you. The parser will
> then build your tree for you the way you want it.
> 
> I think your original comment about ";" being used to terminate both
> IFs and methods is a red herring. Have you grasped why it's not a
> problem? If you have, then you should be able to work out the rest of
> the solution fairly easily. If you haven't, then you need to get that
> straight because it shows a fundamental misunderstanding of ANTLR.
> Don't forget, both the lexer and parser are recursive (they "drill
> down"), so context-dependent semantics shouldn't be a problem ...
> 
> Cheers,
> Wol
> 
> -----Original Message-----
> From: pwolleba [mailto:pwolleba at y...] 
> Sent: 08 October 2003 10:13
> To: antlr-interest at yahoogroups.com
> Subject: [antlr-interest] Re: Skipping grammar
> 
> 
> Hello again
> 
> Thanks for helping me out, Arnar; your solutions are really good!
> Still, I think I will have problems implementing them, mostly because
> I have not given you enough information.
> I need to make a method node in my tree that contains information
> such as the arguments to the method (see example).
> 
> 
> Method testMethod (Args,Args....){
> 	Expression text
> }
> 
> method
> |
> |--------Name
> |
> |--------Arguments
> |
> |-------- Expression
> 
> 
> If I solve this in my lexer I will not be able to create this node
> tree; it will just be one method node that contains all the text. If
> I drop the "method" keyword from my METHOD_BODY rule, it will trigger
> on every other bracket in my document.
> Can I somehow write my lexer rule without the "method" keyword, and
> then make it trigger only when I need the method body?
> 
> best regards,
> Per
> 
> --- In antlr-interest at yahoogroups.com, "Arnar Birgisson" 
> <arnarb at o...> wrote:
> > Hello Per,
> > 
> > Perhaps you could make "method {" a single token in the parser, and
> > set the nestingLevel variable to zero when that one matches.
> > 
> > The solution I posted uses the parser to eat up the stuff inside
> > {...}; another possibility might be to make the lexer do this:
> > 
> > METHOD_BODY
> >   : "method"! '{'! ( BracedExpr | ~'}' )* "};"!
> >   ;
> > 
> > protected
> > BracedExpr
> >   : '{' ( BracedExpr | ~'}' )* "}"
> >   ;
> > 
> > Overall, this might be a better solution. The token METHOD_BODY will
> > then contain as its text whatever was inside the {...}.
> > 
> > As a side note, this is possible in ANTLR lexers because they are
> > LL(k) and can thus handle context-free grammars. Conventional lexers
> > are limited to regular grammars (represented by regular expressions,
> > which are equivalent to finite automata) and cannot, for example,
> > match nested braces, parentheses, etc. See
> > http://www.antlr.org/doc/lexer.html#Predicated-LL(k)_Lexing for more
> > information on this.
> > 
> > Arnar
> > 
> > ps. yes, the "i" should have been "nestingLevel" :o)
> > pps. again, I haven't tried this; it might not even be syntactically
> > correct
> > 
> > >>> pwolleba at y... 10/07/03 5:34 PM >>>
> > Hello again!
> > 
> > I am looking at your example, Arnar, and I have some questions.
> > When I wrote my example I should have included some more
> > information. The method node is inside another node called member
> > (see example), and there can be more than one!
> > 
> > Member{
> > Methode {
> > 	Sometext;
> > };
> > };
> > 
> > This makes your example a bit more difficult to implement, since the
> > counter will start at zero at the first bracket, which is the member
> > bracket. I must somehow be able to set nestingLevel = 0 from the
> > parser when the method node is starting.
> > How do I do that?
> > 
> > best regards,
> > Per
> > 
> > Ps: I guess it should be nestingLevel++ instead of i++. Correct?
> > 
> > --- In antlr-interest at yahoogroups.com, "pwolleba" <pwolleba at y...> 
> > wrote:
> > > Yes, that is correct: what is inside the brackets is a different
> > > language which I at the moment don't want to write a parser for (it
> > > is pretty complex and big). Anyway, I have just come back to work,
> > > and I am going to try out your solution, Arnar; hopefully it will
> > > work!
> > > 
> > > I just want to thank the community for trying to find a solution to
> > > my question, and I must say it came really fast!
> > > 
> > > Best regards,
> > > 
> > > Per
> > > 
> > > 
> > > --- In antlr-interest at yahoogroups.com, "Arnar Birgisson" 
> > > <arnarb at o...> wrote:
> > > > Hi..
> > > > 
> > > > In my earlier post, I understood Per differently. I think he wants
> > > > to parse "method name{ <whatever> };" and just eat up <whatever>,
> > > > including any nested braces, and put it in a variable, completely
> > > > without lexing and/or parsing it. Per, is this correct?
> > > > 
> > > > The result of all this being a tree something like this:
> > > > 
> > > > METHOD
> > > >  |
> > > > name-body
> > > > 
> > > > where the body node contains anything inside the {..} as its text.
> > > > 
> > > > Arnar
> > > > 
> > > > >>> Anthony.Youngman at E... 10/07/03 1:33 PM >>>
> > > > I think you're missing the point. Define a ; as SEMI. The way I'd
> > > > do it (and this is all pseudocode) is
> > > > 
> > > > if_statement: "IF" lcurly (method)* rcurly "ELSE" lcurly (method)*
> > > >     rcurly SEMI ;
> > > > method: blah_blah SEMI ;
> > > > 
> > > > That way, the lexer doesn't care whether ; is ending a method or an
> > > > if clause, and the parser won't get confused because when it hits a
> > > > right-curly it will be expecting an ELSE or a SEMI, and not a
> > > > method. And if the ELSE is optional you just mark it as such so
> > > > when the parser hits the right-curly after the if, it's expecting an
> > > > ELSE or a SEMI and nothing else.
> > > > 
> > > > Cheers,
> > > > Wol
> > > > 
> > > > -----Original Message-----
> > > > From: pwolleba [mailto:pwolleba at y...] 
> > > > Sent: 07 October 2003 08:19
> > > > To: antlr-interest at yahoogroups.com
> > > > Subject: [antlr-interest] Skipping grammar
> > > > 
> > > > 
> > > > I am pretty new to ANTLR, so maybe this question is trivial; if
> > > > so, even better, since then there may be a simple solution to my
> > > > problem. Anyway, I am struggling with writing a new parser in
> > > > ANTLR to replace an old implementation in Flex/Bison, in order to
> > > > make a product that is open to implementation in both C++ and
> > > > Java.
> > > > 
> > > > The parser will parse a language that we are using to build
> > > > databases, and it must support this language 100% to be accepted.
> > > > 
> > > > Here is the code snippet that I am struggling with.
> > > > 
> > > > method name{
> > > >   SomeText!()text[];
> > > >   if(a < b && b < c){
> > > >      SomeText()!()[];
> > > >   }
> > > >   else{
> > > >      SomeText()!()[];
> > > >   };
> > > > };
> > > > 
> > > > I am not interested in the expression that is inside the method;
> > > > I just want ANTLR to grab the text for me and put it as a node
> > > > inside the tree. The problem is that the if/else statement ends
> > > > with "};", which is the same token as the method end token, and
> > > > there may be more than one such statement inside the method. A
> > > > solution would be to make a counter that increases for each "{"
> > > > and decreases for each "}"; then I would know when the method
> > > > ends. To my frustration I don't know how to make such a counter in
> > > > ANTLR that can still be implemented in both Java and C++.
> > > > I would be really, really happy if someone could help me with this
> > > > problem!
> > > > 
> > > > Best regards,
> > > > 
> > > > Per
> > > > 
> > > > 
> > > > 
> > > >  
> > > > 
> > > > Your use of Yahoo! Groups is subject to
> > > > http://docs.yahoo.com/info/terms/ 



