[antlr-interest] Re: Skipping grammar

Anthony W Youngman Anthony.Youngman at ECA-International.com
Wed Oct 8 02:29:38 PDT 2003


Hmmm ...

You should be able to declare that in the lexer.

method: lcurly method_body rcurly ;

protected method_body: name arguments expression ;

Do the curly brackets always indicate a method? If not, how do you tell
whether it's the start of a method or the start of something else? If
you can unambiguously identify the start of a method (e.g. it's flagged by
an lcurly, which is the only use of an lcurly) then what you appear to
want is pretty simple to achieve.

Solve the problem of how to identify "this is a method", and the rest of
it should just fall into place. If the lexer can recognise "this is a
method" then the lexer can handle methods for you. The parser will then
build your tree for you the way you want it.

I think your original comment about ";" being used to terminate both IFs
and methods is a red herring. Have you grasped why it's not a problem?
If you have, then you should be able to work out the rest of the
solution fairly easily. If you haven't, then you need to get that
straight because it shows a fundamental misunderstanding of ANTLR. Don't
forget, both the lexer and parser are recursive (they "drill down"), so
context-dependent semantics shouldn't be a problem ...

Cheers,
Wol
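
To make the point about ";" concrete, here is a minimal hand-written sketch in Python (not ANTLR output) of the pseudocode rules quoted further down the thread. It assumes that "blah_blah" is a single word and that the ELSE branch is optional; the token names LCURLY, RCURLY and WORD and the Parser class are made up for illustration. The lexer emits the same SEMI token everywhere, and the parser decides what that SEMI terminates purely from the rule it is currently in.

```python
import re

# One pattern for every token; ';' is always SEMI, whatever it terminates.
TOKEN_RE = re.compile(r"\s*(?:(IF|ELSE)\b|(\{)|(\})|(;)|(\w+))")

def tokenize(text):
    """Return a list of (kind, value) pairs."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        keyword, lcurly, rcurly, semi, word = m.groups()
        if keyword:
            tokens.append((keyword, keyword))
        elif lcurly:
            tokens.append(("LCURLY", lcurly))
        elif rcurly:
            tokens.append(("RCURLY", rcurly))
        elif semi:
            tokens.append(("SEMI", semi))
        else:
            tokens.append(("WORD", word))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else "EOF"

    def expect(self, kind):
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, got {self.peek()}")
        value = self.tokens[self.pos][1]
        self.pos += 1
        return value

    def method(self):
        # method: WORD SEMI ;   (WORD stands in for "blah_blah")
        name = self.expect("WORD")
        self.expect("SEMI")
        return ("method", name)

    def block(self):
        # lcurly (method)* rcurly
        self.expect("LCURLY")
        body = []
        while self.peek() == "WORD":
            body.append(self.method())
        self.expect("RCURLY")
        return body

    def if_statement(self):
        # if_statement: "IF" block ("ELSE" block)? SEMI ;
        self.expect("IF")
        then_body = self.block()
        else_body = []
        if self.peek() == "ELSE":
            self.expect("ELSE")
            else_body = self.block()
        # After a right-curly only ELSE or SEMI is acceptable here.
        self.expect("SEMI")
        return ("if", then_body, else_body)

tree = Parser(tokenize("IF { a; b; } ELSE { c; } ;")).if_statement()
```

After the closing curly of the IF block the parser is inside if_statement, so it only looks for ELSE or SEMI; a stray method at that position is reported as an error rather than being confused with the end of the statement.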

-----Original Message-----
From: pwolleba [mailto:pwolleba at yahoo.no] 
Sent: 08 October 2003 10:13
To: antlr-interest at yahoogroups.com
Subject: [antlr-interest] Re: Skipping grammar


Hello again

Thanks for helping me out, Arnar, your solutions are really good!
Still, I think I will have problems implementing them, mostly because I
have not given you enough information.
I need to make a method tag in my tree that contains information
such as the arguments to the method (see example).


Method testMethod (Args,Args....){
	Expression text
}

method
|
|--------Name
|
|--------Arguments
|
|-------- Expression


If I solve this in my lexer I will not be able to create this node
tree; it will just be one method node that contains all the text. If
I drop the "method" tag from my METHOD_BODY rule, it will trigger at all
the other brackets in my document.
Can I somehow write my lexer rule without the "method" tag, and then
make it trigger only when I need the method body?

best regards,
Per
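
One way to picture what is being asked for, outside ANTLR, is the Python sketch below. It assumes the header shape from the example above ("Method name (args) {"); the function name parse_methods and the tuple layout of the node are hypothetical. The nesting counter only starts once a method header has matched, so the brackets of an enclosing Member block are never counted, and the body is kept as raw text while name and arguments become separate children.

```python
import re

# Assumed header shape, taken from the example: Method <name> (<args>) {
METHOD_HEADER = re.compile(r"\bMethod\s+(\w+)\s*\(([^)]*)\)\s*\{")

def parse_methods(source):
    """Return one ("method", name, arguments, expression) node per method."""
    nodes = []
    pos = 0
    while True:
        header = METHOD_HEADER.search(source, pos)
        if header is None:
            return nodes
        name = header.group(1)
        arguments = [a.strip() for a in header.group(2).split(",") if a.strip()]
        # The counter starts at 1 for the method's own opening bracket.
        depth = 1
        i = header.end()
        while i < len(source) and depth > 0:
            if source[i] == "{":
                depth += 1
            elif source[i] == "}":
                depth -= 1
            i += 1
        if depth != 0:
            raise SyntaxError(f"unterminated body for method {name!r}")
        # Everything between the matching brackets is kept as unparsed text.
        expression = source[header.end():i - 1].strip()
        nodes.append(("method", name, arguments, expression))
        pos = i

example = """
Member {
    Method testMethod (a, b) {
        if (a < b) { SomeText()!()[]; } else { Other; };
    };
    Method second () {
        Plain text;
    };
};
"""
nodes = parse_methods(example)
```

This sketch counts every bracket character, so brackets inside string literals or comments of the embedded language would throw the counter off; handling those would need knowledge of that language's quoting rules.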

--- In antlr-interest at yahoogroups.com, "Arnar Birgisson" 
<arnarb at o...> wrote:
> Hello Per,
> 
> Perhaps you could make "method {" a single token in the parser, and set
> the nestingLevel variable to zero when that one matches.
> 
> The solution I posted uses the parser to eat up the stuff inside {...},
> another possibility might be to make the lexer do this:
> 
> METHOD_BODY
>   : "method"! '{'! ( BracedExpr | ~('{' | '}') )* "};"!
>   ;
> 
> protected
> BracedExpr
>   : '{' ( BracedExpr | ~('{' | '}') )* '}'
>   ;
> 
> Overall, this might be a better solution. The token METHOD_BODY will
> then contain as its text whatever was inside the {...}.
> 
> As a side note, this is possible in ANTLR lexers because they are LL(k)
> and can thus handle context-free grammars. Conventional lexers are
> limited to regular grammars (represented by regular expressions, which
> are equivalent to finite automata) and cannot, for example, match nested
> braces, parentheses etc. See
> http://www.antlr.org/doc/lexer.html#Predicated-LL(k)_Lexing for more
> information on this.
> 
> Arnar
> 
> ps. yes, the "i" should have been "nestingLevel" :o)
> pps. again, I haven't tried this, it might not even be syntactically
> correct
> 
> >>> pwolleba at y... 10/07/03 5:34 PM >>>
> Hello again!
> 
> I am looking at your example, Arnar, and I have some questions.
> When I wrote my example I should have included some more information.
> The method node is inside another node called member (see
> example), and there can be more than one!
> 
> Member{
> Methode {
> 	Sometext;
> };
> };
> 
> This makes your example a bit more difficult to implement, since the
> counter will start at zero at the first bracket, which is the member
> bracket. I must somehow be able to set nestingLevel = 0 from the
> parser when the method node is starting.
> How do I do that?
> 
> best regards,
> Per
> 
> Ps: I guess it should be nestingLevel++ instead of i++. Correct?
> 
> --- In antlr-interest at yahoogroups.com, "pwolleba" <pwolleba at y...> 
> wrote:
> > Yes, that is correct: what is inside the bracket is a different
> > language which I at the moment don't want to write a parser for (it
> > is pretty complex and big). Anyway, I have just come back to work, and
> > I am going to try out your solution, Arnar; hopefully it will work!
> > 
> > I just want to thank the community for trying to find a solution to
> > my question, and I must say it came really fast!
> > 
> > Best regards,
> > 
> > Per
> > 
> > 
> > --- In antlr-interest at yahoogroups.com, "Arnar Birgisson" 
> > <arnarb at o...> wrote:
> > > Hi..
> > > 
> > > In my earlier post, I understood Per differently. I think he wants to
> > > parse "method name{ <whatever> };" and just eat up <whatever>, including
> > > any nested braces, and put it in a variable, completely without lexing
> > > and/or parsing it. Per, is this correct?
> > > 
> > > The result of all this being a tree something like this:
> > > 
> > > METHOD
> > >  |
> > > name-body
> > > 
> > > where the body node contains anything inside the {..} as its text.
> > > 
> > > Arnar
> > > 
> > > >>> Anthony.Youngman at E... 10/07/03 1:33 PM >>>
> > > I think you're missing the point. Define a ; as SEMI. The way I'd do it
> > > (and this is all pseudocode) is
> > > 
> > > if_statement: "IF" lcurly (method)* rcurly "ELSE" lcurly (method)*
> > > rcurly SEMI ;
> > > method: blah_blah SEMI ;
> > > 
> > > That way, the lexer doesn't care whether ; is ending a method or an if
> > > clause, and the parser won't get confused because when it hits a
> > > right-curly it will be expecting an ELSE or a SEMI, and not a method.
> > > And if the ELSE is optional you just mark it as such so when the parser
> > > hits the right-curly after the if, it's expecting an ELSE or a SEMI and
> > > nothing else.
> > > 
> > > Cheers,
> > > Wol
> > > 
> > > -----Original Message-----
> > > From: pwolleba [mailto:pwolleba at y...] 
> > > Sent: 07 October 2003 08:19
> > > To: antlr-interest at yahoogroups.com
> > > Subject: [antlr-interest] Skipping grammar
> > > 
> > > 
> > > I am pretty new to ANTLR, so maybe this question is very trivial; if
> > > so, even better, since then there may be a simple solution to my problem.
> > > Anyway, I am struggling with writing a new parser in ANTLR to replace
> > > an old implementation in Flex/Bison, in order to make a product that is
> > > open for implementation from both C++ and Java.
> > > 
> > > The parser will parse a language that we are using to build
> > > databases, and it must support this language 100% if it is to be accepted.
> > > 
> > > Here is the code snippet that I am struggling with.
> > > 
> > > method name{
> > >   SomeText!()text[];
> > >   if(a < b && b < c){
> > >      SomeText()!()[];
> > >   }
> > >   else{
> > >      SomeText()!()[];
> > >   };
> > > };
> > > 
> > > I am not interested in the expression that is inside the name
> > > method; I just want ANTLR to grab the text for me and put it as a
> > > node inside the tree. The problem is that the if/else
> > > statement ends with a "};", which is the same token as the method
> > > end token, and I have no guarantee that there won't be more than one
> > > inside the method. A solution would be to make a counter that will
> > > increase for each "{" and decrease for each "}"; then I would know
> > > when the method ends. To my frustration I don't know how I should
> > > make such a counter in ANTLR in a way that still supports generating
> > > both Java and C++ code.
> > > I would be really really happy if someone could help me with this
> > > problem!
> > > 
> > > Best regards,
> > > 
> > > Per
> > > 
> > > 
> > > 
> > >  
> > > 
> > > Your use of Yahoo! Groups is subject to
> > > http://docs.yahoo.com/info/terms/ 


 




