[antlr-interest] Some ANTLR questions

Foolish Ewe foolishewe at hotmail.com
Fri Nov 10 11:44:53 PST 2006


Jim and Antlr-Interest Readers:

Thanks for all your help.  I have some questions about the recommended 
approach
for tree construction in the following cases.

In general, the questions are for when there are fixed suffixes with 
arbitrary or optional
repetitions of a prefix/infix with alternatives.

Consider the following cases:

case 1) a parse rule of the form
           a: B (C)+ D;

           I think a preferred AST construction rule is of the form

           tokens{
               A;
               // other tokens
           }
           a: b=B (c+ =C)+ d=d-> ^(A $b $c+ $d);

           Would we use the same approach if there were zero or more 
repetitions?
           i..e the rule was of the form

           a: B (C)* D;

case 2) suppose instead that instead of a repeated rule we have a "do it or 
skip it" of the form

           a: B (C)? D;

          should the rule look like?

          a: b=B (c+ =C)? d=d-> ^(A $b $c+ $d)

case 3) suppose instead we have an infix of alternatives (but perhaps we 
would not
           want the infix values to be roots like say  + or - operators in 
an expression
           tree for example) of the form

         a: B  (C | D) E;

         Must we refactor to distribute suffix concatenation over the 
alternatives
         and have something of the form:

         a: b=B ( (c=C e1=E)->^(A $b $c $e1)  | (d=D e2=E)->^(A $b $d $e2));

          Or is there a smarter way?

case 4) Suppose that instead we have repetitions of infix alternatives so 
that the rule reads:

         a: B (C | D)+ E;

         What form should the solution take, would we use something like:

         a: b=B (c+=C | d+=D) e=E->^(A $b $c+ $d+ $e);

Sorry for troubling you all with this, I'm a bit new to ANTLR's AST 
construction.

Regards:

Bill M.
>From: "Jim Idle" <jimi at intersystems.com>
>To: "Foolish Ewe" <foolishewe at hotmail.com>,<antlr-interest at antlr.org>
>Subject: RE: [antlr-interest] Some ANTLR questions
>Date: Thu, 9 Nov 2006 13:07:04 -0500
>
>1) UP and DOWN are indicators of tree structure when the output is an AST. 
>There are a few other things like this, depending on your target language 
>too. So, yes you cannot use these. ANTLR3 error checking is a bit lax right 
>now as this will be filled in when ANTLR3 is written in ANTLR3. It becomes 
>"obvious" as you use it right now ;-)
>2) There are differing opinions of course, but if you are doing anything 
>other than just straight output of something that can be done without 
>reference to much else you have parsed, then always use a tree. The tree 
>parser is easy to construct as the syntax of it is basically the same as 
>the rewrite rules in the parser. You will need to cover all the structure 
>in your parser really, but it isn't onerous (IMO).
>3) Both ASTERISK and ALPHAblah are lexer rules so antlr will match the 
>first one it finds basically. If you define ASTERISK as a fragment and then 
>use it in two other rules, in order of which should be matched first, you 
>will probably have more success, something like:
>
>fragment
>ASTERISK: '*' ;
>fragment
>ALPHANUM: 'a'..'z' | 0'..'9';
>
>KEY1: 'X*Y' ;
>KEY2: 'X' ;
>WILD: ALPHANUM+ ASTERISK;
>MULT: ASTERISK;
>
>You can of course use lexer predicates if it isn’t as simple as ordering. 
>If only the parser can distinguish because of context, then create LEXER 
>rules that supply the bits for the parser to do so using whatever 
>predicates might be needed, something like:
>
>expr: id (MULT^^ id)*;
>wild: ID (   (MULT)=> wild=MULT)?
>	-> {wild != null}? ^(WILDID $ID)
>...
>
>ID: ALPHANUM+;
>MULT: '*';
>
>-----Original Message-----
>From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>
>Hello All:
>
>At this point, I've gotten the entire language I'm working on recognized 
>and
>have had some good experiences for the most part with ANTLRworks (although
>I've
>tripped a few of those sorts of things that might be keeping it in Beta).
>I have a few questions
>
>1) In my language UP and DOWN are keywords, yet when I tried to create 
>rules
>to scan
>     them, ANTLRworks treated them strangely, so I changed the token names 
>to
>MYUP and MYDOWN
>     and it worked.  Are UP and DOWN keywords in ANTLR or Java (my parser
>creates a Java file).
>2) I'm now ready to actually do something with the language I'm 
>recognizing,
>and I'm wondering
>     if the smart thing is to go with an AST or to try to use rules to do
>attribute propagation
>     (back when dinosaurs ruled the earth, i did a lex/yacc, flex/bison 
>style
>compiler which uses
>       rules to do it).  If I do an AST, do I need to give AST management
>annotations to each
>       production, or can I do it incrementally for unit testing?
>3) I had some scanning issues, since my language includes undelimited
>strings (sort of like
>     identifiers in many languages).  I created a parse rule that matched 
>all
>alpha numeric
>     keywords and created  scanning rules at the end which looked like:
>ASTERISK: '*';
>//The following rule was not well received by the scanner :-(
>ALPHANUMSTRINGWITHWILDCARD	:  (ALPHANUM)+ ASTERISK;
>// I think this needs to be last to avoid recognizing keywords.
>NONKEYWORDUNDELIMITEDSTRING	: (ALPHANUM)+;
>
>     ANTLR3 and ANTLRworks didn't like it and the debugger hung, leaving 
>the
>TCP/IP
>     port (I think it was 49153) unavailable under Windows XP (my 
>employer's
>development
>     environment) and preventing future debugging attempts (I didn't bother
>resetting
>     the debugger's port number) until reboot.
>
>     As a work around, I created a rule:
>alphaNumStringWithWildcard
>	:
>	NONKEYWORDUNDELIMITEDSTRING ASTERISK // this is a bit of a hack, it allows
>white space between the two
>	;
>
>       But that since I have a rule that discards white space by sending it
>to channel=99, then
>       this should accept say 'abc*' and 'abc *' so it is  a bit over
>permissive, so I would rather
>       use the lexer (or fix this rule somehow).
>       What is the best fix for this kind of rule?
>
>Regards:
>
>Bill M.
>
>_________________________________________________________________
>Try Search Survival Kits: Fix up your home and better handle your cash with
>Live Search!
>http://imagine-windowslive.com/search/kits/default.aspx?kit=improve&locale=en-US&source=hmtagline
>
>
>--
>No virus found in this incoming message.
>Checked by AVG Free Edition.
>Version: 7.5.430 / Virus Database: 268.14.0/524 - Release Date: 11/8/2006 
>1:40 PM
>
>
>--
>No virus found in this outgoing message.
>Checked by AVG Free Edition.
>Version: 7.5.430 / Virus Database: 268.14.0/524 - Release Date: 11/8/2006 
>1:40 PM
>

_________________________________________________________________
Find a local pizza place, music store, museum and more…then map the best 
route!  http://local.live.com?FORM=MGA001



More information about the antlr-interest mailing list