[antlr-interest] "everything else" clause

Mon Apr 26 10:24:02 PDT 2004

Hi Ric, 

Thank you for your reply.

> The . alternative always generates non determinism errors 
> since it does not behave like 'if-nothing-matched-so-far-try-this'. 

Yes, now I know.

> In general you can read 
> the generated code and see if it does the right thing. Note 
> that you probably have to move the '.' into the rule with the 
> other alternatives because the parser will never leave the 
> above rule (barring EOF) (I don't think the greedy option 
> will matter in this case).

I could not make that work, so I did it a bit differently. The grammar I'm now using for my preprocessor parser starts like this:

  // ----- The main entry point -----
  program: 
    (
      define
      | undef
      | ifdef
      | ifndef
      | if_rule
      | include
      | rest
    )*
  ;

  rest:
    (options {greedy = false;}:
      ~(
        "#define"
        | "#undef"
        | "#ifdef"
        | "#ifndef"
        | "#if"
        | "#elif"
        | "#endif"
        | "#else"
        | "#include"
      )
    )+
  ;

  define:
    // The #define syntax is ambiguous because the optional literal could also belong to a
    // following resource definition.
    "#define"^ id: IDENTIFIER (options {greedy = true;}: value: literal)?
    {
      // Add this value (or just the identifier if there is no value) to the symbol table.
      Object value = evaluator.evaluate(value_AST);
      symbolTable.put(id.getText(), value);
    }
  ;

  undef:
    "#undef"^ id: IDENTIFIER
    {
      // Remove the identifier from the symbol table.
      symbolTable.remove(id.getText());
    }
  ;

  ifdef:
    ifstart: "#ifdef"^ resource_identifier program (else_part)? ifend: "#endif"
    {
      // Check if the given symbol exists in our symbol table and strip out...
    }
  ;

  else_part:
    "#else"^ program
    | "#elif"^ expression program (else_part)?
  ;
... Etc.	  
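
The effect of the complement set in the `rest` rule can be illustrated outside ANTLR. The following is only a sketch in plain Java, under the assumption that tokens are represented as strings: the class `RestGrouping` and its method `group` are hypothetical names, not part of the grammar or of ANTLR. It treats every directive keyword as its own group and collects everything else into runs, which is roughly what `rest` is meant to absorb. The arguments a directive consumes in the real grammar (for example the IDENTIFIER after "#define") are not modelled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not generated by or part of ANTLR.
class RestGrouping {
    // The directive keywords excluded by the ~( ... ) set in the `rest` rule.
    static final Set<String> DIRECTIVES = new HashSet<>(Arrays.asList(
        "#define", "#undef", "#ifdef", "#ifndef", "#if",
        "#elif", "#endif", "#else", "#include"));

    // Each directive keyword becomes a group of its own; every maximal run of
    // other tokens becomes one "rest" group.
    public static List<List<String>> group(List<String> tokens) {
        List<List<String>> groups = new ArrayList<>();
        List<String> run = new ArrayList<>();
        for (String t : tokens) {
            if (DIRECTIVES.contains(t)) {
                if (!run.isEmpty()) {
                    groups.add(run);
                    run = new ArrayList<>();
                }
                groups.add(new ArrayList<>(Arrays.asList(t)));
            } else {
                run.add(t);
            }
        }
        if (!run.isEmpty()) {
            groups.add(run);
        }
        return groups;
    }

    public static void main(String[] args) {
        // prints [[a, b], [#ifdef], [X, c], [#endif]]
        System.out.println(group(Arrays.asList("a", "b", "#ifdef", "X", "c", "#endif")));
    }
}
```

Listing the excluded keywords explicitly is what makes importing the lexer vocabulary necessary: the complement is only defined relative to the full token set.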

Since the "everything else" part in the parser consists of all tokens not mentioned, I had to import the lexer
vocabulary. This is no big deal, as I need it anyway for my main parser, and the files are now parsed correctly so far.
The next thing to do is to make the TokenRewriteEngine work. It looks like I have to rewrite the grammar a bit, because
I cannot address a single token in the else_part from, say, the ifdef rule, and hence I cannot strip out the proper part.
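
One possible way around the addressing problem is to record the positions of the relevant tokens while parsing and let the enclosing rule decide which ranges to drop. The sketch below is plain Java and makes several assumptions: every token carries its index in the stream, there is at most one "#else" and no "#elif", and nesting is handled elsewhere. The class `ConditionalSpans` and all of its members are hypothetical names, not ANTLR API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of ANTLR.
class ConditionalSpans {
    // Token indices recorded for one "#ifdef NAME ... [#else ...] #endif" block.
    // elseIndex is -1 when the block has no #else part.
    final int ifIndex;
    final int symbolIndex;
    final int elseIndex;
    final int endIndex;

    public ConditionalSpans(int ifIndex, int symbolIndex, int elseIndex, int endIndex) {
        this.ifIndex = ifIndex;
        this.symbolIndex = symbolIndex;
        this.elseIndex = elseIndex;
        this.endIndex = endIndex;
    }

    // Returns the inclusive {from, to} index ranges that should be stripped.
    public List<int[]> rangesToStrip(boolean symbolDefined) {
        List<int[]> ranges = new ArrayList<>();
        if (symbolDefined) {
            // Keep the first branch: drop "#ifdef NAME" and the else part, if any.
            ranges.add(new int[] {ifIndex, symbolIndex});
            if (elseIndex >= 0) {
                ranges.add(new int[] {elseIndex, endIndex});
            } else {
                ranges.add(new int[] {endIndex, endIndex});
            }
        } else if (elseIndex >= 0) {
            // Keep the else branch: drop everything up to "#else", then "#endif".
            ranges.add(new int[] {ifIndex, elseIndex});
            ranges.add(new int[] {endIndex, endIndex});
        } else {
            // No else branch: the whole block goes.
            ranges.add(new int[] {ifIndex, endIndex});
        }
        return ranges;
    }
}
```

For the token sequence `#ifdef DEBUG a #else b #endif` (indices 0 to 5), `new ConditionalSpans(0, 1, 3, 5).rangesToStrip(false)` yields the ranges 0..3 and 5..5, leaving only `b`.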

> In the action code for the . alternative you will have to 
> make sure to do the right thing with the tokens you don't 
> want to match in the parser. 

As mentioned above, my plan is to use Terence's TokenRewriteEngine to strip out the conditional parts and then feed the
result to my main parser. Once this is done, I can start writing a file name subparser.
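
The stripping step itself can be pictured as marking index ranges as deleted and then replaying whatever is left. The following is a standalone Java sketch of that idea, not the rewrite engine shipped with ANTLR: the class `RangeStripper` and its methods are hypothetical, and tokens are assumed to be plain strings addressed by their position in the original stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a rewrite buffer; not the ANTLR class.
class RangeStripper {
    private final List<String> tokens;
    private final boolean[] deleted;

    public RangeStripper(List<String> tokens) {
        this.tokens = tokens;
        this.deleted = new boolean[tokens.size()];
    }

    // Marks tokens from..to (inclusive) as removed. The original list is never
    // modified, so indices stay valid across several deletions.
    public void delete(int from, int to) {
        for (int i = from; i <= to; i++) {
            deleted[i] = true;
        }
    }

    // Returns the surviving tokens in their original order.
    public List<String> result() {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (!deleted[i]) {
                out.add(tokens.get(i));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        RangeStripper s = new RangeStripper(
            Arrays.asList("#ifdef", "DEBUG", "a", "#else", "b", "#endif", "c"));
        // Assume DEBUG is not defined: drop "#ifdef DEBUG a #else" and "#endif".
        s.delete(0, 3);
        s.delete(5, 5);
        System.out.println(s.result()); // prints [b, c]
    }
}
```

Because deletions are recorded against the original indices rather than applied in place, the ranges can be registered in any order while the preprocessor parser runs, and the main parser only ever sees the filtered result.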

> Note also that your lexer will 
> have tokenized it in some way. 

Indeed. It seems a bit like overkill, but it requires the least work. The lexer will produce the initial token stream and
the rewrite engine will pass the modified stream on to the main parser. So essentially, the tokenizing is only done once
(as before). To me that sounds like an effective solution. Still, having a true "everything else"
rule/token/you-name-it would be quite cool for skipping unwanted text parts in the parser (for the lexer there is
this nice filter stuff).

> You will maybe get strange interactions with semantic 
> predicates (depending on the action used, you may be able to 
> cheat that with an init action for the . alternative and 
> check explicitly for guessing mode there).

I only have one action in the lexer: white space skipping. So fortunately this is nothing I have to take care of.

> (Advice read a lot of generated code ;) )

Oh, I do. I have already learned a lot from the generated code and the examples. Coming from a yacc/lex background, I'm
now addicted to the recursive descent parser design. It is so wonderfully debuggable, and errors can be found quite
easily. Great stuff! (Many thanks to Terence at this point.) I only wonder why there are so few existing grammars. I'd
expect many of them for such a successful package.

Mike
--
www.soft-gems.net
