[antlr-interest] Re: lexical nondeterminism warning

tinker tinker at sogetthis.com
Thu Jan 5 01:52:18 PST 2006


Hi again,
  Well, I know how to restructure my lexer rules so that the above
errors disappear, but that isn't exactly what I want. Let me go into
more detail and explain:
   The language can have two types of comments embedded in it:
   - Single line comments: These begin with a ' character and run
to the end of the line
       ' This is a single line comment
   - Multiline comments: These are just like the HTML comments.
            <!-  this is a multiline comment  -->

Now I can define the following two rules to handle these comments:
---------------------------------------------
 MULTI_COMMENT
  :
  "<!-"
      (options {
			generateAmbigWarnings=false;
		  }:
      {!(LA(2)=='-' && LA(3)=='>')}? '-' // allow '-' if not "-->"
        | WS
        | ~( '-' | '\n'|'\r')
      )*
    "-->";

 SINGLE_COMMENT
  :
    '\''
      (
        {LA(2) != '>'}? '%' // the script is embedded within <% %> tags
        | ~('\n' | '%')
      )*
  {	if (LA(1) == '\n')
	{
		match('\n');
		newline();
	}
  };
 ---------------------------------------------
Now, when I compile this with ANTLR 2.7.6, I get no errors at all.
However, this introduces two token types into the token stream, one for
each type of comment. Instead, I want a single token type for all
kinds of comments in the file. So I defined another rule as
follows:
 ---------------------------------------------
protected  MULTI_COMMENT
 :
.....
;
protected  SINGLE_COMMENT
 :
.....
;

COMMENT
:
 SINGLE_COMMENT
 |
  MULTI_COMMENT
;
 ---------------------------------------------
But when I try to compile the grammar now, I get the warnings again.
They are the same warnings as before, and are reproduced below:
=========================================
warning:lexical nondeterminism between rules LE and COMMENT upon
    k==1:'<'
    k==2:'<','='
    k==3:<end-of-token>
warning:lexical nondeterminism between rules NEQ and COMMENT upon
    k==1:'<'
    k==2:'>'
    k==3:<end-of-token>
warning:lexical nondeterminism between rules END and COMMENT upon
    k==1:'<'
    k==2:'/'
    k==3:'s'
=========================================
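
One possible workaround, as an untested sketch only: keep both comment
rules public, as in the first version that compiled cleanly, and have
each rule set a shared token type. This assumes ANTLR 2's $setType lexer
action and a COMMENT token declared in a tokens section (the COMMENT rule
itself would then be dropped). I have not verified that it behaves as
intended in this grammar, and it would not explain the warnings
themselves.

```antlr
// Sketch under the assumptions stated above; the rule bodies are the ones
// from the first version and are only indicated by comments here.
tokens {
  COMMENT;   // imaginary token type shared by both comment rules
}

MULTI_COMMENT
  : "<!-"
    // same loop body as in the first version of MULTI_COMMENT
    "-->"
    { $setType(COMMENT); }   // report this token as COMMENT
  ;

SINGLE_COMMENT
  : '\''
    // same loop body and newline action as in the first version
    { $setType(COMMENT); }   // report this token as COMMENT
  ;
```

With this, the parser would only ever see COMMENT, whichever lexer rule
matched.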

So can anyone tell me why this is happening, and what I can do to get
around this?

Thanks,
T

