[antlr-interest] newbie question about nondeterminism between keywords and identifiers

Martin Nordin martin.nordin at gmail.com
Fri Feb 2 20:58:00 PST 2007


Hi David.

I think you'll have to move the type identifying stuff to the parser.

As I'm quite a newbie myself, I don't know if this is what you need, or even
if it is a good approach.

If you have more types than just date you probably have to add a type-rule
in the parser:

type : ( "date" | "int" ) ;

and change decl to use type instead of "date".
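
A minimal sketch of that change, in the same ANTLR 2 syntax as the test
grammar below. "int" is only an assumed second built-in type for
illustration. As far as I understand it, quoted strings in parser rules end
up in the literals table, and testLiterals=true makes IDENT check that table
only after the whole identifier has been matched, so "date1" or "date_foo"
should stay an IDENT while a bare "date" becomes the keyword token:

// built-in type names; add further alternatives as needed
type
  : "date"
  | "int"     // assumed second type, for illustration only
  ;

// a declaration now accepts any built-in type
decl
  : IDENT COLON type SEMI
  ;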

Here is my test grammar, you need to run it through a debugger to see that
it actually does something:

header { import java.io.*; }

class DateLexer extends Lexer;

options { k=1; }
WS
  :
  (' '
  | '\t'
  | '\r' '\n' { newline(); }
  | '\n'      { newline(); }
  )
  { $setType(Token.SKIP); } ;

IDENT
options {testLiterals=true;}
  : ('_'|'a'..'z')('_'|'a'..'z'|'0'..'9')*
  ;

COLON  : ':';
SEMI   : ';';

class DateParser extends Parser;

{

  // a sample main
  public static void main(String[] args)
  {

    // Use a try/catch block for parser exceptions
    try {
      // the string literal must stay on one line (the mail client wrapped it)
      InputStream input  = new StringBufferInputStream(
          "date1 : date; date2 : date;");
      DateLexer   lexer  = new DateLexer(input);
      DateParser  parser = new DateParser(lexer);
      parser.declarations();
    }
    catch (Exception e) {
      System.err.println("parser exception: "+e);
      e.printStackTrace();   // so we can get stack trace
    }
  }
}

decl:
  IDENT COLON "date" SEMI
  ;

declarations :
  (decl)*
  ;

Regards,
Martin


On 2/1/07, David Guy <dguy at bea.com> wrote:
>
>  I have a typical lexer IDENT rule:
>
> IDENT
> options {testLiterals=true;}
>    : ('_'|'a'..'z')('_'|'a'..'z'|'0'..'9')*
>   ;
>
> The language has some built-in types. For example (from the lexer):
>
> TYPE_DATE   :"date";
> // declares type
> COLON        : ':';
>
> In my parser, if I have a rule like:
>
> decl:
> IDENT COLON TYPE_DATE
> ;
>
> I cannot parse "mydate : date" or "date_foo : date". The first example
> gets IDENT, then an unexpected TYPE_DATE, and the second gets an unexpected
> TYPE_DATE.
>
> I know this is very basic stuff, but I have looked at sample Java grammars
> and don't see anything different, and of course in Java you can say
>
> int myint; int int_xxx;
>