[antlr-interest] Lexer question

Jim Idle jimi at temporal-wave.com
Tue Aug 23 15:39:44 PDT 2011


Except that creates lookahead that you should really left-factor anyway,
still allows the whitespace, and uses literals in the parser that are
difficult to identify in error messages, as their token names are made up
on the fly.
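
A minimal sketch of what left-factoring with named tokens could look
like in ANTLR 3 syntax. The rule name "statement", the token names
COLON, PLUS and MINUS, the DIGIT fragment, and the placeholder
"expression" rule are illustrative assumptions, not taken from the
original grammar. It does not address the whitespace point above.

    grammar Sketch;

    options { output = AST; }

    // Imaginary token used only as an AST root (assumed name).
    tokens { NAME_ID; }

    // The common IDENTIFIER prefix is factored out, so the parser
    // decides on the single token that follows it.
    statement
        : IDENTIFIER
          ( COLON expression -> ^(NAME_ID IDENTIFIER expression)
          | PLUS  expression -> ^(PLUS IDENTIFIER expression)
          | MINUS expression -> ^(MINUS IDENTIFIER expression)
          )
        ;

    // Placeholder only; the real expression rule is not shown
    // in this thread.
    expression : IDENTIFIER ;

    // Named lexer rules instead of quoted literals in the parser,
    // so error messages can refer to COLON, PLUS and MINUS.
    COLON      : ':' ;
    PLUS       : '+' ;
    MINUS      : '-' ;
    IDENTIFIER : LETTER (LETTER | DIGIT)* ;

    fragment LETTER : 'a'..'z' | 'A'..'Z' ;
    fragment DIGIT  : '0'..'9' ;

    WS : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;

With this shape, whether an identifier is a "name" is recorded in the
tree by the NAME_ID root rather than by a separate lexer token.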


Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of
> Weiler-Thiessen, David, SASKATOON, Engineering
> Sent: Tuesday, August 23, 2011 3:07 PM
> To: John B. Brodie; Scott Smith
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Lexer question
>
> Hi
>
> I was going to suggest the same as what John suggests.
>
> If it matters what type of IDENTIFIER it is, you can capture that in a
> rewrite rule if you're building an AST:
>
> rule1:	IDENTIFIER ':' expression -> ^(NAME_ID IDENTIFIER expression) ;
>
> rule2:
>                IDENTIFIER '+' expression -> ^('+' IDENTIFIER expression)
>        |       IDENTIFIER '-' expression -> ^('-' IDENTIFIER expression)
>        ;
>
>
> David Weiler-Thiessen
> Nestlé Purina PetCare
> phone: 306-933-0232
> cell: 306-291-9770
>
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of John B. Brodie
> Sent: Tuesday, August 23, 2011 3:58 PM
> To: Scott Smith
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Lexer question
>
> try
>
> rule1: IDENTIFIER ':' expression ;
>
> that is, why bother differentiating the two tokens?
>
> On Tue, 2011-08-23 at 21:19 +0000, Scott Smith wrote:
> > I have a parser that is doing pretty much what I want.  However, I
> > want to do the following.
> >
> > I have a definition for an IDENTIFIER
> >
> > IDENTIFIER: LETTER (LETTER | NUMBER)* ;   // LETTER and NUMBER mean the usual thing
> >
> > Now in some of my rules, I'm looking for an IDENTIFIER and in one of
> > my rules I look for a NAME.  NAME has exactly the same definition as
> > IDENTIFIER (starts with a letter followed by alphanumerics).  However,
> > you can tell by the token after whether it was a NAME or an
> > IDENTIFIER.  To be more explicit, a NAME is ALWAYS followed by a
> > colon.  An IDENTIFIER can be followed by a number of things, but
> > NEVER by a colon.
> >
> > So, I have rules that look something like:
> >
> > rule1:
> >                 NAME ':' expression
> >                 ;
> >
> > rule2:
> >                 IDENTIFIER '+' expression
> >       |       IDENTIFIER  '-' expression
> >      ;
> >
> > I don't seem to be able to make this work.  Can someone suggest a
> > solution?  Do I have to turn on backtracking to make this work?
> >
>
>
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address

