[antlr-interest] Lexer rule match keyword or identifier depending on the previous token

Thu Aug 23 12:43:01 PDT 2007

I'll have to look this over quite a few times.
I'm not savvy enough yet to understand how this works.

W.

-----Original Message-----
From: Guillaume Chavanon [mailto:guillaume.chavanon at systemsvip.com] 
Sent: Thursday, August 23, 2007 8:27 AM
To: Edwards, Waverly
Cc: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Lexer rule match keyword or
identifier depending on the previous token

Hi,

I just wrote a function (in Java) that looks backward in the TokenStream
for the special character (a caret). We have to pass it as a lexer
member.
The function skips hidden tokens. It can be used in a gated semantic
predicate in the lexer to match "int", "char" or "string" as a keyword or
an identifier.
I have successfully parsed the following input:

int i1 = 30 ;
char c1 = 'h' ;
int i2 = c^int ;
string s1 = "hello" ;
char c2 = s1^ char(2) ;
int i3 = s1^char(2)^int ;

Here, "int" and "char" after a caret are emitted as IDENTIFIER tokens,
whereas the other occurrences are tokens of type INT, CHAR or STRING.
This holds even for the second "char" in line 5, whose previous token is a WS.
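
The backward scan can be sketched in plain Java as follows. This is
not the attached grammar: it is a minimal illustration that does not use
the ANTLR runtime, and the names Tok, CaretLookback,
previousVisibleIsCaret and classify are hypothetical. It assumes each
token already emitted is available in a list, with a flag for hidden
tokens such as WS.

```java
import java.util.List;

/** Hypothetical stand-in for an ANTLR token: only a type name and a hidden flag. */
final class Tok {
    final String type;    // e.g. "CARET", "WS", "IDENTIFIER"
    final boolean hidden; // true for hidden tokens such as whitespace

    Tok(String type, boolean hidden) {
        this.type = type;
        this.hidden = hidden;
    }
}

final class CaretLookback {
    /** Returns true if the nearest non-hidden token at the end of the list is a caret. */
    static boolean previousVisibleIsCaret(List<Tok> emitted) {
        for (int i = emitted.size() - 1; i >= 0; i--) {
            Tok t = emitted.get(i);
            if (t.hidden) {
                continue; // skip WS and other hidden tokens
            }
            return t.type.equals("CARET"); // the first visible token decides
        }
        return false; // no visible token yet: start of input
    }

    /** Picks the token type for text that may be a keyword or an identifier. */
    static String classify(String text, List<Tok> emitted) {
        if (previousVisibleIsCaret(emitted)) {
            return "IDENTIFIER"; // after a caret, keywords lose their special meaning
        }
        switch (text) {
            case "int":    return "INT";
            case "char":   return "CHAR";
            case "string": return "STRING";
            default:       return "IDENTIFIER";
        }
    }
}
```

In a real lexer the call to previousVisibleIsCaret would sit inside the
gated semantic predicate of each keyword alternative. For the second
"char" in line 5, the tokens already emitted end with IDENTIFIER, CARET,
WS, so the scan skips the WS, finds the caret, and the text is classified
as IDENTIFIER.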

See the attached grammar for more details.

Guillaume Chavanon

Edwards, Waverly wrote:
> Interesting enough.  This afternoon I thought of a case that I will
> need to deal with that works exactly like what you described.  Looking
> ahead is not going to be the same as looking back, or at the very least
> the logic is going to be more complicated looking ahead than looking
> back.  When I get to that point in my grammar I'll let you know what I
> did.  I want to deal with the case in the lexer and not the parser, so
> this should be interesting.  I don't know if there is a
> 'previousToken' (I doubt it).  Since storing each token seems like a
> waste, I'll try to come up with something.
>
>
> W.
>
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org 
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Edwards, 
> Waverly
> Sent: Monday, August 20, 2007 9:22 AM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Lexer rule match keyword or
> identifier depending on the previous token
>
>  
> Is this different than looking ahead?  If you are at token B and token
> A determines how you respond to token B, then wouldn't this be the
> same as being at token A, looking ahead and seeing token B, then
> responding the same way?  I don't know the circumstances, so I don't
> really know.  I haven't figured out how the lookahead in ANTLR works
> just yet, but I know it exists.
>
>
> W.
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org 
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Guillaume 
> Chavanon
> Sent: Monday, August 20, 2007 5:45 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Lexer rule match keyword or identifier 
> depending on the previous token
>
> Hi all,
>
> I am writing a grammar for a language that allows some keywords to be
> identifiers if they are preceded by a special character.
> Is it possible to add a gated semantic predicate in a lexer rule that
> will test the previous token?
> I would like to do something like this:
>
> IDENTIFIER
>     : { previousToken != ... }?=> 'KEYWORD1' {$type=KEYWORD1;}
>     | { previousToken != ... }?=> 'KEYWORD2' {$type=KEYWORD2;}
>     | 'A'..'Z' ( '_'? ( 'A'..'Z' | '0'..'9' ) )*
>     ;
>
> Thanks in advance,
> Guillaume Chavanon
>