[antlr-interest] Lexer predicates...why don't they work for me?

Mon Aug 27 10:23:14 PDT 2007

Jim,

That's EXACTLY what I've been trying to get it to do for the past 4
months!  I just checked and it works!  I'd made a whole bunch of
workarounds, but finally I hit some text that none of my workarounds
could handle, lol.  Thanks!  (And thanks for the info on how the lexer
works as well.)

Matt 

> -----Original Message-----
> From: Jim Idle [mailto:jimi at temporal-wave.com] 
> Sent: Sunday, August 26, 2007 9:16 AM
> To: Diehl, Matthew J; antlr
> Subject: RE: [antlr-interest] Lexer predicates...why don't they work for me?
> 
> Try:
> 
> fragment APOSTROPHE : '\'';
> 
> CHARLIT : '\''
>             (
>                  (. '\'')=> . '\''
>                | { $type = APOSTROPHE; }
>             )
>         ;
> 
> ANTLR cannot see beyond the end of the rule/outside the rule, and you
> created two rules that can trigger the use of '\''. Hence it decided
> that if it sees '\'' it will start looking down the CharacterLiteral
> path. Your predicate (you could use that in the rule above of course)
> merely serves to tell the rule that this isn't what it should be
> matching, but gives it no alternative, hence you get a failed
> predicate error. So, what you want is to trigger both tokens by their
> common root '\'', then distinguish between the two at that point. Then
> you supply two alternatives distinguished by your predicate and it
> should all work as you require.
> 
> Of course, this does not distinguish: 'C''''
> 
> Which would become 2 CHARLITS and may not be what you want. You might
> need to process '\'' for instance?
> 
> Jim
> 
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org
> > [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Diehl, Matthew J
> > Sent: Saturday, August 25, 2007 2:13 PM
> > To: antlr
> > Subject: [antlr-interest] Lexer predicates...why don't they work for me?
> > 
> > Hi,
> > 
> > The lexer is the part of ANTLR that I do not understand at all.  I
> > think I understand what it's doing, but obviously I don't.  It always
> > feels like it is an LL(1) lexer.  For instance, if I have the
> > following rules:
> > 
> > Apostrophe : '\'' ;
> > CharacterLiteral : Apostrophe (.) Apostrophe ;
> > 
> > Given an input of:
> > foo = '0'; --works fine (token = CharacterLiteral)
> > Foo = signalA'RANGE --doesn't work.  It throws a lexer error saying
> > that 'A' is not an apostrophe (''').
> > In this case I would like it to just return ''' as Apostrophe.
> > 
> > I tried using predicates:
> > CharacterLiteral : (Apostrophe (.) Apostrophe)=> Apostrophe (.)
> > Apostrophe ;
> > 
> > And also:
> > CharacterLiteral : Apostrophe (.) Apostrophe
> >                  | Apostrophe {$type=Apostrophe;} ;
> >    /*same error as above*/
> > CharacterLiteral : {input.LA(3)==Apostrophe}? Apostrophe (.) Apostrophe ;
> >    /*threw a 'did not pass predicate' error */
> > 
> > But none of it is working.  What am I doing wrong?  Thanks for your
> > time and consideration.
> > 
> > Matt
>
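
For readers of the archive, the decision made by Jim's CHARLIT rule can be sketched as a small hand-written tokenizer. This is only an illustration of the lookahead logic, not code generated by ANTLR: the function name tokenize and the WORD and OTHER token kinds are made up for the example, and it assumes the predicate (. '\'')=> amounts to "any one character followed by a closing apostrophe".

```python
def tokenize(text):
    """Hypothetical stand-in for the CHARLIT rule's decision; not ANTLR output."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "'":
            # Both tokens share the root '\''. Only after matching it do we
            # look ahead: any character, then a closing apostrophe.
            if i + 2 < len(text) and text[i + 2] == "'":
                tokens.append(("CHARLIT", text[i:i + 3]))
                i += 3
            else:
                # Empty alternative: retype the lone quote as APOSTROPHE.
                tokens.append(("APOSTROPHE", ch))
                i += 1
        elif ch.isspace():
            i += 1
        elif ch.isalnum() or ch == "_":
            # Made-up WORD token so the sample inputs can be scanned.
            j = i
            while j < len(text) and (text[j].isalnum() or text[j] == "_"):
                j += 1
            tokens.append(("WORD", text[i:j]))
            i = j
        else:
            tokens.append(("OTHER", ch))
            i += 1
    return tokens
```

With this sketch, foo = '0'; yields a CHARLIT for '0', signalA'RANGE yields a lone APOSTROPHE between two WORD tokens, and 'C'''' splits into two CHARLIT tokens, which is the caveat Jim raises.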