[antlr-interest] Matching tokens only at certain places

Emond Papegaaij e.papegaaij at student.utwente.nl
Mon Jun 19 07:13:12 PDT 2006


On Monday 19 June 2006 15:32, Emond Papegaaij wrote:
> On Monday 19 June 2006 15:03, you wrote:
> > On 6/19/06, Emond Papegaaij <e.papegaaij at student.utwente.nl> wrote:
>
> <CUT howto parse 'iface' (~';') ';'>
>
> The example grammar (in the previous mail) matches everything as a
> METHOD_SIG_ACTION. I've studied the DFA created by ANTLR, and it is clear
> that the only way to reach the IDENTIFIER token is by ending with <EOT>.
> METHOD_SIG_ACTION matches everything, including IDENTIFIERs. Therefore when
> starting to match an IDENTIFIER, it will switch to METHOD_SIG_ACTION as
> soon as it matches something that is not a letter or ';'.

Replying to myself.

I've tried to replace
  METHOD_SIG_ACTION: (~';')+ ;
with
  METHOD_SIG_ACTION: {sig}?=> (~';')+ ;
and set 'sig' to 'true' when the token is valid. However, the predicate does 
not have the desired effect. Instead of disabling only that token, it disables 
every path the token could match. This is what the resulting s0 DFA state in 
the generated code looks like:

public DFA.State transition(IntStream input) throws RecognitionException {
  int LA3_0 = input.LA(1);
  if ( LA3_0=='{' && (sig)) {return s1;}
  if ( LA3_0=='}' && (sig)) {return s2;}
  if ( LA3_0=='i' && (sig)) {return s3;}
  if ( LA3_0==';' ) {return s4;}
  if ( (..)||(..)||(..) && (sig)) {return s5;}
  if ( (..)||(..)||LA3_0==' ' && (sig)) {return s6;}
  if ( (..)||..||(..)||(..)||(..)||(..)||..||(..) && (sig)) {return s7;}
  NoViableAltException nvae =
    new NoViableAltException("", 3, 0, input);
  throw nvae;
}

It is clear that this disables all paths except "LA3_0==';'" when 'sig' is 
false. As a result, the lexer will only accept ';' tokens as long as 'sig' is 
false. Am I using the {..}?=> predicates incorrectly?
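
To make the effect easier to see outside the lexer, here is a small 
stand-alone sketch of the same gating pattern. It is not the generated code: 
the class name, the method and the state names returned as strings are made 
up for illustration, and the elided character classes (the "(..)" parts 
above) are collapsed into a single catch-all branch, which is an assumption 
on my part.

```java
// Hypothetical stand-in for the generated s0 transition shown above.
// Only the '{', '}', 'i' and ';' branches come from the generated code;
// everything else is folded into one catch-all branch (an assumption).
final class GatedTransitionDemo {

    // Returns the name of the target state, or null where the generated
    // code would throw NoViableAltException.
    static String transition(char la, boolean sig) {
        if (la == '{' && sig) { return "s1"; }
        if (la == '}' && sig) { return "s2"; }
        if (la == 'i' && sig) { return "s3"; }
        if (la == ';')        { return "s4"; } // the only ungated edge
        if (sig)              { return "other"; } // stands in for s5..s7
        return null;
    }

    public static void main(String[] args) {
        // Print the transition table for both values of the gate.
        for (boolean sig : new boolean[] { false, true }) {
            for (char c : "{}i;x ".toCharArray()) {
                String target = transition(c, sig);
                System.out.println("sig=" + sig + " '" + c + "' -> "
                    + (target == null ? "no viable alternative" : target));
            }
        }
    }
}
```

With 'sig' false, every input character except ';' ends up without a viable 
alternative, which matches the behaviour I see in the real lexer.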

Best regards,
Emond Papegaaij

