[antlr-interest] Have I found an Antlr CSharp3 lexer bug if...

Sam Harwell sharwell at pixelminegames.com
Thu Jul 28 14:37:05 PDT 2011


Hi Chris,

Lookahead prediction occurs before predicates are evaluated. If fixed
lookahead uniquely determines the alternative with a semantic predicate,
the predicate will not be evaluated as part of the decision process. I'm
guessing (but not 100% sure) that if you use a gated semantic predicate, the
lexer will not enter the rule:

PP_SKIPPED_CHARACTERS
  : {false}? => ( ~(F_NEW_LINE_CHARACTER | '#') F_INPUT_CHARACTER* F_NEW_LINE )*
  ;

Also, a word of warning: this lexer rule can match a zero-length character
span, which could result in an infinite loop. You should always ensure that
every path through any lexer rule that's not marked "fragment" consumes
at least one character. There's also a bug with certain exceptions in the
lexer that can cause infinite loops; this has been resolved for release 3.4,
but I haven't released it yet.
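
One way to sketch the zero-length warning above, offered as an untested
assumption rather than a verified fix: changing the outer `*` to `+` means
the rule has to consume at least one character before it can succeed. The
rule and fragment names are the ones from the grammar in this thread; the
predicate is left out only to keep the focus on the loop, and this assumes
an empty skipped section is handled somewhere else in the grammar.

```antlr
// Sketch only (untested assumption): '+' instead of '*' on the outer loop,
// so every successful match consumes at least one character.
PP_SKIPPED_CHARACTERS
  : ( ~(F_NEW_LINE_CHARACTER | '#') F_INPUT_CHARACTER* F_NEW_LINE )+
  ;
```

With `+`, the first element of each iteration, `~(F_NEW_LINE_CHARACTER | '#')`,
must match one character, so an empty match is no longer possible.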

Sam

From: chris king [mailto:kingces95 at gmail.com] 
Sent: Thursday, July 28, 2011 4:19 PM
To: antlr-interest at antlr.org; Sam Harwell
Subject: Have I found an Antlr CSharp3 lexer bug if...

Have I found an Antlr CSharp3 lexer bug if I can alter program execution
(exception instead of no exception) by introducing a lexer production with a
predicate that is always false? For example:

PP_SKIPPED_CHARACTERS
  : { false }? ( ~(F_NEW_LINE_CHARACTER | '#') F_INPUT_CHARACTER* F_NEW_LINE )*
  ;

I would think that such a production should always be ignored because its
predicate is always false, and therefore it would never alter program
execution. Yet I'm seeing a change in the execution of my program: it enters
this function and throws a FailedPredicateException. I wouldn't have expected
this function ever to be executed, because the predicate is always false.

[GrammarRule("PP_SKIPPED_CHARACTERS")]
private void mPP_SKIPPED_CHARACTERS()
{
    EnterRule_PP_SKIPPED_CHARACTERS();
    EnterRule("PP_SKIPPED_CHARACTERS", 31);
    TraceIn("PP_SKIPPED_CHARACTERS", 31);
    try
    {
        int _type = PP_SKIPPED_CHARACTERS;
        int _channel = DefaultTokenChannel;
        // CSharp\\CSharpPreProcessor.g:197:3: ({...}? (~ ( F_NEW_LINE_CHARACTER | F_POUND_SIGN ) ( F_INPUT_CHARACTER )
        DebugEnterAlt(1);
        // CSharp\\CSharpPreProcessor.g:197:5: {...}? (~ ( F_NEW_LINE_CHARACTER | F_POUND_SIGN ) ( F_INPUT_CHARACTER )
        {
        DebugLocation(197, 5);
        if (!(( false )))
        {
            throw new FailedPredicateException(input, "PP_SKIPPED_CHARACTERS", " False() ");
        }

Sam, I'm on an all-CSharp stack, v3.3.1.7705. I'm using your VS plugin (which
is wonderful) and build integration to generate the lexer/parser (also
wonderful), and then running on top of your CSharp port of the runtime. If
you think this is a bug and you'd like to have a look at the repro, please
let me know. The project is open source, up on CodePlex.

Thanks,
Chris


