[antlr-interest] Have I found an Antlr CSharp3 lexer bug if...
Sam Harwell
sharwell at pixelminegames.com
Thu Jul 28 14:37:05 PDT 2011
Hi Chris,
Lookahead prediction occurs before predicates are evaluated. If fixed
lookahead uniquely determines the alternative with a semantic predicate,
the predicate will not be evaluated as part of the decision process. I'm
guessing (but not 100% sure) if you use a gated semantic predicate, then it
will not be entering the rule:
PP_SKIPPED_CHARACTERS
: {false}? => ( ~(F_NEW_LINE_CHARACTER | '#') F_INPUT_CHARACTER*
F_NEW_LINE )*
;
Also, a word of warning: this lexer rule can match a zero-length character
span, which could result in an infinite loop. You should always ensure that
every path through any lexer rule that's not marked "fragment" will consume
at least 1 character. There's also a bug with certain exceptions in the
lexer that can cause infinite loops - this has been resolved for release 3.4
but I haven't released it yet.
Sam
From: chris king [mailto:kingces95 at gmail.com]
Sent: Thursday, July 28, 2011 4:19 PM
To: antlr-interest at antlr.org; Sam Harwell
Subject: Have I found an Antlr CSharp3 lexer bug if...
Have I found an Antlr lexer CSharp3 bug if I can alter program execution
(exception instead of no exception) by introducing a lexer production with a
predicate that is always false? For example
PP_SKIPPED_CHARACTERS
: { false }? ( ~(F_NEW_LINE_CHARACTER | '#') F_INPUT_CHARACTER* F_NEW_LINE
)*
;
I would think that such a production should always be ignored because it's
predicate is always false and therefore would never alter program execution.
Yet I'm seeing a change in the execution of my program. I'm seeing it enter
this function and throw a FailedPredicateException. I wouldn't have expected
that this function should ever even have been executed because the predicate
is always false.
[GrammarRule("PP_SKIPPED_CHARACTERS")]
private void mPP_SKIPPED_CHARACTERS()
{
EnterRule_PP_SKIPPED_CHARACTERS();
EnterRule("PP_SKIPPED_CHARACTERS", 31);
TraceIn("PP_SKIPPED_CHARACTERS", 31);
try
{
int _type = PP_SKIPPED_CHARACTERS;
int _channel = DefaultTokenChannel;
// CSharp\\CSharpPreProcessor.g:197:3: ({...}? (~ (
F_NEW_LINE_CHARACTER | F_POUND_SIGN ) ( F_INPUT_CHARACTER )
DebugEnterAlt(1);
// CSharp\\CSharpPreProcessor.g:197:5: {...}? (~ (
F_NEW_LINE_CHARACTER | F_POUND_SIGN ) ( F_INPUT_CHARACTER )
{
DebugLocation(197, 5);
if (!(( false )))
{
throw new FailedPredicateException(input,
"PP_SKIPPED_CHARACTERS", " False() ");
}
Sam, I'm on an all CSharp stack v3.3.1.7705. I'm using your VS plugin (which
is wonderful) and build integration to generate the lexer/parser (also
wonderful) and then running on top of your CSharp port of the runtime. If
you think this is a bug and you'd like to have a look at the repro please
let me know. The project is open source up on CodePlex.
Thanks,
Chris
More information about the antlr-interest
mailing list