[antlr-interest] Have I found an Antlr CSharp3 lexer bug if...
Sam Harwell
sharwell at pixelminegames.com
Thu Aug 4 07:04:08 PDT 2011
Hi Chris,
I'm using the released version 3.4.0 of the ANTLR CSharp3 target. I
copy/pasted the grammar below (aside from renaming it to Preprocessor) and
it passed the following unit test.
[TestMethod]
public void TestEmptyComment()
{
    string inputText = "/**/";
    var input = new ANTLRStringStream(inputText);
    var lexer = new PreprocessorLexer(input);
    var tokenStream = new CommonTokenStream(lexer);
    tokenStream.Fill();
    List<IToken> tokens = tokenStream.GetTokens();
    Assert.AreEqual(2, tokens.Count);
    Assert.AreEqual(PreprocessorLexer.DELIMITED_COMMENT, tokens[0].Type);
    Assert.AreEqual(inputText, tokens[0].Text);
    Assert.AreEqual(PreprocessorLexer.EOF, tokens[1].Type);
}
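The report quoted further down used the input "/**/ " with a trailing space, so a variant of the test for that input may also be worth running. The following is only a minimal sketch: it assumes the same generated PreprocessorLexer, the class name PreprocessorLexerTests is hypothetical, and the expected token count of 2 assumes the trailing space is skipped by the WHITESPACE rule. That expectation is what the grammar intends, not a result verified here.

```csharp
using System.Collections.Generic;
using Antlr.Runtime;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical test class; PreprocessorLexer is the lexer ANTLR generates
// from the grammar quoted in this thread (renamed to Preprocessor).
[TestClass]
public class PreprocessorLexerTests
{
    [TestMethod]
    public void TestEmptyCommentWithTrailingSpace()
    {
        // Same input as the original report: an empty comment plus one space.
        string inputText = "/**/ ";
        var input = new ANTLRStringStream(inputText);
        var lexer = new PreprocessorLexer(input);
        var tokenStream = new CommonTokenStream(lexer);
        tokenStream.Fill();
        List<IToken> tokens = tokenStream.GetTokens();

        // Assumption: WHITESPACE is skipped, leaving the comment and EOF.
        Assert.AreEqual(2, tokens.Count);
        Assert.AreEqual(PreprocessorLexer.DELIMITED_COMMENT, tokens[0].Type);
        Assert.AreEqual("/**/", tokens[0].Text);
        Assert.AreEqual(PreprocessorLexer.EOF, tokens[1].Type);
    }
}
```

If this variant fails while the test without the trailing space passes, that would point at the trailing whitespace, rather than the comment itself, as the input that sends prediction into PP_SKIPPED_CHARACTERS.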
Sam
From: chris king [mailto:kingces95 at gmail.com]
Sent: Thursday, August 04, 2011 3:48 AM
To: Sam Harwell; antlr-interest at antlr.org
Subject: Re: Have I found an Antlr CSharp3 lexer bug if...
Sam, while trying to build my pre-processor with a mixed parser/lexer I ran
across what I think might be a bug. I reduced the repro below. I expected
the program below to accept "/**/ " but instead it fails because the lexer
prediction enters PP_SKIPPED_CHARACTERS. That rule has a gated semantic
predicate which is always false. I expected a lexer rule with a gated
semantic predicate which is always false to never be matched. If I comment
out the PP_SKIPPED_CHARACTERS rule then it does match "/**/ ". So the
inclusion of that rule is causing the problem. Let me know if you think this
is a bug and if you can repro.
Thanks,
Chris
grammar Bug;
options {
    language=CSharp3;
    output=AST;
}
public start
    : DELIMITED_COMMENT !EOF
    ;
PP_SKIPPED_CHARACTERS
    : { false }? => ~(F_NEW_LINE_CHARACTER | F_PP_POUND_SIGN)
      F_INPUT_CHARACTER*
    ;
DELIMITED_COMMENT
    : { true }? => '/*' .* '*/'
    ;
WHITESPACE
    : F_WHITESPACE {skip();}
    ;
fragment F_WHITESPACE
    : (' ' | '\t' | '\v' | '\f')+
    ;
fragment F_NEW_LINE_CHARACTER
    : '\r'
    | '\n'
    ;
fragment F_PP_POUND_SIGN
    : '#'
    ;
fragment F_INPUT_CHARACTER
    : ~F_NEW_LINE_CHARACTER
    ;