[antlr-interest] Avoiding warnings without code bloat
David Piepgrass
qwertie256 at gmail.com
Mon Jun 25 16:33:29 PDT 2007
Problem Solved!
Page 285 of the ANTLR book implies that you cannot suppress
warnings in ANTLR v3 the way you could in v2.
However, an always-true semantic predicate, {true}?, appears to work nicely as a workaround:
// Strings
SQ_STRING: '\''! ({true}? ESC_SEQ | ~'\'')* '\''!;
DQ_STRING: '"'! ({true}? ESC_SEQ | ~'"')* '"'!;
BQ_STRING: '`'! ({true}? ESC_SEQ | ~'`')* '`'!;
fragment ESC_SEQ:
      '\\r' {$text = "\r";}
    | '\\n' {$text = "\n";}
    | '\\t' {$text = "\t";}
    | '\\a' {$text = "\a";}
    | '\\b' {$text = "\b";}
    | '\\f' {$text = "\f";}
    | '\\0' {$text = "\0";}
    | '\\u' HEXDIGIT HEXDIGIT HEXDIGIT HEXDIGIT
      {
        char ch = (char)int.Parse(Text, System.Globalization.NumberStyles.HexNumber);
        $text = new string(ch, 1);
      }
    | '\\'! '\''
    | '\\'! '\"'
    | '\\'! '\`';
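The \u alternative above turns four hex digits into a single character. As a standalone sketch of just that conversion (the class and method names here are hypothetical, and it assumes the four digits have already been separated from the leading \u, which the grammar action may or may not guarantee):

```csharp
using System;
using System.Globalization;

static class EscapeDemo
{
    // Hypothetical helper, not part of ANTLR or the grammar above.
    // Converts exactly four hex digits (without the "\u" prefix)
    // into a one-character string.
    public static string DecodeUnicodeEscape(string hexDigits)
    {
        if (hexDigits == null || hexDigits.Length != 4)
            throw new ArgumentException("expected exactly four hex digits", "hexDigits");

        // NumberStyles.HexNumber parses the digits as base 16;
        // int.Parse throws FormatException on non-hex characters.
        int code = int.Parse(hexDigits, NumberStyles.HexNumber, CultureInfo.InvariantCulture);
        return new string((char)code, 1);
    }
}
```

For example, EscapeDemo.DecodeUnicodeEscape("0041") returns "A".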
The generated code now performs a needless LL(2) lookahead and
contains some redundant code. For example, code that originally read
    if ( (LA19_0 == '\"') )
    {
        alt19 = 1;
    }
now reads
    if ( (LA19_0 == '\"') )
    {
        int LA19_1 = input.LA(2);
        if ( (true) )
        {
            alt19 = 1;
        }
    }
However, its behavior appears to be the same.
The C# compiler will emit some "unreachable code" warnings (CS0162) for
the generated code. You can disable them from the grammar like this:
grammar Expr;
options {
    language=CSharp;
}
@lexer::members {
    #pragma warning disable 0162
}
@parser::members {
    #pragma warning disable 0162
}
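To illustrate where those warnings come from, here is a minimal hand-written sketch (not actual ANTLR output; the class and method names are made up) of a constant-true test with an else branch, which the C# compiler flags as unreachable unless warning 0162 is disabled:

```csharp
static class UnreachableDemo
{
    // Hypothetical stand-in for a generated prediction block.
    // CS0162 is the "unreachable code detected" warning.
#pragma warning disable 0162
    public static int Pick(char la0)
    {
        int alt = 0;
        if (la0 == '"')
        {
            if (true)
            {
                alt = 1;
            }
            else
            {
                // Never runs; without the pragma this line draws CS0162.
                alt = 2;
            }
        }
        return alt;
    }
#pragma warning restore 0162
}
```

Here the pragma is restored after the method. In the grammar above it is left disabled for the rest of the generated file, which is simpler when the whole file is machine-generated.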
I think there may be a caveat: using {true}? on a nullable rule can
lead to an infinite loop if there is a syntax error in the input
stream (i.e. don't say "{true}? foo" if foo can match no input).
> I'm trying to match strings with escape sequences, so I tried this:
>
> // Strings
> SQ_STRING: '\''! (ESC_SEQ | ~'\'')* '\''!;
> DQ_STRING: '"'! (ESC_SEQ | ~'"' )* '"'!;
> fragment ESC_SEQ:
> | '\\r' {$text = "\r";}
...
> | '\\'! '"';
>
> But this produces 12 warnings, and this makes sense because in the
> first three lines, escape sequences can match the first and second
> alternatives.
...
> I can eliminate all warnings using a syntactic predicate:
>
> SQ_STRING: '\''! ((ESC_SEQ)=>ESC_SEQ | ~'\'')* '\''!;
> DQ_STRING: '"'! ((ESC_SEQ)=>ESC_SEQ | ~'\"')* '"'!;
>
> However, this changes the generated code substantially; not only does
> the lexer test for an ESC_SEQ before matching it, but ALL lexer rules,
> including rules that are in no way related to strings, have additional
> lines of code such as "if (failed) return ;" sprinkled throughout
> them.
>
> So my question is, can I get the "bloat-free" behavior of the original code:
...
> while suppressing the warnings?