[antlr-interest] Avoiding warnings without code bloat
David Piepgrass
qwertie256 at gmail.com
Mon Jun 25 15:07:32 PDT 2007
> Consult the examples for ways of doing this, you will find that the C
> parser and Java parser are set up to handle this.
> Jim
Actually, the C and Java parsers seem to do exactly what I tried to
do! Look at this from the C example:
STRING_LITERAL
: '"' ( EscapeSequence | ~('\\'|'"') )* '"'
;
fragment
EscapeSequence
: '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
| OctalEscape
;
And the following is from Java.g:
StringLiteral
: '"' ( EscapeSequence | ~('\\'|'"') )* '"'
;
fragment
EscapeSequence
: '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
| UnicodeEscape
| OctalEscape
;
Compare what I tried:
SQ_STRING: '\''! (ESC_SEQ | ~('\'' | '\\'))* '\''!;
DQ_STRING: '"'! (ESC_SEQ | ~('\"' | '\\'))* '"'!;
fragment ESC_SEQ:
| '\\r' {$text = "\r";}
| '\\n' {$text = "\n";}
| '\\t' {$text = "\t";}
| '\\a' {$text = "\a";}
| '\\b' {$text = "\b";}
| '\\f' {$text = "\f";}
| '\\0' {$text = "\0";}
| '\\u' HEXDIGIT HEXDIGIT HEXDIGIT HEXDIGIT { ... }
| '\\'! '\''
| '\\'! '\"'
| '\\'! '\`';
warning(200): Expr.g:68:41: Decision can match input such as "'\''"
using multiple alternatives: 1, 3
As a result, alternative(s) 3 were disabled for that input
warning(200): Expr.g:68:41: Decision can match input such as
"{'\u0000'..'&', '('..'\uFFFE'}" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(201): Expr.g:68:41: The following alternatives are unreachable: 2,3
warning(200): Expr.g:69:40: Decision can match input such as "'"'"
using multiple alternatives: 1, 3
As a result, alternative(s) 3 were disabled for that input
warning(200): Expr.g:69:40: Decision can match input such as
"{'\u0000'..'!', '#'..'\uFFFE'}" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(201): Expr.g:69:40: The following alternatives are unreachable: 2,3
If the C example gives no warnings, I don't know why there is a difference.
But as I pointed out, I would like to know how to accept inputs with
invalid escapes like "\Q". A simple solution would be to add this
extra alt at the end of ESC_SEQ:
| '\\' .;
But this produces a crapload of warnings. This can be avoided by writing
| '\\' ~('r'|'n'|'t'|'a'|'b'|'f'|'0'|'u'|'\''|'"'|'`');
instead, but it's a tedious solution (and it doesn't generalize very
well to more complicated scenarios.)
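For reference, here is a rough sketch of how ESC_SEQ might look with that
last alternative folded in. It is untested and makes some assumptions:
ANTLR 3 syntax, the actions and '!' operators are left out for brevity,
and HEXDIGIT is a fragment rule defined elsewhere in the grammar.
fragment ESC_SEQ
: '\\' ('r'|'n'|'t'|'a'|'b'|'f'|'0'|'\''|'"'|'`') // recognized one-character escapes
| '\\' 'u' HEXDIGIT HEXDIGIT HEXDIGIT HEXDIGIT // Unicode escape
| '\\' ~('r'|'n'|'t'|'a'|'b'|'f'|'0'|'u'|'\''|'"'|'`') // anything else, e.g. \Q
;
Note that the first alternative follows the colon directly. A '|' placed
right after the colon, as in my ESC_SEQ above, gives the rule an empty
first alternative, so that may be worth ruling out as a source of the
warnings.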