[antlr-interest] should lexical rules with identical content be treated equally?
Claude Quézel
cquezel at mechatools.com
Fri Oct 12 12:59:19 PDT 2007
I'm very new to ANTLR, and this may be a trivial error of mine, but here
goes:
Given this grammar (cut down for this post):
paragraph : Non_variable;
protected Variable : '{' DVCONTENT '.' ID '}' ; // **** this is the line that puzzles me! ****
protected Non_variable : (ESCAPED_SEQUENCE | CHARACTERS)*;
fragment CHARACTERS : ~SPECIAL_CHARACTERS;
fragment ESCAPED_SEQUENCE : '\\' SPECIAL_CHARACTERS;
fragment SPECIAL_CHARACTERS : ('{' | '}' | '\\' );
fragment DVCONTEXT : ID;
fragment ID : IDLETTER (IDLETTER | DIGIT)*;
fragment IDLETTER : 'a'..'z'|'A'..'Z';
fragment DIGIT : '0'..'9';
If I run this on the input "a\nb", I get the warning "line 1:1
mismatched character '\n' expecting set null".
If I change the "puzzling line" to:
protected Variable : '{' ID '.' ID '}' ; // **** this is the line that puzzles me! ****
then I do not get the warning. Comparing the generated lexer code for the
two grammars, exactly one line differs (both variants are shown, marked /*differ*/):
public final void mCHARACTERS() throws RecognitionException {
    try {
        // D:test.g:25:21: (~ SPECIAL_CHARACTERS )
        // D:test.g:25:23: ~ SPECIAL_CHARACTERS
        {
            /*differ*/ if ( (input.LA(1)>='\u0000' && input.LA(1)<='\t')||(input.LA(1)>='\u000B' && input.LA(1)<='\uFFFE') ) { // first case (skips \n)
            /*differ*/ if ( (input.LA(1)>='\u0000' && input.LA(1)<='\b')||(input.LA(1)>='\n' && input.LA(1)<='\uFFFE') ) { // second case (skips \t)
                input.consume();
            }
            else {
                MismatchedSetException mse =
                    new MismatchedSetException(null,input);
                recover(mse); throw mse;
            }
        }
    }
    finally {
    }
}
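To make the difference concrete, here is a small standalone sketch that copies the two range checks out of the generated code and tries them on a few characters. The class and method names (GeneratedSetCheck, firstCase, secondCase) are made up for this post; only the two conditions come from the generated lexer.

```java
// Sketch only: the two range checks are copied from the generated
// mCHARACTERS() variants; class and method names are hypothetical.
class GeneratedSetCheck {

    // Condition generated when the puzzling line uses DVCONTENT.
    static boolean firstCase(int c) {
        return (c >= '\u0000' && c <= '\t') || (c >= '\u000B' && c <= '\uFFFE');
    }

    // Condition generated when the puzzling line uses ID instead.
    static boolean secondCase(int c) {
        return (c >= '\u0000' && c <= '\b') || (c >= '\n' && c <= '\uFFFE');
    }

    public static void main(String[] args) {
        // Tab, newline, and the three characters listed in SPECIAL_CHARACTERS.
        char[] samples = { '\t', '\n', '{', '}', '\\' };
        for (char c : samples) {
            System.out.println("code " + (int) c
                    + ": first case accepts=" + firstCase(c)
                    + ", second case accepts=" + secondCase(c));
        }
    }
}
```

As far as I can tell, the first check rejects only '\n' (character code 10) and the second rejects only '\t' (character code 9), while both accept '{', '}' and '\\'. So neither version seems to exclude the characters I listed in SPECIAL_CHARACTERS; my guess, and it is only a guess, is that some numeric value of 10 or 9 is being negated instead.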
Can anybody explain what my problem is?
Thank you
Claude