[antlr-interest] Grammar help
Brian Catlin
BrianC at sannas.org
Tue Mar 16 02:12:10 PDT 2010
In my excitement at not seeing any error messages, I neglected to really
test the parser :-(
I don't get the errors I was getting before, but that is because the
FILE_NAME token is matching everything. I put a simple printf action on the
FILE_NAME token, and it gets called for all input:
DT> @abc.def
Found file name: @abc.def
DT> illegal command
Found file name: illegal command
DT> 'alj;klajjf
Found file name: 'alj;klajjf
Is there a way to make the FILE_NAME token context sensitive so that the
lexer doesn't try to match it unless we're in a rule that wants to find a
file name? I tried making the FILE_NAME token a fragment, but then the
parser failed to recognize anything as valid.
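As a rough illustration of why every line above is reported as a file name, here is a minimal sketch in Python. It is not part of the grammar, and a regular expression is only an approximation of the FILE_NAME lexer rule shown below. The helper name matches_file_name is hypothetical. The sketch does not reproduce how ANTLR chooses between token rules; it only shows that the pattern itself accepts almost any line, including one that is a valid command.

```python
import re

# Regex approximation (an assumption, not ANTLR's algorithm) of FILE_NAME:
# one or more characters other than | < > * ? CR LF, then a line end or EOF.
FILE_NAME = re.compile(r"[^|<>*?\r\n]+(?:\r?\n|\Z)")


def matches_file_name(line: str) -> bool:
    """Return True if the whole line is accepted by the FILE_NAME pattern."""
    return FILE_NAME.fullmatch(line) is not None


# The three inputs from the session above, plus one valid command line.
samples = ["@abc.def\n", "illegal command\n", "'alj;klajjf\n", "DUMP MBR\n"]
for sample in samples:
    print(repr(sample), matches_file_name(sample))
```

Only a line containing one of the excluded characters, such as "a|b", is rejected by this pattern, so nothing in the rule itself restricts it to the places where the parser expects a file name.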
Here's the grammar:
//
// This grammar defines the commands available to the DiskTool (DT) program
//
grammar Commands;

options
{
    language  = C;
    backtrack = true;
    memoize   = true;
}

@lexer::header
{
#define ANTLR3_INLINE_INPUT_ASCII
}

//+
// Productions
//-
commands
    : ( script_command
      | dump_command
      | show_command
      )*
    ;

script_command
    : '@' FILE_NAME
    ;

dump_command
    : DUMP
      ( dump_struct
      | dump_block
      | a_file
      )
    ;

show_command
    : SHOW
      ( structure_nouns
      | storage_nouns
      | a_file
      )
    ;

mbr_vbr
    : MBR
    | VBR
    ;

block_nouns
    : LBN
    | LCN
    | VBN
    | VCN
    ;

structure_nouns
    : MBR
    | VBR
    ;

dump_block
    : block_nouns number
      ( ',' number
      | ':' number
      )?
      DRIVE_NAME?
    ;

dump_struct
    : mbr_vbr
      ( '/' qualifier )?
      DRIVE_NAME?
    ;

storage_nouns
    : DISK
    | VOLUME
    ;

a_file
    : FILE FILE_NAME
    ;

number
    : DEC_NUMBER
    | HEX_NUMBER
    ;

qualifier
    : ALL
    | CODE
    | TABLE
    ;

//+
// Tokens
//-
// Verbs
DUMP : 'DUMP';
SHOW : 'SHOW';
// Nouns
DISK : 'DISK';
FILE : 'FILE';
LBN : 'LBN';
LCN : 'LCN';
MBR : 'MBR';
PBN : 'PBN';
VBN : 'VBN';
VBR : 'VBR';
VCN : 'VCN';
VOLUME : 'VOLUME';
// Qualifiers
ALL : 'ALL';
CODE : 'CODE';
TABLE : 'TABLE';
// Miscellaneous tokens
DRIVE_NAME
    : LETTER ':'
    ;

fragment
LETTER      : 'A'..'Z';

fragment
DIGIT       : '0'..'9';

fragment
HEX_DIGIT   : (DIGIT | 'A'..'F');

HEX_NUMBER  : '0X' HEX_DIGIT+;

DEC_NUMBER  : DIGIT+;

FILE_NAME
    : ~('|' | '<' | '>' | '*' | '?' | '\r' | '\n')+ (('\r'? '\n') | EOF)
      {printf("Found file name: \%s\n", GETTEXT()->chars);}
    ;

LINE_COMMENT
    : '!' ~('\n'|'\r')* (('\r'? '\n') | EOF)
      {$channel=HIDDEN;}
      {printf("Found comment: \%s\n", GETTEXT()->chars);}
    ;

WS  : (' ' | '\t' | '\r' | '\n')+ {$channel=HIDDEN;};

-----Original Message-----
From: Brian Catlin [mailto:BrianC at sannas.org]
Sent: Tuesday, March 16, 2010 16:18
To: 'antlr-interest at antlr.org'
Subject: RE: [antlr-interest] Grammar help
(Brian slaps head again), "Duh!" Sigh. Sometimes, I really wonder whether
I'm overpaid ;-}
You fixed it!
Thank you very much for your help!!
-Brian
-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Bart Kiers
Sent: Tuesday, March 16, 2010 15:33
To: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Grammar help
On Tue, Mar 16, 2010 at 8:10 AM, Brian Catlin <BrianC at sannas.org> wrote:
> While that gets rid of those warnings (why don't the warnings print a
> reasonable line number? I would call that a BUG),
Note that '!' is a valid operator inside your grammar; ANTLR just
assumes that you're building trees. So you're not doing anything wrong.
But yes, a warning with the line number of the improper use of rewrite
operators would be nice.
On Tue, Mar 16, 2010 at 8:10 AM, Brian Catlin <BrianC at sannas.org> wrote:
> the fundamental problem
> of being able to parse (or otherwise capture the file name) still exists.
>
> Any ideas?
>
The error message is telling you that your FILE_NAME rule is ambiguous. When matching
one or more characters from:
~('|' | '<' | '>' | '*' | '?')+
then line breaks will also be matched, yet after that, the following could
be matched:
('\r'? '\n')
which has already been "eaten" by the previous part of your rule. You could
fix that by adding line breaks to that first part of your rule, like this:
FILE_NAME : ~('|' | '<' | '>' | '*' | '?' | '\r' | '\n')+ (('\r'? '\n') | EOF);
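The effect of excluding line breaks from the first part of the rule can be sketched in Python. This is an analogy only: Python's regex engine is not the ANTLR lexer, and the names BEFORE, AFTER and consumed are hypothetical helpers introduced for this illustration. The sketch shows how much input the character class alone consumes with and without '\r' and '\n' in the negated set.

```python
import re

# First part of FILE_NAME as a character class (regex approximation).
BEFORE = re.compile(r"[^|<>*?]+")      # line breaks are NOT excluded
AFTER = re.compile(r"[^|<>*?\r\n]+")   # line breaks excluded, as in the fix


def consumed(pattern: re.Pattern, text: str) -> str:
    """Return the prefix of text that the pattern consumes, or '' if none."""
    match = pattern.match(text)
    return match.group(0) if match else ""


text = "abc.def\r\nDUMP MBR\r\n"
print(repr(consumed(BEFORE, text)))  # runs across both line breaks
print(repr(consumed(AFTER, text)))   # stops before the first line break
```

With the original class the loop consumes the line terminators itself, so the trailing ('\r'? '\n') alternative overlaps with what the loop can already match. With the corrected class the loop stops at the end of the name and the line break is left for the second part of the rule.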
Regards,
Bart.