[antlr-interest] Parsing poorly terminated IF statements
Dean Shumsheruddin
DShumsheruddin at rocketsoftware.com
Fri Mar 11 06:32:51 PST 2011
Hi Folks,
I'm using ANTLR to parse an old Fortran-like language with poorly terminated if statements.
Here is a simplified version of a block in the language. I've just added the indentation to show the structure:
    print 1
    if
        print 2
    else
        print 3
    endif
    if
        print 4
    if
        print 5
        do
            print 6
            if
                print 7
        next
        print 8
An 'if' construct may be terminated by 'endif', by a new 'if' construct, or by the end of the current block.
'if' constructs cannot be nested except inside a do-next construct. Every 'do' is terminated by a matching 'next'.
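The termination rules above can be made concrete with a small hand-written recursive-descent parser. The Python below is only an illustrative sketch, not part of the ANTLR grammar: the names tokenize, Parser, and parse and the tuple-shaped tree are assumptions made for this example, and it treats each input line as one command.

```python
def tokenize(text):
    """Split the source into one list of words per non-blank line."""
    return [line.split() for line in text.splitlines() if line.strip()]


class Parser:
    def __init__(self, lines):
        self.lines = lines
        self.pos = 0

    def peek(self):
        """Return the keyword of the current line, or None at end of input."""
        return self.lines[self.pos][0] if self.pos < len(self.lines) else None

    def advance(self):
        line = self.lines[self.pos]
        self.pos += 1
        return line

    def parse_block(self):
        """A block runs until end of input or the 'next' of an enclosing 'do'."""
        commands = []
        while self.peek() in ("print", "do", "if"):
            if self.peek() == "if":
                commands.append(self.parse_if())
            else:
                commands.append(self.parse_simple())
        return commands

    def parse_simple(self):
        """Parse 'print' or 'do ... next'; a 'do' body is a full block."""
        line = self.advance()
        if line[0] == "print":
            return ("print", " ".join(line[1:]))
        body = self.parse_block()
        if self.peek() != "next":
            raise SyntaxError("'do' without matching 'next'")
        self.advance()
        return ("do", body)

    def parse_no_if_block(self):
        """Body of an 'if' branch: stops at anything that is not print/do."""
        commands = []
        while self.peek() in ("print", "do"):
            commands.append(self.parse_simple())
        return commands

    def parse_if(self):
        self.advance()  # consume 'if'
        then_part = self.parse_no_if_block()
        else_part = None
        if self.peek() == "else":
            self.advance()
            else_part = self.parse_no_if_block()
        if self.peek() == "endif":  # 'endif' is optional
            self.advance()
        return ("if", then_part, else_part)


def parse(text):
    """Parse a whole program and reject leftover input such as a stray 'next'."""
    parser = Parser(tokenize(text))
    tree = parser.parse_block()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected {parser.peek()!r}")
    return tree
```

Calling parse on the sample block yields four top-level commands: the first print and three 'if' constructs. Because an 'if' body greedily absorbs every following print and do-next, 'print 8' ends up inside the third 'if' rather than at the top level.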
Here is a simplified version of the grammar I'm using:
    block       : command* ;

    command     : (ifcom)=> ifcom
                | print
                | docom
                ;

    // The body of an 'if' cannot contain another 'if' directly; 'endif' is optional.
    ifcom       : IF NL noifblock (ELSE NL noifblock)? (ENDIF NL)? ;
    noifblock   : noifcommand* ;
    noifcommand : print | docom ;
    print       : PRINT INT NL ;    // INT added so that "print 1" matches
    docom       : DO NL block NEXT NL ;

    // Lexer Rules
    IF    : 'if' ;
    ELSE  : 'else' ;
    ENDIF : 'endif' ;
    PRINT : 'print' ;
    DO    : 'do' ;
    NEXT  : 'next' ;
    INT   : '0'..'9'+ ;
    NL    : '\n' ;
    // '\n' is left out of WS so that it is not skipped before NL can match it.
    WS    : (' ' | '\r')+ {skip();} ;
ANTLR generates these warnings:
    [14:15:38] warning(200): if.g:37:4: Decision can match input such as "DO" using multiple alternatives: 1, 2
        As a result, alternative(s) 2 were disabled for that input
    [14:15:38] warning(200): if.g:37:4: Decision can match input such as "PRINT" using multiple alternatives: 1, 2
        As a result, alternative(s) 2 were disabled for that input
The parser nevertheless seems to work, because the ifcom rule is greedy. Can anyone suggest a better way of doing it?
Thanks for your help.
Dean