[antlr-interest] RES: COBOL grammar

George S. Cowan cowang at comcast.net
Wed Jul 13 05:11:09 PDT 2011


Nilo,

OK, I've found a message with your grammar and here's another suggestion. 

Drop the optional period after a command so that a period always ends a
block. (Later, when you add AST construction, make sure that your AST
correctly ends an if-statement at the end of a block.) I think you also want
to require a block to contain at least one command by using a + instead of
an *. And you will still need to check for the preceding period when a
paragraph begins. So here is a suggested direction for your grammar
(untested).


grammar Cobol;

options {
 language = Java;
}

program : 'procedure' 'division' '.' section*;

section : ID 'section' '.' paragraph*;

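// The predicate runs before ID is consumed, so LT(-1) is the token that
// precedes the paragraph name.
paragraph : { ((Token)input.LT(-1)).getText().equals(".") }? ID '.' block* ;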

block : command+ '.' ;

command: (cmdA | cmdB | cmdC ) ;

cmdA: 'A';

cmdB: 'B';

cmdC: 'C';

fragment Digit : '0'..'9';

fragment Letter : ('a'..'z' | 'A'..'Z');

ID : Letter ( Letter | Digit | '-' )*;

WS
    :   (    ' '
        |    '\r'
        |    '\t'
        |    '\u000C'
        |    '\n'
        )
            {$channel=HIDDEN;}
    ;
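
To make the intended shape concrete, here is a small hand-written Java
sketch of a recognizer that follows the same rules (program, section,
paragraph, block). It is not ANTLR-generated code and not part of the
grammar: the class name CobolBlockSketch, the crude tokenizer, and the
end-of-input check are all assumptions made for illustration. It returns
each block as its list of commands, so you can see that a period is the
only thing that closes a block.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written sketch of the grammar's shape; not ANTLR-generated code. */
final class CobolBlockSketch {
    private final List<String> tokens = new ArrayList<>();
    private final List<List<String>> blocks = new ArrayList<>();
    private int pos = 0;

    private CobolBlockSketch(String input) {
        // Crude tokenizer (an assumption): isolate periods, split on whitespace.
        for (String t : input.replace(".", " . ").trim().split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
    }

    /** Returns all blocks in source order; throws on a syntax error. */
    static List<List<String>> parse(String input) {
        CobolBlockSketch p = new CobolBlockSketch(input);
        p.program();
        return p.blocks;
    }

    // k-th token of lookahead, or null past the end of input.
    private String la(int k) {
        int i = pos + k - 1;
        return i < tokens.size() ? tokens.get(i) : null;
    }

    private void match(String expected) {
        if (!expected.equals(la(1))) {
            throw new IllegalArgumentException(
                "expected '" + expected + "' but found '" + la(1) + "'");
        }
        pos++;
    }

    private static boolean isCommand(String t) {
        return "A".equals(t) || "B".equals(t) || "C".equals(t);
    }

    // Keywords and commands are excluded, as literal tokens are in the grammar.
    private static boolean isId(String t) {
        return t != null && t.matches("[A-Za-z][A-Za-z0-9-]*") && !isCommand(t)
            && !t.equals("procedure") && !t.equals("division")
            && !t.equals("section");
    }

    private void matchId() {
        if (!isId(la(1))) {
            throw new IllegalArgumentException(
                "expected ID but found '" + la(1) + "'");
        }
        pos++;
    }

    // program : 'procedure' 'division' '.' section*  (then end of input)
    private void program() {
        match("procedure"); match("division"); match(".");
        while (isId(la(1)) && "section".equals(la(2))) section();
        if (la(1) != null) {
            throw new IllegalArgumentException("unexpected '" + la(1) + "'");
        }
    }

    // section : ID 'section' '.' paragraph*
    private void section() {
        matchId(); match("section"); match(".");
        while (isId(la(1)) && ".".equals(la(2))) paragraph();
    }

    // paragraph : ID '.' block*
    private void paragraph() {
        matchId(); match(".");
        while (isCommand(la(1))) block();
    }

    // block : command+ '.'
    private void block() {
        List<String> commands = new ArrayList<>();
        do { commands.add(la(1)); pos++; } while (isCommand(la(1)));
        match(".");
        blocks.add(commands);
    }

    public static void main(String[] args) {
        System.out.println(
            parse("procedure division. main section. p1. A B. C. p2. A."));
        // prints [[A, B], [C], [A]]
    }
}
```

Note that input such as "p1. A B p2. A." is rejected in this sketch, because
the block "A B" is never closed by a period before the next paragraph name.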

Good luck,
George

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Nilo Roberto C Paim
> Sent: Monday, July 11, 2011 4:46 PM
> To: 'Parsiad Azimzadeh'; antlr-interest at antlr.org
> Subject: [antlr-interest] RES: COBOL grammar
> 
> Thanks, Parsiad, for your help.
> 
> Indeed the ambiguities are 'solved'... but using your solution, only the
> first 'command' of the first 'block' of the first 'paragraph' is parsed!
> 
> And I don't have anything I can call 'END_BLOCK', as you suggested. It
> would have to be a '.', which can also come right after a 'command'... and
> the problem returns...
> 
> What else am I missing?
> 
> TIA,
> Nilo - Brazil
> 
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Parsiad Azimzadeh
> Sent: Monday, July 11, 2011 14:36
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] COBOL grammar
> 
> Hi Nilo,
> 
> The problem is that a paragraph contains any number of blocks and a block
> contains any number of commands (the ambiguity here is that a paragraph
> with two commands can be perceived as containing either two blocks, each
> with one command, or a single block with two commands).
> 
> The fix is simple: remove the * symbol from the line:
> paragraph : ID '.' block* '.';
> 
> If having multiple blocks carries semantic value, you might need a symbol
> to denote the end of a block. For example, instead of using the fix above,
> you could change the block rule to:
> 
> block: (command END_BLOCK)*;
> 
> --
> Parsiad Azimzadeh
> http://sfu.ca/~paa4
> 
> On Mon, Jul 11, 2011 at 9:46 AM, Nilo Roberto C Paim
> <nilopaim at gmail.com> wrote:
> 
> > Hi all,
> >
> > I'm facing a problem with my grammar that I don't know how to solve
> > (ANTLR 3.3)...
> >
> > Let me show you my grammar. Simplified, of course. It's just to show you
> > my trouble.
> >
> > grammar Cobol;
> >
> > options {
> >  language = Java;
> > }
> >
> > program : 'procedure' 'division' '.' section*;
> >
> > section : ID 'section' '.' paragraph*;
> >
> > paragraph : ID '.' block* '.';
> >
> > block : command*;
> >
> > command: (cmdA | cmdB | cmdC ) '.'?;
> >
> > cmdA: 'A';
> >
> > cmdB: 'B';
> >
> > cmdC: 'C';
> >
> > fragment Digit : '0'..'9';
> >
> > fragment Letter : ('a'..'z' | 'A'..'Z');
> >
> > ID : Letter ( Letter | Digit | '-' )*;
> >
> >
> >
> >
> >
> > Using this grammar, I'm having the following errors and warnings:
> >
> > warning(200): /Cobol/src/Cobol.g:14:12: Decision can match input such as
> > "{'.', 'A'..'C'}" using multiple alternatives: 1, 2
> > As a result, alternative(s) 2 were disabled for that input
> >  |---> ID '.' block* '.';
> >
> > error(201): /Cobol/src/Cobol.g:14:12: The following alternatives can never
> > be matched: 2
> >  |---> ID '.' block* '.';
> >
> > warning(200): /Cobol/src/Cobol.g:17:5: Decision can match input such as
> > "'B'" using multiple alternatives: 1, 2
> > As a result, alternative(s) 2 were disabled for that input
> >  |---> command*;
> >
> > warning(200): /Cobol/src/Cobol.g:17:5: Decision can match input such as
> > "'A'" using multiple alternatives: 1, 2
> > As a result, alternative(s) 2 were disabled for that input
> >  |---> command*;
> >
> > warning(200): /Cobol/src/Cobol.g:17:5: Decision can match input such as
> > "'C'" using multiple alternatives: 1, 2
> > As a result, alternative(s) 2 were disabled for that input
> >  |---> command*;
> >
> >
> > 4 warnings
> >
> > 1 error
> >
> > BUILD FAIL
> >
> >
> >
> > My real problem is:
> >
> > 1) any 'command' can be followed by a '.'
> > 2) any sequence of 'command's not followed by '.' forms a 'block'
> > 3) wherever I can use a 'command', I can use a 'block'
> > 4) the '.' signifies the end of a 'block'
> > 5) I can use any number of 'block's to form a 'paragraph'
> > 6) I can use any number of 'paragraph's to form a 'section'
> > 7) I can have any number of 'section's on a 'program'
> >
> > Any hints or help about what I am doing wrong? I'm completely stuck on
> > this, 'cause I'm a bit of a newbie with ANTLR...
> >
> > TIA,
> > Nilo - Brazil
> >
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> >
> 


