[antlr-interest] RE: COBOL grammar

Nilo Roberto C Paim nilopaim at gmail.com
Mon Jul 11 13:46:11 PDT 2011


Thanks, Parsiad, for your help.

Indeed, the ambiguities are 'solved'... but with your solution, only the
first 'command' of the first 'block' of the first 'paragraph' is parsed!

And I don't have anything I can call 'END_BLOCK', as you suggested. That
token would be a '.', which can also follow any 'command'... and the
problem returns...

What else am I missing?

TIA,
Nilo - Brazil

-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On behalf of Parsiad Azimzadeh
Sent: Monday, July 11, 2011 14:36
To: antlr-interest at antlr.org
Subject: Re: [antlr-interest] COBOL grammar

Hi Nilo,

The problem is that a paragraph contains any number of blocks and a block
contains any number of commands (the ambiguity here is that some paragraph
with two commands can be perceived as containing either two blocks each with
one command or a single block with two commands).
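
To make the two readings concrete, here is a small annotated fragment of the
original rules. The sample input "A B" and the derivations written in the
comments are illustrative assumptions, not ANTLR output:

```
// Original rules from the grammar under discussion.
paragraph : ID '.' block* '.';
block     : command*;

// Commands inside a paragraph: A B
//
// Reading 1 (two blocks, one command each):
//   block* -> block block
//   block  -> command          // A
//   block  -> command          // B
//
// Reading 2 (one block, two commands):
//   block* -> block
//   block  -> command command  // A B
//
// Because block can also match nothing, block* may additionally insert
// any number of empty blocks around these, so the number of possible
// readings is unbounded.
```

A starred loop placed directly around a rule that can itself match the empty
string generally leads to this kind of warning.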

The fix is simple. Remove the * symbol from the line:
paragraph : ID '.' block* '.';

If having multiple blocks carries semantic value, you might need a symbol to
denote the end of a block. For example, instead of using the fix above, you
could change the block rule to:

block: (command END_BLOCK)*;
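
If '.' itself plays the role of END_BLOCK, one possible arrangement is to drop
the optional '.' from command and let only the block rule consume it. The
following is a sketch under that assumption and has not been checked against
ANTLR 3.3. The grammar name CobolSketch, the WS rule, the trailing EOF, and
the removal of the extra '.' at the end of paragraph are additions that do not
appear in the original grammar:

```
grammar CobolSketch;   // hypothetical name, not from the original post

options {
  language = Java;
}

// EOF is an addition so that the whole input must be consumed.
program   : 'procedure' 'division' '.' section* EOF;

section   : ID 'section' '.' paragraph*;

// Assumption: a paragraph is its header followed by zero or more blocks,
// with no extra closing '.' of its own.
paragraph : ID '.' block*;

// Assumption: a block is one or more commands closed by exactly one '.'.
// The '.' is consumed here and nowhere else, and block cannot be empty.
block     : command+ '.';

// The optional '.' from the original command rule is removed.
command   : cmdA | cmdB | cmdC;

cmdA : 'A';
cmdB : 'B';
cmdC : 'C';

fragment Digit  : '0'..'9';
fragment Letter : ('a'..'z' | 'A'..'Z');

ID : Letter ( Letter | Digit | '-' )*;

// Assumption: whitespace is skipped; the original post shows no such rule.
WS : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; };
```

With this layout, a paragraph such as "P1. A B. C." would be read as the
header "P1." followed by the block "A B." and then the block "C.", so a '.'
after a command always closes the current block.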

-- 
Parsiad Azimzadeh
http://sfu.ca/~paa4

On Mon, Jul 11, 2011 at 9:46 AM, Nilo Roberto C Paim
<nilopaim at gmail.com> wrote:

> Hi all,
>
> I'm facing a problem with my grammar that I don't know how to solve
> (Antlr3.3)...
>
> Let me show you my grammar. Simplified, of course. It's just to show you
> my trouble.
>
>
>
>
>
> grammar Cobol;
>
> options {
>  language = Java;
> }
>
> program : 'procedure' 'division' '.' section*;
>
> section : ID 'section' '.' paragraph*;
>
> paragraph : ID '.' block* '.';
>
> block : command*;
>
> command: (cmdA | cmdB | cmdC ) '.'?;
>
> cmdA: 'A';
>
> cmdB: 'B';
>
> cmdC: 'C';
>
> fragment Digit : '0'..'9';
>
> fragment Letter : ('a'..'z' | 'A'..'Z');
>
> ID : Letter ( Letter | Digit | '-' )*;
>
>
>
>
>
> Using this grammar, I'm having the following errors and warnings:
>
> warning(200): /Cobol/src/Cobol.g:14:12: Decision can match input such as
> "{'.', 'A'..'C'}" using multiple alternatives: 1, 2
> As a result, alternative(s) 2 were disabled for that input
>  |---> ID '.' block* '.';
>
> error(201): /Cobol/src/Cobol.g:14:12: The following alternatives can never
> be matched: 2
>  |---> ID '.' block* '.';
>
> warning(200): /Cobol/src/Cobol.g:17:5: Decision can match input such as
> "'B'" using multiple alternatives: 1, 2
> As a result, alternative(s) 2 were disabled for that input
>  |---> command*;
>
> warning(200): /Cobol/src/Cobol.g:17:5: Decision can match input such as
> "'A'" using multiple alternatives: 1, 2
> As a result, alternative(s) 2 were disabled for that input
>  |---> command*;
>
> warning(200): /Cobol/src/Cobol.g:17:5: Decision can match input such as
> "'C'" using multiple alternatives: 1, 2
> As a result, alternative(s) 2 were disabled for that input
>  |---> command*;
>
>
> 4 warnings
>
> 1 error
>
> BUILD FAIL
>
>
>
> My real problem is:
>
> 1) any 'command' can be followed by a '.'
> 2) any sequence of 'command's not followed by '.' forms a 'block'
> 3) wherever I can use a 'command', I can use a 'block'
> 4) the '.' signifies the end of a 'block'
> 5) I can use any number of 'block's to form a 'paragraph'
> 6) I can use any number of 'paragraph's to form a 'section'
> 7) I can have any number of 'section's in a 'program'
>
> Any hints or help about what I am doing wrong? I'm completely stuck on it,
> 'cause I'm a bit of a newbie with Antlr...
>
> TIA,
> Nilo - Brazil
>
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe:
> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>
