[antlr-interest] COBOL
Sinan
sinan.karasu at boeing.com
Wed May 29 11:40:57 PDT 2002
Balvinder Singh wrote:
>
> Hi Sinan,
>
> But If I will use your scheme, getting warning
>
> warning:Syntactic predicate ignored for single alternative
>
> what should I refactor?
>
> balvinder
>
> >From: Sinan <sinan.karasu at boeing.com>
> >Reply-To: antlr-interest at yahoogroups.com
> >To: antlr-interest at yahoogroups.com
> >Subject: Re: [antlr-interest] COBOL
> >Date: Wed, 29 May 2002 08:55:03 -0700
> >
> >Balvinder Singh wrote:
> > >
> > > Hi all,
> > >
> > > I'm writing cobol parser only for WORKING STORAGE AREA of data division.
> > > I'm using grammar rule and lexical rule for WORKING STORAGE AREA from VS
> > > COBOL II (http://adam.wins.uva.nl/~x/grammars/vs-cobol-ii/)
> > >
> > > I have converted lexical rule to ANTLR format, but I'm getting conflicts for
> > > some of the rules, rules are as follows :
> > >
> > > Literal : NonNumeric | Numeric
> > > ;
> > >
> > > protected
> > > NonNumeric : '"' ( (~'"') | '"' '"' )* '"'
> > > | '\'' ( (~'\'') | '\'' '\'')* '\''
> > > | ('X' 'x') '"' HexDigits '"'
> > > | ('X' 'x') '\'' HexDigits '\''
> > > ;
> >Factor this:
> >
> >NonNumeric : '"' ( (~'"') | '"' '"' )* '"'
> > | '\'' ( (~'\'') | '\'' '\'')* '\''
> > | ('X' 'x')( '"' HexDigits '"' | '\'' HexDigits '\'')
> > ;
> >
> >
> >
> >
> > > AphabeticUserDefinedWord : (('0'.. '9')+ ('-')*)* ('0' .. '9')* ('A' ..
> > > 'Z' 'a' .. 'z') ('A' .. 'Z' 'a' .. 'z' '0' .. '9')* (('-')+ ('A' .. 'Z' 'a'
> > > .. 'z' '0' .. '9')+)*
> > > ;
Aha, I didn't notice that before.
(1):
AphabeticUserDefinedWord : (('0'.. '9')+ ('-')*)* ('0' .. '9')*
('A'..'Z' 'a' .. 'z') ('A' .. 'Z' 'a' .. 'z' '0' .. '9')* (('-')+ ('A'
.. 'Z' 'a'.. 'z' '0' .. '9')+)*
;
This has a genuine ambiguity, in the prefix
(('0'.. '9')+ ('-')*)* ('0' .. '9')*
The problem is with ('-')*, which can match nothing,
so you can have
99
satisfied by ('0'.. '9')+ inside the loop,
or by the trailing ('0' .. '9')*, or by a mix of the two.
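To make the ambiguity concrete, here is a small Python sketch (my own illustration, not anything ANTLR produces) that counts the distinct ways a string can be matched by (DIGIT+ '-'*)* DIGIT*. The name count_parses is hypothetical, and the code only models this prefix, not the whole rule.

```python
from functools import lru_cache


def count_parses(text: str) -> int:
    """Count the distinct ways `text` matches (DIGIT+ '-'*)* DIGIT*."""
    n = len(text)

    @lru_cache(maxsize=None)
    def from_pos(i: int) -> int:
        # Choice A: leave the loop; the trailing DIGIT* must consume the rest.
        total = 1 if all(c in "0123456789" for c in text[i:]) else 0
        # Choice B: run one more (DIGIT+ '-'*) iteration, trying every
        # possible length for the digit run and for the hyphen run.
        j = i
        while j < n and text[j] in "0123456789":
            j += 1
            k = j
            total += from_pos(k)
            while k < n and text[k] == "-":
                k += 1
                total += from_pos(k)
        return total

    return from_pos(0)


print(count_parses("99"))  # 4
```

Any count above 1 means the lexer has more than one way to consume the same characters, which is what the nondeterminism warning is about.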
Don't you really want
(('0'.. '9')+ ('-' ('0' .. '9')*)* )*
in which case the solution would be (in case you still get
ambiguities...)
(('0'.. '9')+ (('-')=> '-' ('0' .. '9')*)* )*
Which probably is not correct either, but then the rule you have has
another ambiguity, namely
(('0'.. '9')+ ('-')*)* ('0' .. '9')* ('A'..'Z' 'a' .. 'z') ('A' .. 'Z'
'a' .. 'z' '0' .. '9')*
is equivalent (apart from no longer insisting on a letter) to:
(('0'.. '9')+ ('-')*)* ('A' .. 'Z' 'a' .. 'z' '0' .. '9')+
so we get:
AphabeticUserDefinedWord : (('0'.. '9')+ ('-')*)* ('A' .. 'Z' 'a' .. 'z'
'0' .. '9')+ (('-')+ ('A' .. 'Z' 'a'.. 'z' '0' .. '9')+)*;
but then your rule really collapses to:
(2)
AlphabeticUserDefinedWord : ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9')+ (
('-')+ ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9')+ )*;
Check it.
It appears that whatever satisfies (1) also satisfies (2).
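One way to check the claim is a bounded brute-force comparison. The sketch below transcribes both rules into Python regular expressions under two assumptions of mine: adjacent ranges are read as alternatives (as if the | were there), and the trailing hyphen group of (2) is repeatable, i.e. (('-')+ (...)+)*, as in the rewrite just before it. The alphabet "9A-" is only a stand-in for digit, letter and hyphen, so this is a sanity check up to a fixed length, not a proof.

```python
import re
from itertools import product

# Rule (1) as posted, reading adjacent ranges as alternatives (assumption).
RULE_1 = re.compile(r"(?:[0-9]+-*)*[0-9]*[A-Za-z][A-Za-z0-9]*(?:-+[A-Za-z0-9]+)*")
# Rule (2), reading the trailing hyphen group as repeatable (assumption).
RULE_2 = re.compile(r"[A-Za-z0-9]+(?:-+[A-Za-z0-9]+)*")


def compare(max_len: int = 6, alphabet: str = "9A-"):
    """Return (only_in_1, only_in_2) over all strings up to max_len."""
    only_in_1, only_in_2 = [], []
    for length in range(1, max_len + 1):
        for chars in product(alphabet, repeat=length):
            word = "".join(chars)
            in_1 = RULE_1.fullmatch(word) is not None
            in_2 = RULE_2.fullmatch(word) is not None
            if in_1 and not in_2:
                only_in_1.append(word)
            elif in_2 and not in_1:
                only_in_2.append(word)
    return only_in_1, only_in_2


only_in_1, only_in_2 = compare()
print(only_in_1)      # [] : nothing matches (1) without matching (2)
print(only_in_2[:5])  # a few strings that only (2) accepts
```

Under these assumptions the only strings that (2) accepts and (1) rejects are the ones without any letter, such as 99 or 9-9, so if all-digit words must be excluded that condition has to be added back.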
It appears that (1) tries to make some tokens start with a numeric, but it
fails...
A9--A9
will skip (('0'.. '9')+ ('-')*)* ('0' .. '9')*
and be parsed (if it were not ambiguous).
9A-9A will be parsed by:
('0' .. '9')* ('A'..'Z' 'a' .. 'z') ('A' .. 'Z' 'a' .. 'z' '0' .. '9')*
(('-')+ ('A' .. 'Z' 'a'.. 'z' '0' .. '9')+)*
So what you said is what you got; it is just that what you got is not
what you meant.
Try starting with rule (2), and fix it as you go along.
If you specify case-insensitive literals, then
AlphabeticUserDefinedWord : ('a' .. 'z' | '0' .. '9')+ ( ('-')+ ('a' ..
'z' | '0' .. '9')+ )*;
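As a rough Python analogue of the case-insensitive variant (it mimics the effect only, not how ANTLR implements case insensitivity), the input can be lowercased before matching a lowercase-only pattern. The helper name is_user_defined_word is hypothetical, and the pattern assumes the trailing hyphen group is meant to repeat.

```python
import re

# Lowercase-only form of the rule; the trailing group is assumed repeatable.
RULE_2_LOWER = re.compile(r"[a-z0-9]+(?:-+[a-z0-9]+)*")


def is_user_defined_word(text: str) -> bool:
    """Match case-insensitively by lowercasing the input first."""
    return RULE_2_LOWER.fullmatch(text.lower()) is not None


print(is_user_defined_word("WS-Total-9"))
print(is_user_defined_word("abc-"))
```

A word may not start or end with a hyphen under this pattern, which is why the second call is rejected.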
BTW, do you really not want the | between literals? I am
confused...
Sinan