[antlr-interest] help

Greg Lindholm glindholm at yahoo.com
Wed Jun 5 17:14:13 PDT 2002


I'll try to help where I can. 

identifier : qualified_data_name 
   ( "(" subscript ")" )* 
   ( "(" leftmost_character_position ":" (length)? ")" )?;

try.g:125: warning: nondeterminism upon 
try.g:125: 	k==1:"(" 
try.g:125: 	between alt 1 and exit branch of block 

This first nondeterminism arises because, when the parser sees the "(",
it can't tell whether this is the start of a subscript or of a
reference modification (the 2nd option).
You might be able to add a syntactic predicate that looks for
the ":" to distinguish between the two options.
I think a better solution is to do something like this:

identifier : qualified_data_name ( "(" x ( ":" (x)? )? ")" )* ;

Combine the two options, then figure out which one you got later.
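
The "figure out what you got later" step could look roughly like the
Python sketch below. It is only an illustration: classify_suffix and the
token-list representation are hypothetical, and it assumes the parser
hands over the tokens it found between one pair of parentheses.

```python
def classify_suffix(tokens):
    """Classify the tokens found between "(" and ")" after a data name.

    Returns a tuple (kind, first_part, second_part). A ":" marks a
    reference modification; anything else is treated as a subscript.
    """
    if ":" in tokens:
        colon = tokens.index(":")
        start = tokens[:colon]        # leftmost character position
        length = tokens[colon + 1:]   # optional length, may be empty
        if not start:
            raise ValueError("reference modification needs a leftmost position")
        return ("reference_modification", start, length)
    return ("subscript", tokens, [])


print(classify_suffix(["I", "+", "1"]))
print(classify_suffix(["3", ":", "2"]))
```

The same later pass would also be the place to check that a reference
modification appears at most once and only as the last parenthesized
group, as in the original rule.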



subscript : ( Integer | 
              qualified_data_name (( "+" | "-" ) Integer)? | 
              index_name (( "+" | "-" ) Integer )?)+;  

try.g:147: warning: nondeterminism upon 
try.g:147: 	k==1:AlphabeticUserDefinedWord 
try.g:147: 	between alts 2 and 3 of block 

The problem here is both qualified_data_name (alt 2) and 
index_name (alt 3) can start with AlphabeticUserDefinedWord. 
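
To make the k==1 conflict concrete, here is a small Python sketch that
compares the first token each alternative can start with. It is a
simplified illustration, not how ANTLR actually computes lookahead; the
FIRST table is filled in by hand from the warning above, and the
conflicts function is a made-up helper.

```python
# First token of each alternative of `subscript`, taken from the warning.
FIRST = {
    1: {"Integer"},
    2: {"AlphabeticUserDefinedWord"},  # qualified_data_name
    3: {"AlphabeticUserDefinedWord"},  # index_name
}


def conflicts(first_sets):
    """Return (alt_a, alt_b, shared) for alternatives whose k=1 sets overlap."""
    alts = sorted(first_sets)
    found = []
    for i, a in enumerate(alts):
        for b in alts[i + 1:]:
            shared = first_sets[a] & first_sets[b]
            if shared:
                found.append((a, b, shared))
    return found


print(conflicts(FIRST))  # [(2, 3, {'AlphabeticUserDefinedWord'})]
```

With one token of lookahead, alts 2 and 3 look identical, which is
exactly what the warning reports.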


You have lots of problems with this grammar, both in the lexer
and the parser.

For instance: the character "9" surrounded by whitespace will match
the token rules "Numeric", "LevelNumber", "PicChar", and "Integer".
How is the lexer supposed to know which token is the correct one?
You've hidden the nondeterminisms behind syntactic predicates,
so you don't see the warnings, but the lexer will test the alternatives
in order and always take the first match, "Numeric".

LiteralOrLevelNumberOrAlphabeticUserDefinedWordOrPicCharOrPunctuationOrCurrencyOrIntegerOrZeroOrWSOrComment
:
	(NonNumeric )=>NonNumeric {$setType(Literal); }
	| (Numeric)=>Numeric {$setType(Literal); }
	| (LevelNumber)=>LevelNumber {$setType(LevelNumber); }
 	| (AlphabeticUserDefinedWord)=>AlphabeticUserDefinedWord
                       {$setType(AlphabeticUserDefinedWord);}
	| (PicChar)=>PicChar {$setType(PicChar);}
	| (Punctuation)=>Punctuation {$setType(Punctuation);}
	| (Currency)=>Currency {$setType(Currency);}
	| (Integer)=>Integer {$setType(Integer);}
	| (Zero)=>Zero {$setType(Zero);}
	| (WS)=>WS {$setType(WS);}
	| (Comment)=>Comment {$setType(Comment);}
	;
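
The effect of testing the alternatives in order can be imitated with a
short Python sketch. The regular expressions are hypothetical stand-ins
for four of the token rules (the real definitions are in try.g, which is
not shown here); the only thing taken from the grammar is that "9"
matches all four and that "Numeric" is tried first.

```python
import re

# Hypothetical patterns, listed in the same order as the alternatives above.
RULES = [
    ("Numeric", re.compile(r"[0-9]+")),
    ("LevelNumber", re.compile(r"[0-9]{1,2}")),
    ("PicChar", re.compile(r"[9AXSV]")),
    ("Integer", re.compile(r"[0-9]+")),
]


def all_matches(text):
    """Every rule whose pattern matches the whole text."""
    return [name for name, pattern in RULES if pattern.fullmatch(text)]


def first_match(text):
    """What an ordered, predicate-style test returns: the first rule that fits."""
    for name, pattern in RULES:
        if pattern.fullmatch(text):
            return name
    return None


print(all_matches("9"))  # ['Numeric', 'LevelNumber', 'PicChar', 'Integer']
print(first_match("9"))  # Numeric
```

The later rules can never win for this input, no matter what the parser
was expecting at that point.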

Your biggest problem is that you've combined syntactic and semantic
rules.  Parsers can usually only figure out the
syntactic rules, not the semantic rules.  What is the difference
between an "Integer" and a "LevelNumber"?  Syntactically, many
inputs will match both. These are semantic differences,
not syntactic ones.

ANTLR has some support for semantic checks, but these
are usually better left to a later phase of processing.

IMHO: You really need to start over and build your grammar thinking
from the syntactic point of view, and leave the semantic rules to a
later phase.
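
As a rough idea of what "a later phase" could mean for the
Integer/LevelNumber case, here is a Python sketch. It assumes the lexer
emits one generic Integer token type, and that an Integer which opens an
entry (the first token, or the token right after a ".") is a level
number. That positional rule is a simplification made up for this
example; the token type names and classify_integers are hypothetical.
The valid level numbers (01-49, 66, 77, 88) are standard COBOL.

```python
VALID_LEVELS = set(range(1, 50)) | {66, 77, 88}


def classify_integers(tokens):
    """Retype Integer tokens that open an entry as LevelNumber.

    tokens is a list of (type, text) pairs from a lexer that knows only
    one numeric token type. Assumption: an entry starts at the first
    token and after every "." token.
    """
    result = []
    at_entry_start = True
    for kind, text in tokens:
        if kind == "Integer" and at_entry_start and int(text) in VALID_LEVELS:
            result.append(("LevelNumber", text))
        else:
            result.append((kind, text))
        at_entry_start = (text == ".")
    return result


tokens = [
    ("Integer", "01"), ("Word", "COUNTER"), ("Word", "VALUE"),
    ("Integer", "5"), ("Punct", "."),
    ("Integer", "05"), ("Word", "X"), ("Punct", "."),
]
for token in classify_integers(tokens):
    print(token)
```

The lexer stays simple and deterministic, and the decision about what a
number means is made where the context is known.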

Good luck. COBOL is a very hard grammar both to lex and to parse.

Greg





--- Balvinder Singh <bals1978 at hotmail.com> wrote:
> Hi all
> 
> I'm attaching a file which contains lexical rules and parsing rules.
> For the parsing rules I'm getting conflicts. The conflicts are not
> between rules, but within a rule. The lexical rules are OK.
> 
> I'm stuck on this; any help will be useful.
> 
> 
> balvinder
> 
> 
> 

> ATTACHMENT part 2 application/octet-stream name=try.g


