[antlr-interest] help
Greg Lindholm
glindholm at yahoo.com
Wed Jun 5 17:14:13 PDT 2002
I'll try to help where I can.
identifier : qualified_data_name
( "(" subscript ")" )*
( "(" leftmost_character_position ":" (length)? ")" )?;
try.g:125: warning: nondeterminism upon
try.g:125: k==1:"("
try.g:125: between alt 1 and exit branch of block
The first nondeterminism arises because, with one token of lookahead
(k==1), the parser sees the "(" and can't tell whether it starts another
subscript or the reference modification (the 2nd option).
You might be able to add a syntactic predicate that looks ahead for
the ":" to distinguish between the two options.
I think a better solution is to do something like this:
identifier : qualified_data_name ( "(" x ( ":" (x)? )? ")" )* ;
Combine the two options, then figure out what you got later. (Here x
is just a placeholder for whatever is allowed between the parentheses.)
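As a rough illustration of the "figure out what you got later" step, here is a minimal Python sketch. It assumes the parser hands back one token list per "( ... )" group; the function name and the data layout are hypothetical, not anything ANTLR generates.

```python
def classify_groups(groups):
    """groups: one list of token texts per "( ... )" group after the data name.

    Returns (kind, payload) pairs and raises ValueError for combinations the
    original two-option rule would have rejected.
    """
    result = []
    for position, tokens in enumerate(groups):
        if ":" in tokens:
            # A colon marks reference modification, which may only come last.
            if position != len(groups) - 1:
                raise ValueError("reference modification must be the last group")
            split = tokens.index(":")
            start, length = tokens[:split], tokens[split + 1:]
            if not start:
                raise ValueError("leftmost character position is required")
            # The length part is optional, so it may be an empty list.
            result.append(("reference_modification", (start, length)))
        else:
            result.append(("subscript", tokens))
    return result
```

For example, classify_groups([["I"], ["3", ":", "2"]]) labels the first group a subscript and the second a reference modification. The checks put back the restrictions that the looser combined rule no longer enforces.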
subscript : ( Integer |
qualified_data_name (( "+" | "-" ) Integer)? |
index_name (( "+" | "-" ) Integer )?)+;
try.g:147: warning: nondeterminism upon
try.g:147: k==1:AlphabeticUserDefinedWord
try.g:147: between alts 2 and 3 of block
The problem here is that both qualified_data_name (alt 2) and
index_name (alt 3) can start with AlphabeticUserDefinedWord, so one
token of lookahead cannot choose between them.
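To see why the warning names alts 2 and 3, here is a small Python sketch of the k==1 check. The lookahead sets are assumptions read off the warning text (the real rules are in try.g), and this is a simplification of what ANTLR actually computes.

```python
# One-token lookahead sets for the three alternatives of "subscript".
# Assumed from the warning message; the real definitions are in try.g.
FIRST = {
    1: {"Integer"},                    # alt 1: Integer
    2: {"AlphabeticUserDefinedWord"},  # alt 2: qualified_data_name ...
    3: {"AlphabeticUserDefinedWord"},  # alt 3: index_name ...
}

def k1_conflicts(first_sets):
    """Return (alt_a, alt_b, shared_tokens) for every pair of alternatives
    whose one-token lookahead sets overlap."""
    alts = sorted(first_sets)
    found = []
    for i, a in enumerate(alts):
        for b in alts[i + 1:]:
            shared = first_sets[a] & first_sets[b]
            if shared:
                found.append((a, b, shared))
    return found
```

Calling k1_conflicts(FIRST) reports the pair (2, 3) sharing AlphabeticUserDefinedWord. If the two names really are the same token sequence, merging alts 2 and 3 into one alternative and telling them apart later would remove the overlap.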
You have lots of problems with this grammar, in both the lexer
and the parser.
For instance: the character "9" surrounded by whitespace will match
the token rules "Numeric", "LevelNumber", "PicChar", and "Integer".
How is the lexer supposed to know which token is the correct one?
You've hidden the nondeterminisms behind syntactic predicates,
so you don't see the warnings, but the lexer will test the alternatives
in order and always return the first match, "Numeric".
LiteralOrLevelNumberOrAlphabeticUserDefinedWordOrPicCharOrPunctuationOrCurrencyOrIntegerOrZeroOrWSOrComment
:
(NonNumeric )=>NonNumeric {$setType(Literal); }
| (Numeric)=>Numeric {$setType(Literal); }
| (LevelNumber)=>LevelNumber {$setType(LevelNumber); }
| (AlphabeticUserDefinedWord)=>AlphabeticUserDefinedWord
{$setType(AlphabeticUserDefinedWord);}
| (PicChar)=>PicChar {$setType(PicChar);}
| (Punctuation)=>Punctuation {$setType(Punctuation);}
| (Currency)=>Currency {$setType(Currency);}
| (Integer)=>Integer {$setType(Integer);}
| (Zero)=>Zero {$setType(Zero);}
| (WS)=>WS {$setType(WS);}
| (Comment)=>Comment {$setType(Comment);}
;
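The effect of testing the predicates in order can be imitated with a short Python sketch. The regular expressions below are hypothetical stand-ins (the real token definitions are in try.g and are not shown here), and whole-string matching is a simplification of what a syntactic predicate does.

```python
import re

# Hypothetical patterns, listed in the same order as the predicates above.
ORDERED_RULES = [
    ("Numeric", re.compile(r"[0-9]+")),
    ("LevelNumber", re.compile(r"[0-9]{1,2}")),
    ("PicChar", re.compile(r"[9AXZ]")),
    ("Integer", re.compile(r"[0-9]+")),
]

def all_matches(text):
    """Every rule whose pattern matches the whole text."""
    return [name for name, pattern in ORDERED_RULES if pattern.fullmatch(text)]

def first_match(text):
    """What an ordered chain of predicates returns: the first rule that matches."""
    for name, pattern in ORDERED_RULES:
        if pattern.fullmatch(text):
            return name
    return None
```

With these assumed patterns, all_matches("9") lists all four rules, yet first_match("9") can only ever answer "Numeric". The later alternatives are unreachable for that input.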
Your biggest problem is that you've combined syntactic and semantic
rules. Parsers can usually figure out only the
syntactic rules, not the semantic ones. What is the difference
between an "Integer" and a "LevelNumber"? Syntactically, many
inputs will match both. These are semantic differences,
not syntactic ones.
ANTLR has some ability to add semantic checks, but these
are usually better left to a later phase of processing.
IMHO: You really need to start over and build your grammar thinking
from the syntactic point of view and leave the semantic rules to a
later phase.
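One way to picture such a later phase, as a sketch only: let the lexer emit a single generic NUMBER token and relabel it afterwards from context. The token names and the "a number that starts an entry is a level number" rule are simplifying assumptions for illustration, not a full account of COBOL.

```python
def classify_numbers(tokens):
    """tokens: (type, text) pairs from a purely syntactic lexer.

    Relabels NUMBER tokens by context: a NUMBER that starts an entry
    (the first token, or the one right after a "." terminator) becomes
    LevelNumber; any other NUMBER becomes Integer.
    """
    result = []
    at_entry_start = True
    for kind, text in tokens:
        if kind == "NUMBER":
            kind = "LevelNumber" if at_entry_start else "Integer"
        result.append((kind, text))
        # Only a DOT token opens a new entry.
        at_entry_start = (kind == "DOT")
    return result
```

Given the tokens for "01 CUSTOMER. 05 NAME PIC X(20).", this labels 01 and 05 as level numbers and 20 as an integer, without the lexer ever having to guess.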
Good luck. COBOL is a very hard language to both lex and parse.
Greg
--- Balvinder Singh <bals1978 at hotmail.com> wrote:
> Hi all
>
> I'm attaching a file which contains lexical rules and parsing rules.
> For the parsing rules I'm getting conflicts. The conflicts are not
> between rules, but within a rule. The lexical rules are OK.
>
> I'm stuck on this; any help will be useful.
>
>
> balvinder
>
> ATTACHMENT part 2 application/octet-stream name=try.g