[antlr-interest] Problem with EBNF

Morten Olav Hansen mortenoh at gmail.com
Mon Dec 6 01:31:42 PST 2010


Hi!

Yes, thats basically what i ended up doing.

regionDecl
	scope {
		java.util.Map<String, String> stateMap;
		boolean hasInitial;
		boolean hasFinalState;
		boolean hasHistory;
		boolean hasDeepHistory;
	}
	@init {
		$regionDecl::stateMap = new java.util.LinkedHashMap<String, String>();
		$regionDecl::hasInitial = false;
		$regionDecl::hasFinalState = false;
		$regionDecl::hasHistory = false;
		$regionDecl::hasDeepHistory = false;
	}
	:	'region' ID=Identifier? '{'
			(
					psinitial=psinitialDecl {
						if($regionDecl::hasInitial)
							throw new RuntimeException("Can only have one psinitial in a region.");

						$regionDecl::hasInitial = true;
					}
				|	finalstate=finalstateDecl {
						if($regionDecl::hasFinalState)
							throw new RuntimeException("Can only have one finalstate in a region.");

						$regionDecl::hasFinalState = true;				
					}
				|	pshistory=pshistoryDecl {
						if($regionDecl::hasHistory)
							throw new RuntimeException("Can only have one pshistory in a region.");

						$regionDecl::hasHistory = true;
					}
				|	psdeephistory=psdeephistoryDecl {
						if($regionDecl::hasDeepHistory)
							throw new RuntimeException("Can only have one psdeephistory in
a region.");

						$regionDecl::hasDeepHistory = true;				
					}
				|	states+=stateDecl
				|	transitions+=transitionDecl
			)*
		'}' ';'?

Now I just need to find out how to get the current line, and at what
position in that line the error is (to get a more reasonable error
message), but I'm sure there is some kind of api call for that.

Regards,
Morten

On Mon, Dec 6, 2010 at 12:17 AM, John B. Brodie <jbb at acm.org> wrote:
> Greetings!
>
> On Sun, 2010-12-05 at 21:44 +0100, Morten Olav Hansen wrote:
>> Hi, thanks for your reply.
>>
>> Sorry if it wasn't clear, but the ordering does not matter.
>>
>> The rule you proposed is basically what I had before, and as you say,
>> it allows any number of ps* keywords.
>>
>> What I meant by "rule combinations" (not sure what else to call them)
>> is writing out all possible combinations.
>>
>> (... psinitial=psinitialDecl ... finalstate=finalstateDecl? ...)
>> (... finalstate=finalstateDecl ... psinitial=psinitialDecl? ...), etc.
>>
>> But maybe theres no elegant way of doing this. Maybe adding some
>> region scoped variables would be better, and setting hasInitial, etc,
>> and doing the check in the grammar.
>
> i think you should not try to do this in the parser, but rather keep
> your checking in a separate semantic pass over the AST.
>
> however what you want to do is possible ---- i think, haven't actually
> tried it ---- by using Gated Semantic Predicates. not sure what sort of
> error messages are produced when using them tho --- a reason to keep a
> separate checking pass.
>
> basically use a bunch of flags as you speculate.
>
> something like this (untested):
>
> rule_name_goes_here :
> @init{
>    boolean initPresent = false;
>    boolean finalPresent = false;
>    boolean histPresent = false;
>    boolean deepPresent = false;
>  }
>  'region' ID=identifier? '{' (
>        ( { !initPresent }?=> i=psinitialDecl { initPresent=true; } )
>      | ( { !finalPresent }?=> f=finalStateDecl { finalPresent=true; } )
>      | ( { !histPresent }?=> h=pshistoryDecl { histPresent=true; } )
>      | ( { !deepPresent )?=> d=dpsdeephistoryDecl {deepPresent=true;} )
>      | s+=stateDecl
>      | t+=transitionDecl
>    )* '}' ';'?
>  ;
>
> notice that there are no ? nor * operators inside the outer ( )* loop.
>
> the ?'s are unnecessary because the Gated Predicates deal with the zero
> or one semantics.
>
> no *'s because a * loop inside another * loop introduces syntactic
> ambiguity (e.g. (foo*)* is ambiguous).
>
> hope this helps....
>   -jbb
>
>
>> On Sun, Dec 5, 2010 at 9:22 PM, Kevin J. Cummings
>> <cummings at kjchome.homeip.net> wrote:
>> > On 12/05/2010 12:09 PM, Morten Olav Hansen wrote:
>> >> Hi!
>> >>
>> >> I have a problem with my EBNF I was hoping for a little help with. I
>> >> have a block in my grammar that can contain certain keywords
>> >> zero-or-one times, and other keywords zero-or-many times. My current
>> >> solution is to enable every keyword to be zero-or-many, and then let
>> >> my semantic checker deal with the problem. But I was hoping to solve
>> >> it already on the grammar side, if possible.
>> >>
>> >> The basic block looks like this:
>> >>
>> >>       :       'region' ID=Identifier? '{'
>> >>                       (
>> >>                               psinitial=psinitialDecl?
>> >>                               finalstate=finalstateDecl?
>> >>                               pshistory=pshistoryDecl?
>> >>                               psdeephistory=psdeephistoryDecl?
>> >>                               states+=stateDecl*
>> >>                               transitions+=transitionDecl*
>> >>                       )
>> >>               '}' ';'?
>> >
>> > Why doesn't this work for you?  There is an implied ordering here as in:
>> > if finalstateDecl appears, it must be after the psinitialDecl, and
>> > likewise for the rest.  All stateDecls must appear after any
>> > psinitialDecl, finalstateDecl, pshistoryDecl, psdeephistoryDecl, and
>> > before any transitionsDecls, and all transitionDecls mus appear at the
>> > end after everything else.
>> >
>> > >From what I can see, the ()'s above are completely optional in your
>> > grammar and only are provided for grouping, which in this case, to me,
>> > seems unnecessary.  Did I miss something?
>> >
>> >> And the only solution I have come up with, is to generate every
>> >> possible variant of this grammar, which is quite ugly.
>> >>
>> >> What would be nice, would be something like this:
>> >>
>> >>       :       'region' ID=Identifier? '{'
>> >>                       (
>> >>                               psinitial=psinitialDecl?
>> >>                               finalstate=finalstateDecl?
>> >>                               pshistory=pshistoryDecl?
>> >>                               psdeephistory=psdeephistoryDecl?
>> >>                               states+=stateDecl*
>> >>                               transitions+=transitionDecl*
>> >>                       )*
>> >>               '}' ';'?
>> >
>> > How about this:
>> >
>> >        :       'region' ID=identifier? '{'
>> >                        (
>> >                                psinitial=psinitalDecl
>> >                        |       finalstate=finalstateDecl
>> >                        |       pshistory-pshistoryDecl
>> >                        |       psdeephistory=psdeephistoryDecl
>> >                        |       states+=stateDecl
>> >                        |       transitions+=transitionDecl
>> >                        )*
>> >                '}' ';'?
>> >
>> > But this does not enforce the implied ordering of your first example,
>> > and would allow any number of these to appear in any order.
>> >
>> > You would also have to keep track of (deal with) whether
>> > psinitial/finalstate/pshistory/psdeephistory appear more than once, and
>> > make sure that your states and transitions collect properly.
>> >
>> >> (with * at the end). And for every match to one of the zero-or-one
>> >> rule, it would take it "away", so it can not be matched again. But
>> >> this does not work.
>> >
>> > If you move your ? & * from inside the ()'s to outside, you will want to
>> > remove them from the inside.  You could do this, but it will not enforce
>> > the implied ordering that I see in the first example.
>> >
>> >> Any suggestions on how to solve this? If I have to end up with every
>> >> possible rule combination, then I would probably be better of just
>> >> doing it in the semantic checker as I was doing.
>> >
>> > I don't see any "rule combinations" since the original rule enforces and
>> > ordering to the rules that you probably want to keep....
>> >
>
>
>


More information about the antlr-interest mailing list