[antlr-interest] Grammar does not match a valid sentence
Harald M. Müller
harald_m_mueller at gmx.de
Mon Dec 3 13:01:10 PST 2007
Edson -
I won't go much deeper into your troubles - but couldn't you at least use
different parentheses for that (as I understand it, new) construct of
"operator with parameters"? E.g.
after[[1,10]]
after<<1,10>>
after{{1,10}}
after/1,10/
after`1,10´
or whatever other languages have as grouping tokens ... HTML has <!-- -->
for comments - how about
after<-- 1,10 -->
;-)
But that may not really help you ...
Regards & good luck!
HArald
_____
From: ed.tirelli at gmail.com [mailto:ed.tirelli at gmail.com] On Behalf Of Edson
Tirelli
Sent: Monday, December 03, 2007 9:53 PM
To: harald_m_mueller at gmx.de; antlr-interest at antlr.org
Cc: Mark Proctor; Michael Neale
Subject: Re: [antlr-interest] Grammar does not match a valid sentence
Harald,
Thanks a lot for your answer. I know it probably took you quite some time
to go through that and answer me, because you captured the exact picture of
how the language is today. Our grammar has several non-LL(*) constructions
that were implemented earlier in its life and now we have to deal with
backward compatibility issues.
Anyway, the change I tried to introduce in the grammar was allowing user
defined identifiers (ID token) followed by an optional [] to be used as an
operator. This is to allow things like:
Event( this after[1,10] anotherEvent )
Where "after" is a user defined operator and 1,10 are parameters to it.
Anyway, I know that introducing an identification token (like '~') makes the
grammar work, but users don't want that.
Event( this ~after[1,10] anotherEvent )
So we will have to either have all possible operators hardcoded in the
grammar (as Lexer tokens) or have the users to accept the addition of an
identification token.
Thanks a lot for your feedback.
Edson
I know that using a token to identify such operators
2007/12/3, Harald M. Müller <harald_m_mueller at gmx.de>:
Edson -
I looked into your grammar and played with it a little ... that language is
more or less a horror story.
Essentially, you can say things like
a == 5 && c != 6
but also
a == 5 && != 6
(without the c) and then there are operators that may be names, so you can
say
a == 5 && c 6
(meaning probably a == 5 && a c 6, where c is some user-defined operator);
and then you can have those brackets as in
a[3] == 5 && b[3] == 6
but an operator can ALSO have brackets, so you can have
a[3] == 5 && b[3] 6
(meaning probably a[3] == 5 && a b[3] 6 - whatever the operator b[3] is
meant to be ...) as well as
a[3] == 5 && b[3] c[2] 6
where now the b[3] is a path and c[2] is the operator.
The grammar tries to find a way thru this by having { backtracking=true; }
options in it - but I fear that hte graphs of the LL(*) analysis and the {
backtracking=true; } don't always work the way they are supposed to: Either
because ANTLR simply won't be able to do everything, or because there is
some tiny flaw in the grammar so that there's really no way to parse some of
the constructs.
When you remove the { backtracking=true; } around line 85 (I edited the file
a few times, so that number may be off), ANTLR tells you
error(211): DRL2.g:88:3: [fatal] rule and_restr_connective has non-LL(*)
decision due to recursive rule invocations reachable from alts 1,2. Resolve
by left-factoring or using syntactic predicates or using backtrack=true
option.
warning(200): DRL2.g:88:3: Decision can match input such as
"DOUBLE_AMPER LEFT_PAREN {LEFT_PAREN, NOT, '=='..'!='}" using multiple
alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
Now, this definitely is a horrible thing - a grammar that is "not at all
LL(*)." ...
I would argue that the only way to "repiar" this machinery is to have only
one level of nested expressions, so that eg. DOUBLE_AMPER is only at one
place in the grammar. The core rule for a simple expression might be
something like
simpleexpression: path? operator value
where path is defined in some LL(k) (or LL(*) way) so that it is possible to
decide whether there is path. One step further, one could try
simpleexpression
: (path)=> path operator value
| operator value
;
But I may also be wrong with these suggestions ... only having the full
thing set up will show what's going on.
Regards & good luck
Harald
_____
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Edson Tirelli
Sent: Monday, December 03, 2007 1:53 PM
To: antlr-interest at antlr.org
Cc: Mark Proctor; Michael Neale
Subject: [antlr-interest] Grammar does not match a valid sentence
All,
I have a grammar that is not matching a valid sentence and I don't
understand exactly why. If anyone can shed some light, I would really
appreciate.
To see the problem:
Using: antlr v3.0 and antlrworks 1.1.4:
1. Load the attached grammar in antlrworks
2. Select the interpreter tab and paste the following sentence:
Address( street[1] == "Low" && street[2] == "High" )
3. Use the rule "fact" as an entry point to parse the sentence above.
The above sentence is a valid sentence, but it is raising a
NoViableAltException in the "and_restr_connective" rule, while the expected
behavior is not to match "and_restr_connective", but go back in the call
hierarchy and to match the "and_constr" rule.
If you remove the square brackets, the sentence will work fine
(matching the "and_constr" rule):
Address( street1 == "Low" && street2 == "High" )
Any idea how can I fix that?
Thanks in advance,
[]s
Edson
--
Edson Tirelli
JBoss Drools Core Development
Office: +55 11 3529-6000
Mobile: +55 11 9287-5646
JBoss, a division of Red Hat @ www.jboss.com
--
Edson Tirelli
JBoss Drools Core Development
Office: +55 11 3529-6000
Mobile: +55 11 9287-5646
JBoss, a division of Red Hat @ www.jboss.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.antlr.org/pipermail/antlr-interest/attachments/20071203/8140f883/attachment.html
More information about the antlr-interest
mailing list