[antlr-interest] Ambiguous parse tree generated

Gerald Rosenberg gerald at certiv.net
Tue Oct 30 21:06:31 PDT 2012


Was not intending to compare v3 with v4 for any purpose other than to 
point out that the AST is unambiguous whereas the parse tree is 
ambiguous with respect to element ordering.  I guess time will tell 
whether for any rule of the form

A : ( B ( C )? )+ ;

getting a list of B and a list of C -- and losing the mutual 
correlation information -- is ultimately that trivial. Decomposition 
gets you

A  : ( B C )+  # AltBC
     |  B+         # AltB
     ;

With more elements the decomposition -- both the rule and the collection 
of contexts -- becomes more complex.  The essential expressiveness of 
the natural/semantic representation of the rule is obscured.
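
To make the lost correlation concrete, here is a minimal sketch in plain Python. It does not use the ANTLR runtime: the (type, text) tuples and the function name are hypothetical stand-ins for the ordered children of a rule context. It assumes only that the children can be visited in source order. It shows two different inputs that produce identical per-type lists, and how an ordered walk recovers the B/C pairing.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for a parse-tree leaf: (token_type, text).
Token = Tuple[str, str]


def pair_b_with_optional_c(children: List[Token]) -> List[Tuple[str, Optional[str]]]:
    """Walk children in source order and pair each B with the C that
    directly follows it, if any, mirroring ( B ( C )? )+ ."""
    pairs: List[Tuple[str, Optional[str]]] = []
    for token_type, text in children:
        if token_type == "B":
            pairs.append((text, None))
        elif token_type == "C":
            # A C is only legal right after a B that has no C yet.
            if not pairs or pairs[-1][1] is not None:
                raise ValueError("C without a preceding unpaired B")
            pairs[-1] = (pairs[-1][0], text)
        else:
            raise ValueError("unexpected token type: " + token_type)
    return pairs


def per_type_lists(children: List[Token]) -> Tuple[List[str], List[str]]:
    """Model the flat view: one list of B and one list of C."""
    bs = [text for kind, text in children if kind == "B"]
    cs = [text for kind, text in children if kind == "C"]
    return bs, cs


# Two different inputs that both match ( B ( C )? )+ .
first = [("B", "b1"), ("C", "c1"), ("B", "b2"), ("B", "b3"), ("C", "c2")]
second = [("B", "b1"), ("B", "b2"), ("C", "c1"), ("B", "b3"), ("C", "c2")]

# The flat lists cannot tell the two inputs apart ...
same_flat_view = per_type_lists(first) == per_type_lists(second)

# ... but the ordered walk can.
first_pairs = pair_b_with_optional_c(first)
second_pairs = pair_b_with_optional_c(second)
```

In the first input c1 belongs to b1; in the second it belongs to b2. Only the ordered walk distinguishes the two, which is the information the separate lists drop.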

I have trouble remembering what I had for breakfast, so maintaining code 
more than a week old requires every consistency and clarifying aid 
possible -- with time you will wind up having the same problem, too ;)

Sam raised the issue of performance.  I did not.

On 10/30/2012 7:46 PM, Jim Idle wrote:
> However AST rewrites in v3 were very slow, whereas rule decomposition
> results in small 'methods'/collections of logic, which are inlined by
> compilers and JITs. I think it is more of a matter of adjusting to v4 than
> worrying about v3 comparisons to be honest :)
>
> Jim
>
> -----Original Message-----
> From: Gerald Rosenberg [mailto:gerald at certiv.net]
> Sent: Wednesday, October 31, 2012 4:04 AM
> To: jimi at temporal-wave.com
> Cc: ANTLR-Interest Interest
> Subject: Re: [antlr-interest] Ambiguous parse tree generated
>
> Yes, and that is the workaround I am using now.  Sorry if I was not
> clear.
>
> Resorting to rule decomposition unfortunately greatly increases the number
> of enter/exits and the depth of what was, in v3, AST rewrites.
> Was hoping I was missing some way to mark or label the rule elements to
> remove the ambiguity.
>
> On 10/30/2012 1:06 AM, Jim Idle wrote:
>> At a guess:
>>
>> style : di* hi*;
>>
>> di: Dot Identifier ;
>> hi: Hash Identifier;
>>
>> in other words did you try creating rules for the semantically
>> distinct pieces?
>>
>> Jim
>>
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org
>> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Gerald
>> Rosenberg
>> Sent: Tuesday, October 30, 2012 3:59 PM
>> To: antlr-interest
>> Subject: [antlr-interest] Ambiguous parse tree generated
>>
>> I have a rule
>>
>> style  :  ( Dot Identifier )* ( Hash Identifier )* ;
>>
>> AntlrV4 generates a context with a list of Dot, a list of Hash, and a
>> list of Identifier.  While both Identifiers are syntactically
>> identical, they are semantically distinct.  In this particular case,
>> the list of Dot can be used to partition the list of Identifier.
>>
>> However, if I change the rule to the preferred form
>>
>> style  :   ( Dot Identifier ( Hash Identifier )? )*  ;
>>
>> I still get the same single lists of Dot, Hash, and Identifier. Only
>> way to determine the Dot vs Hash associated Identifiers is by
>> inspection of the token stream?  Adding labels did not change the
>> generated code.
>> What am I missing?
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe:
>> http://www.antlr.org/mailman/options/antlr-interest/your-email-address



