[antlr-interest] Ambiguous parse tree generated

Gerald Rosenberg gerald at certiv.net
Tue Oct 30 18:35:51 PDT 2012


Yes, its that "in the order they were parsed" that is the issue. That 
would imply a list of lists per context to preserve ordering. Kinda what 
I was hoping for.

The problem with rule decomposition is that it non-trivially increases 
the complexity of processing the parse tree.  For example, the Java 
grammar is less than about 30 rules.  Yet the number of distinct 
contexts, factoring in decomposition and use of the # alt rule markers 
(including refactoring to ensure that the alts marked are all outer), is 
very nearly 200.  Not a desirable direction.

In the rules that I presented, the elements are all terminals, so 
examining the token stream is just a single look back.  Changing the 
lexer to produce Identifier : DotIdentifier | HashIdentifier ; tokens 
would also work.

A more general solution would require a token to keep a reference to its 
enclosing scope, including its prior peer element and its closure.  
Should be able to do that using custom tokens and actions to explicitly 
capture scope state in the initial parse.  Any reason this approach 
would not work/would not be desirable?

On 10/30/2012 4:07 PM, Sam Harwell wrote:
> In V3, a rewrite with multiple root operators behaves like individual rules in v4's parse trees. While the nodes in an AST produced by a V3 rule can have a complicated shape, the node created for a V4 rule is always a single node for the rule with a flat list of children (rule contexts and terminals) in the order they were parsed. Rule decomposition is the way to give your trees the shape you want, and really shouldn't cause a performance problem in v4.
>
> Sam
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Gerald Rosenberg
> Sent: Tuesday, October 30, 2012 3:04 PM
> To: jimi at temporal-wave.com
> Cc: ANTLR-Interest Interest
> Subject: Re: [antlr-interest] Ambiguous parse tree generated
>
> Yes, and that is the work around I am using now.  Sorry if I was not clear.
>
> Resorting to rule decomposition unfortunately greatly increases the number of enter/exits and the depth of what was, in v3, AST rewrites.
> Was hoping I was missing some way to mark or label the rule elements to remove the ambiguity.
>
>
>




More information about the antlr-interest mailing list