[antlr-interest] More on TokenWithIndex

Monty Zukowski monty at codetransform.com
Wed Oct 27 08:00:27 PDT 2004


On Oct 26, 2004, at 9:20 PM, Paul J. Lucas wrote:

>
> 	I need to preserve all original tokens.  I've been looking at
> 	Terence's TokenWithIndex approach.  Consider this:
>
> 		declareThing
> 		    : DECLARE!
> 		        ( baseURIDecl
> 		        | functionDecl
> 		        | // ...
> 		        )
> 		    ;
>
> 		baseURIDecl
> 		    : b:BASE_URI^ uri=stringLiteral
> 		        {
> 			    #b.setType( BASE_URI_DECL );
> 		        }
> 		    ;
>
> 	i.e., this language has a bunch of "declare" statements each of
> 	which begin with the keyword "declare".  Not surprisingly,
> 	"declare" has been left-factored.
>
> 	I want each token in the generated AST to have min/max indices
> 	into a list of all tokens, but I want *all* the min/max to
> 	include factored tokens as well: in this case, I want the min
> 	index for BASE_URI_DECL to be that of the DECLARE.
>
> 	A practical application of this (i.e., why I want this) is to do
> 	something that many IDE editors do: if there is an error in a
> 	statement, I want to underline the entire statement with a red
> 	squiggly line.  For the case at hand, that includes "declare".
>
> 	As written, Terence's solution will not include any left-
> 	factored token indices.  So the question is: what's a good way
> 	to get what I want?
>
> 	One way is to somehow pass the left-factored token "down" to the
> 	other rules that can then obtain its index and set their min
> 	accordingly.
>
> 	A similar problem also occurs with discarded tokens, e.g.,
> 	'(' and ')' in parenthesizedExpr.  If you were to have:
>
> 		(3 + 4)
> 		0123456
>
> 	I would want the min/max for '+' to be 0/6 and not 1/5.  For
> 	this case, one could set the min/max explicitly easily since
> 	it's all in the same rule, i.e.:
>
> 		parenthesizedExpr
> 		    : l:LPAREN! e:expr r:RPAREN!
> 		    	{
> 			    #e.setMinMax( l.getMin(), r.getMax() );
> 			}
> 		    ;
>
> 	(Or something like that.)  But is there something better/easier?
>
> 	- Paul
>

Well, for your DECLARE, you could use DECLARE^ instead of DECLARE! so
that the keyword stays in the tree as a root node.  Your line squiggler
could then keep a set of node types to bump up to when squiggling.  When
you find a BASE_URI node with a problem, you walk up your doubly linked
tree (one with parent links) until you meet a type in the squiggle set,
and then use that node as the root to compute the min/max.
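
A rough sketch of that walk in Java, just to make the idea concrete.
Everything here is made up for illustration: the Node class, the token
type constants, and the squiggleRoot/bounds helpers are not part of
ANTLR or of TokenWithIndex, and the sketch assumes every node carries
the index of its own token plus a parent link.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class SquiggleWalk {
    // Hypothetical token types; a real parser would use its generated constants.
    static final int DECLARE = 1, BASE_URI = 2, STRING_LITERAL = 3;

    /** Minimal doubly linked tree node that remembers its token's index. */
    static class Node {
        final int type;
        final int tokenIndex;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(int type, int tokenIndex) {
            this.type = type;
            this.tokenIndex = tokenIndex;
        }

        /** Attach a child and set its parent link. */
        Node add(Node child) {
            child.parent = this;
            children.add(child);
            return this;
        }
    }

    /** Walk up from the offending node until a type in the squiggle set is met. */
    static Node squiggleRoot(Node start, Set<Integer> squiggleTypes) {
        Node cur = start;
        while (cur.parent != null && !squiggleTypes.contains(cur.type)) {
            cur = cur.parent;
        }
        return cur; // falls back to the tree root if no type matches
    }

    /** Compute {min, max} token index over the subtree rooted at n. */
    static int[] bounds(Node n) {
        int min = n.tokenIndex;
        int max = n.tokenIndex;
        for (Node c : n.children) {
            int[] b = bounds(c);
            min = Math.min(min, b[0]);
            max = Math.max(max, b[1]);
        }
        return new int[] { min, max };
    }
}
```

With a tree shaped like (DECLARE (BASE_URI stringLiteral)) and DECLARE
in the squiggle set, an error reported on the string literal would be
squiggled from the DECLARE token through the literal.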

Another approach would be to keep a stack of "interesting" regions.
Upon meeting DECLARE you push a squiggle bounds object onto the stack;
at that point you only know the beginning, so you set the end bound
when you reach it.  Later, when the tree is done, you could migrate the
correct bounds into the tree if you want them there.  Otherwise just
keep the squiggle bounds objects as separate objects that the nodes
reference, and remember to copy those references around when
manipulating the tree.
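
Here is a minimal sketch of such a region stack in Java.  The class and
method names (SquiggleRegions, Bounds, begin, end) are invented for this
example and are not ANTLR API; the sketch assumes the parser's actions
can get at the index of the token just matched and call begin/end at
the right moments.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class SquiggleRegions {
    /** Hypothetical bounds object: indices of a region's first and last tokens. */
    static class Bounds {
        final int min;
        int max = -1; // unknown until the region is closed

        Bounds(int min) {
            this.min = min;
        }
    }

    private final Deque<Bounds> open = new ArrayDeque<>();
    private final List<Bounds> closed = new ArrayList<>();

    /** Call when the token that opens a region (e.g. DECLARE) is matched. */
    Bounds begin(int tokenIndex) {
        Bounds b = new Bounds(tokenIndex);
        open.push(b);
        return b;
    }

    /**
     * Call when the last token of the innermost open region is matched.
     * Throws NoSuchElementException if no region is open.
     */
    Bounds end(int tokenIndex) {
        Bounds b = open.pop();
        b.max = tokenIndex;
        closed.add(b);
        return b;
    }

    /** Regions whose end bound is known, in the order they were closed. */
    List<Bounds> closedRegions() {
        return closed;
    }
}
```

A rule action would keep the Bounds returned by begin(), hang it on the
node it builds, and call end() once the statement's last token is
consumed.  Since the node only holds a reference, the bounds stay
correct even though the end is filled in later.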

Monty



 