[antlr-interest] Question about preserving spaces in quoted strings

Penningroth, Mark mpenningroth at cincom.com
Thu Jul 2 12:27:47 PDT 2009


I am still confused.

Everything I try seems to break something else.

I have a grammar and a tree grammar.

When I print out the tree I see the whitespace that is embedded in my
string.

I can't figure out what I have to do, either in the tree construction or
in the tree walker, to get to the hidden token channel.

Here is a snippet of the grammar that produces the tree:


mdx_statement
	:	
	'MDX(' mdxs=mdx_stmt ')' -> ^(MDX_FUNC $mdxs)
	;
	
mdx_stmt 
	:	 dqcon -> dqcon;
dqcon
	: '"'
	(  (~'"')=>
	   (   ('\\')=>'\\'.
	   | .
	   )
	 )*
	 '"';
WS : (' '|'\t'|'\n'|'\r')+ {$channel=HIDDEN;} ;

When I parse something like:

MDX("a g g g g g g gggggggggggggggggggggggggx")

And then print the tree I see this:

(MDX_FUNC " a g g g g g g gggggggggggggggggggggggggx ")

In my tree grammar I have this (I left out some things, but you get the
idea):

@after {
   statements.Add($select_statement::stmt);
   if ($mdxs.text != null) {
      $select_statement::stmt.TheMDX = $mdxs.text;
   }
}
	^(MDX_FUNC mdxs=mdx_stmt)
	;
mdx
	: dqcon 
	;

I get an empty string.  I know there is something I am missing.  In the
debugger I see the input text.  Is there a simple way I can get the
string representation of the node?

Thanks,
  Mark
-----Original Message-----
From: John B. Brodie [mailto:jbb at acm.org] 
Sent: Thursday, July 02, 2009 11:54 AM
To: Penningroth, Mark
Cc: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Question about preserving spaces in quoted
strings

Greetings!

On Thu, 2009-07-02 at 10:35 -0400, Penningroth, Mark wrote:
> I have the following in my grammar:
>
> sqcon
>             : '\'' ( options {greedy=false;} : .)* '\''
> ;
>
> The intent is to get a single text node with everything between the
> single quotes.
>
> When I parse 'Los Angeles' I lose the space.

your sqcon rule is a Parser rule (because it begins with a lower case
letter).

have you tried making it a Lexer rule? by up-casing (at least) the first
letter.

and note that when you make it into a Lexer rule, the text of the token
will include the leading and trailing quotes, so you may need to
substring them away.
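
A minimal sketch of that suggestion applied to the double-quoted case
from this thread, assuming ANTLR 3 syntax. The token name DQCON and the
label s are hypothetical and do not appear in the original grammar, and
the two mdx_statement rules belong to different grammar files (the
parser grammar and the tree grammar respectively).

```
// Combined grammar: a lexer rule (upper-case name) so the whole literal,
// embedded spaces included, is matched as a single token. The characters
// between the quotes are consumed here and never reach the WS rule.
DQCON
    :   '"' ( '\\' . | ~('\\' | '"') )* '"'
    ;

// Combined grammar: the token itself becomes the child of MDX_FUNC.
mdx_statement
    :   'MDX(' DQCON ')' -> ^(MDX_FUNC DQCON)
    ;

// Tree grammar: label the token node to read its text.
// $s.text still includes the leading and trailing quotes.
mdx_statement
    :   ^(MDX_FUNC s=DQCON)
    ;
```

Removing the quotes is left to the target language in an action,
for example by taking the substring from index 1 to length minus 1.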

Hope this helps
   -jbb



