[antlr-interest] C runtime and aggregation in the parser

Richard Thrippleton richard.thrippleton at progress.com
Wed Jul 14 04:03:40 PDT 2010


Nathan Eloe wrote:
> Hello again,
> I'm writing about a very specific problem I'm having with the C runtime.
> One of the restrictions of the grammar I'm writing is that strings may
> contain some specific characters (such as # or %), but other rules have
> these as operators, and as such I can't just make a token to catch all
> strings.  The only way around this I've found has been aggregating
> allowable strings in the parser.
> Example:
> ns_str_agg
>   : nsp=ns_str_part nsap=ns_str_aggp -> STRING[$nsp.text+$nsap.text]
>   | ns_str_part
>   | rw=res_word_str nsap=ns_str_aggp -> STRING[$rw.text+$nsap.text];
> 
> This worked just fine when I was using the java runtime (so I could use
> the debugger and gunit to test my grammar).  When moving to the C
> runtime, I get the following error (and lots of them):
> 
> bashastParser.c: In function 'ns_str_agg':
> bashastParser.c:42343: error: invalid operands to binary + (have
> 'uint8_t *' and 'pANTLR3_STRING')
> 
> I've attached the grammar to this email (I am attempting to recreate the
> Bash grammar).  Is there some way around this or some way to correctly
> do this kind of aggregation with the C runtime?

Anything inside the "[ ... ]" of a token constructor is native
target-language code (Java, or C in your case); all ANTLR does to it is
expand the $-prefixed expressions.

In Java you were fine because the Java backend of ANTLR expands 
$something.text to be an expression of type String, and Java overloads the 
operator '+' to work as you'd expect.

In C, the $something.text expressions get expanded to expressions that
give you a pointer to an ANTLR3_STRING[1], and C does not define the '+'
operator for two pointers, so the compiler rejects the expression. Look at
the functions in
http://www.antlr.org/api/C/struct_a_n_t_l_r3___s_t_r_i_n_g__struct.html if
you want to manipulate ANTLR3_STRINGs.

My own preferred approach is to use a C++ compiler and write a function
that turns an ANTLR3_STRING into a std::string, so I can do things like
	STRING[antlrStr($nsp.text) + antlrStr($nsap.text)]
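
For illustration, here is a minimal sketch of what such an antlrStr helper
might look like. It is not taken from my own code. To keep it compilable
without the ANTLR headers it uses a stand-in struct (StandInAntlrString,
a hypothetical name) that models only a byte buffer and a length; in a real
grammar you would include antlr3.h and take a pANTLR3_STRING instead. It
also assumes an 8-bit string encoding.

```cpp
#include <cstdint>
#include <string>

// Stand-in for the runtime's string type so this compiles without antlr3.h.
// ASSUMPTION: only the byte buffer and length fields are modelled here; a
// real grammar would include antlr3.h and use pANTLR3_STRING instead.
struct StandInAntlrString {
    std::uint8_t *chars;  // character data (8-bit encoding assumed)
    std::uint32_t len;    // number of characters stored in chars
};
typedef StandInAntlrString *pStandInAntlrString;

// Hypothetical helper matching the antlrStr(...) call above: copies the
// runtime string into a std::string so that operator+ concatenates.
std::string antlrStr(pStandInAntlrString s) {
    if (s == nullptr || s->chars == nullptr) {
        return std::string();  // treat a missing string as empty
    }
    // Use the explicit length rather than relying on a NUL terminator.
    return std::string(reinterpret_cast<const char *>(s->chars), s->len);
}
```

Note that the result of the '+' is then a std::string. Depending on what the
generated code does with the bracketed expression, you may need to hand it a
character pointer (for example via .c_str()) rather than the std::string
itself; the generated C will show what it expects.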

Richard

[1] - I'm not sure why one of them seems to be expanded into a uint8_t*
in this case. I'd strongly encourage looking at the generated C.
-- 
\o/


