[antlr-interest] Multiplication and pointers in C-style language

Wed Apr 9 08:51:42 PDT 2008

I worked with the C grammar before using another approach. The C grammar 
in ANTLR typically builds a type symbol table and parses input 
differently depending on whether or not it recognises an identifier as 
a type name. This looks like one of those cases. C does not have a 
clean grammar, and parsing it properly requires semantic predicates to 
determine which path to follow. The example C parser in ANTLR contains 
code to do all this. Even the ANSI grammar definitions in yacc/lex 
refer to using type information to disambiguate. In the end, I came to 
the conclusion that C is basically a nasty language to parse, although 
it has the one virtue that it is not quite as hard to parse as C++.

If you need both pointers and multiplication in C syntax, I suspect 
you're going to need to build a (scoped) type symbol table and use 
semantic predicates. Then, if the first identifier is a type, you can 
treat the statement as a pointer declaration; if not, as a 
multiplication. As Jim said, the code to do this is in the C ANTLR 
example.
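
To make the idea concrete, here is a minimal sketch in Python of a 
scoped symbol table and the decision it drives. It is not the code from 
the ANTLR C example; the names ScopedSymbols and classify are 
hypothetical, and the token list stands in for what a real lexer would 
produce. In an ANTLR grammar the is_type_name check would sit inside a 
semantic predicate rather than in a standalone function.

```python
class ScopedSymbols:
    """Stack of scopes mapping a name to True (type) or False (variable).

    Hypothetical helper for illustration; not part of ANTLR.
    """

    def __init__(self):
        self._scopes = [{}]  # the global scope is always present

    def push(self):
        """Enter a new block scope, e.g. on '{'."""
        self._scopes.append({})

    def pop(self):
        """Leave the innermost block scope, e.g. on '}'."""
        self._scopes.pop()

    def declare(self, name, is_type):
        """Record a name in the innermost scope."""
        self._scopes[-1][name] = is_type

    def is_type_name(self, name):
        """Look the name up from the innermost scope outwards."""
        for scope in reversed(self._scopes):
            if name in scope:
                return scope[name]
        return False  # unknown identifiers are not treated as types


def classify(tokens, symbols):
    """Classify a statement of the form ID '*' ID ';'."""
    first, star, _second, semi = tokens
    if star != "*" or semi != ";":
        raise ValueError("expected a statement of the form ID * ID ;")
    if symbols.is_type_name(first):
        return "pointer declaration"
    return "multiplication"


symbols = ScopedSymbols()
symbols.declare("Object", is_type=True)  # assumed user-declared type

print(classify(["Object", "*", "obj", ";"], symbols))  # pointer declaration
print(classify(["i", "*", "j", ";"], symbols))         # multiplication
```

The scope stack matters because an inner block can declare a variable 
with the same name as an outer type, in which case the same tokens 
should be read as a multiplication until that block ends.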

All the best
Stuart

David Olsson wrote:
> Jim Idle wrote:
>>
>> Why not just download the examples tar from the download page and 
>> inspect the ANSI C grammar in there? In fact, you could just steal 
>> the rules if you want.
>>
>> Jim
>>
> Actually, I have been looking quite a lot at the C example, but as a 
> novice when it comes to ANTLR and constructing languages, I don't 
> really see how and why it works. Perhaps I should rephrase my 
> question: can anyone explain how and why the ANSI C example grammar 
> works with the multiplication and pointer situation described below? :-)
>
> Best regards
> David Olsson
>
>>
>> From: antlr-interest-bounces at antlr.org 
>> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of David Olsson
>> Sent: Tuesday, April 08, 2008 1:10 PM
>> To: antlr-interest at antlr.org
>> Subject: [antlr-interest] Multiplication and pointers in C-style 
>> language
>>
>> I am a novice when it comes to language and compiler construction and 
>> am using ANTLR to create a C-style language which, among other things, 
>> will support pointers, using standard C syntax (*), and 
>> multiplication. The problem is that ANTLR may be unable to 
>> differentiate between a pointer declaration and a multiplication 
>> statement. Consider the following statement:
>>
>> ID * ID;
>>
>> where /ID/ denotes a token corresponding to an identifier. The above 
>> statement can be either a declaration of a pointer to a user-declared 
>> type (e.g. /Object *obj/) or a multiplication of two variables 
>> (e.g. /i * j/). Does anyone have any input on how to write a good 
>> grammar to handle this type of situation?
>>
>> Currently my parser grammar looks like the following (very simplified):
>>
>> block:                    '{' (variableDecl | statement)* '}';
>> variableDecl:             'const'? type ('*'* 'const'?)? ID '=' expression ';';
>> statement:                block | expression;
>> expression:               additiveExpression ('=' expression)?;
>> additiveExpression:       multiplicativeExpression (('+' | '-') multiplicativeExpression)*;
>> multiplicativeExpression: primary (('*' | '/') primary)*;
>> primary:                  literal | ID | '(' expression ')';
>> literal:                  DECIMALLITERAL | REALLITERAL;
>>
>> Thanks a lot for any input!
>>
>> Best regards
>> David Olsson
>>