[antlr-interest] Multiplication and pointers in C-style language
Stuart Watt
swatt at infobal.com
Wed Apr 9 08:51:42 PDT 2008
I worked with the C grammar before using another approach. The C grammar
in ANTLR typically builds a type symbol table, and parses input
differently when it recognises a type identifier than when it
doesn't. This looks like one of those cases. C does not have a clean grammar,
and parsing it properly requires semantic predicates to determine which
path to follow. There is code in the example C parser in ANTLR to do all
this. Even the ANSI grammar definitions in yacc/lex contain small
references to using type information to disambiguate. In the end, I came
to the conclusion that C is basically a nasty language to parse,
although it has the one virtue that it is not quite as hard to parse as C++.

If you need pointers and multiplication in C syntax, I suspect you're
going to need to build a (scoped) type symbol table and semantic
predicates. Then, if the first identifier is a type, you can treat the
statement as a pointer declaration; if not, as a multiplication. As Jim
said, the code to do this is in the C ANTLR example.
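
To make the idea concrete, here is a minimal sketch in Java of a scoped
table of type names and the check a semantic predicate would make. It is
not the code from the ANTLR C example; the names TypeTable, enterScope,
exitScope, declareType, isTypeName and classify are hypothetical and
chosen only for illustration. It assumes the parser calls enterScope and
exitScope at block boundaries and declareType whenever a user-defined
type is declared.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not taken from the ANTLR C example: a stack of
// scopes, each holding the type names declared in that scope.
class TypeTable {
    private final Deque<Set<String>> scopes = new ArrayDeque<>();

    TypeTable() {
        enterScope(); // the global scope
    }

    // Call when the parser enters a block ('{').
    void enterScope() {
        scopes.push(new HashSet<>());
    }

    // Call when the parser leaves a block ('}').
    void exitScope() {
        scopes.pop();
    }

    // Record a user-declared type name in the innermost scope.
    void declareType(String name) {
        scopes.peek().add(name);
    }

    // True if the name is a type in any enclosing scope.
    // This is the test a semantic predicate would perform.
    boolean isTypeName(String name) {
        for (Set<String> scope : scopes) {
            if (scope.contains(name)) {
                return true;
            }
        }
        return false;
    }

    // Decide how to read "first * second;" from its first identifier.
    String classify(String first) {
        return isTypeName(first) ? "pointer declaration" : "multiplication";
    }

    public static void main(String[] args) {
        TypeTable types = new TypeTable();
        types.declareType("Object");
        System.out.println("Object * obj; -> " + types.classify("Object"));
        System.out.println("i * j; -> " + types.classify("i"));
    }
}
```

In a grammar, the isTypeName call would sit inside a semantic predicate
guarding the rule that matches a type name, so that the declaration path
is only taken when the first identifier is a known type. Real C also
lets a variable in an inner scope hide a type name, which this sketch
does not model.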
All the best
Stuart
David Olsson wrote:
> Jim Idle wrote:
>>
>> Why not just download the examples tar from the download page and
>> inspect the ANSI C grammar in there? In fact, you could just steal
>> the rules if you want.
>>
>> Jim
>>
> Actually, I have been looking quite a lot at the C example, but as a
> novice when it comes to ANTLR and constructing languages, I don't
> really see how and why it works. Perhaps I should rephrase my
> question: can anyone explain how and why the ANSI C example grammar
> handles the multiplication and pointer situation described below? :-)
>
> Best regards
> David Olsson
>
>>
>>
>> *From:* antlr-interest-bounces at antlr.org
>> [mailto:antlr-interest-bounces at antlr.org] *On Behalf Of *David Olsson
>> *Sent:* Tuesday, April 08, 2008 1:10 PM
>> *To:* antlr-interest at antlr.org
>> *Subject:* [antlr-interest] Multiplication and pointers in C-style
>> language
>>
>>
>>
>> I am a novice when it comes to language and compiler construction and
>> am using ANTLR to create a C-style language which, among other things,
>> will support pointers, using standard C syntax (*), and
>> multiplication. The problem is that ANTLR may be unable
>> to differentiate between a pointer declaration and a multiplication
>> statement. Consider the following statement:
>>
>> ID * ID;
>>
>> where /ID/ denotes a token corresponding to an identifier. The above
>> statement can be either a declaration of a pointer to a user-declared
>> type (e.g. /Object *obj/) or a multiplication of two variables (e.g. /i *
>> j/). Does anyone have any input on how to write a good grammar to
>> handle this type of situation?
>>
>> Currently my parser grammar looks like the following (very simplified):
>>
>> block: '{' (variableDecl | statement)* '}';
>> variableDecl: 'const'? type ('*'* 'const'?)? ID '='
>> expression ';';
>> statement: block | expression;
>> expression: additiveExpression ('=' expression)?;
>> additiveExpression: multiplicativeExpression (('+' | '-')
>> multiplicativeExpression)*;
>> multiplicativeExpression: primary (('*' | '/') primary)*;
>> primary: literal | ID | '(' expression ')';
>> literal: DECIMALLITERAL | REALLITERAL;
>>
>> Thanks a lot for any input!
>>
>> Best regards
>> David Olsson
>>