[antlr-interest] Disambiguation problem

Mon Nov 12 12:56:44 PST 2007

Hi,

I'm not sure how I am supposed to implement the following section of the
C# standard (adapted to ANTLR syntax):

"Cast expressions:

A cast_expression is used to explicitly convert an expression to a given
type.

cast_expression
   :    '(' type ')' unary_expression
   ;

A cast_expression of the form (T)E, where T is a type and E is a
unary_expression, performs an explicit conversion (§6.2) of the value of
E to type T. If no explicit conversion exists from E to T, a
compile-time error occurs. Otherwise, the result is the value produced
by the explicit conversion. The result is always classified as a value,
even if E denotes a variable.

The grammar for a cast_expression leads to certain syntactic
ambiguities. For example, the expression (x)-y could either be
interpreted as a cast_expression (a cast of -y to type x) or as an
additive_expression combined with a parenthesized_expression (which
computes the value x - y).

To resolve cast_expression ambiguities, the following rule exists: A
sequence of one or more tokens (§2.3.3) enclosed in parentheses is
considered the start of a cast_expression only if at least one of the
following are true:

• The sequence of tokens is correct grammar for a type, but not for an
expression.

• The sequence of tokens is correct grammar for a type, and the token
immediately following the closing parentheses is the token “~”, the
token “!”, the token “(”, an identifier (§2.4.1), a literal (§2.4.4), or
any keyword (§2.4.3) except as and is.

The term “correct grammar” above means only that the sequence of tokens
must conform to the particular grammatical production. It specifically
does not consider the actual meaning of any constituent identifiers. For
example, if x and y are identifiers, then x.y is correct grammar for a
type, even if x.y doesn’t actually denote a type.

From the disambiguation rule it follows that, if x and y are
identifiers, (x)y, (x)(y), and (x)(-y) are cast_expressions, but (x)-y
is not, even if x identifies a type. However, if x is a keyword that
identifies a predefined type (such as int), then all four forms are
cast_expressions (because such a keyword could not possibly be an
expression by itself)."

So how am I supposed to translate this into ANTLR syntax? The first
point seems to require this: "(OPEN_PARENS type)=> cast_expression". The
second point needs to test what comes after the closing parenthesis,
and I'm not sure how to do that either. Do I have to create a function
which scans the input while balancing parentheses? I also don't
understand why the first point isn't a superset of the second point,
since I don't see how the second point can be true without the first
also being true.
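
To make the question concrete, here is a rough Java sketch of the kind
of scanning function I mean. It is only an illustration under heavy
assumptions: tokens are plain strings, a type is either a predefined
type keyword or a dotted identifier name, and the only expression
recognized inside the parentheses is a dotted identifier name. The
keyword list is incomplete. All names (CastLookahead, startsCast,
looksLikeType, and so on) are made up for this sketch, and none of it
is ANTLR API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CastLookahead {
    static final Set<String> PREDEFINED_TYPES = new HashSet<>(Arrays.asList(
        "bool", "byte", "char", "decimal", "double", "float", "int", "long",
        "object", "sbyte", "short", "string", "uint", "ulong", "ushort"));

    // Assumption: an incomplete keyword list, just enough for the sketch.
    static final Set<String> OTHER_KEYWORDS = new HashSet<>(Arrays.asList(
        "as", "is", "new", "this", "base", "typeof", "sizeof", "checked",
        "unchecked", "default", "null", "true", "false"));

    static boolean isKeyword(String t) {
        return PREDEFINED_TYPES.contains(t) || OTHER_KEYWORDS.contains(t);
    }

    static boolean isIdentifier(String t) {
        return t.matches("[A-Za-z_][A-Za-z0-9_]*") && !isKeyword(t);
    }

    static boolean isLiteral(String t) {
        if (t.isEmpty()) return false;
        char c = t.charAt(0);
        return Character.isDigit(c) || c == '"' || c == '\'';
    }

    // Matches: identifier ('.' identifier)*
    static boolean isDottedName(List<String> ts) {
        if (ts.size() % 2 == 0) return false; // also rejects the empty list
        for (int i = 0; i < ts.size(); i++) {
            boolean ok = (i % 2 == 0) ? isIdentifier(ts.get(i)) : ts.get(i).equals(".");
            if (!ok) return false;
        }
        return true;
    }

    // Simplification: no generics, arrays, pointers or nullable types.
    static boolean looksLikeType(List<String> ts) {
        return (ts.size() == 1 && PREDEFINED_TYPES.contains(ts.get(0))) || isDottedName(ts);
    }

    // Simplification: only a dotted name counts as an expression here.
    static boolean looksLikeExpression(List<String> ts) {
        return isDottedName(ts);
    }

    // Index of the ')' that balances the '(' at position open, or -1.
    static int matchingParen(List<String> ts, int open) {
        int depth = 0;
        for (int i = open; i < ts.size(); i++) {
            if (ts.get(i).equals("(")) depth++;
            else if (ts.get(i).equals(")") && --depth == 0) return i;
        }
        return -1;
    }

    // Applies the two bullet points of the disambiguation rule.
    static boolean startsCast(List<String> ts, int open) {
        if (open >= ts.size() || !ts.get(open).equals("(")) return false;
        int close = matchingParen(ts, open);
        if (close < 0) return false;
        List<String> inner = ts.subList(open + 1, close);
        if (!looksLikeType(inner)) return false;
        if (!looksLikeExpression(inner)) return true;   // first bullet
        if (close + 1 >= ts.size()) return false;
        String next = ts.get(close + 1);                // second bullet
        return next.equals("~") || next.equals("!") || next.equals("(")
            || isIdentifier(next) || isLiteral(next)
            || (isKeyword(next) && !next.equals("as") && !next.equals("is"));
    }

    public static void main(String[] args) {
        System.out.println(startsCast(Arrays.asList("(", "x", ")", "y"), 0));        // true
        System.out.println(startsCast(Arrays.asList("(", "x", ")", "-", "y"), 0));   // false
        System.out.println(startsCast(Arrays.asList("(", "int", ")", "-", "y"), 0)); // true
    }
}
```

In a real grammar the two looksLike... checks would presumably have to
be replaced by trying the actual type and expression rules, which is
the part I don't know how to express in ANTLR.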

I've attached the grammar in question. Thanks in advance for any help!

Johannes

-------------- next part --------------
A non-text attachment was scrubbed...
Name: CSharp3ParserTest.zip
Type: application/octetstream
Size: 9094 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20071112/fdd73766/attachment.bin