[antlr-interest] Apparent problem with dynamic scopes in C target

Daniel Jensen daniel.jensen at level3.com
Mon Mar 30 16:44:28 PDT 2009


Hi folks,

I'm trying to write a grammar where one token, the construct $<num>, is 
valid only in expressions within a certain context and invalid 
everywhere else.  For example, in the expression

    extract($foo, "(\d+)\s+(\d+)\s+(\d+)", $1 + "/" + $2) + ";" + $3

the parser should accept $1 and $2, but the $3 should generate a syntax 
error.  To do this, I define a scope with one member named in_extract.  
in_extract is initialized to 0 when the first entry in the scope stack 
is created, then when additional entries are created, it inherits 
whatever value the enclosing stack had.  When the second comma in the 
extract(...) argument list is recognized, in_extract is set to 1 in the 
current scope.  Finally, a gated semantic predicate using the value of 
in_extract is used to enable or disable the recognition of $<num>.

This all seems to work just as I hoped when I'm generating a Java 
parser, but when I use the C target, it refuses to accept $1, $2 or $3.  
Any clues as to why this might be?  A somewhat stripped down version of 
the grammar that illustrates the problem follows.


grammar MinExpr;

options {
    output = 'AST';
    language = 'C';
    ASTLabelType = pANTLR3_BASE_TREE;
}

scope ExprScope {
    int in_extract;
}

@members {
    int exprScopeDepth = 0;
}

rule    :       expr EOF!;

expr scope ExprScope;
@init {
    if (exprScopeDepth++ == 0)
        $ExprScope::in_extract = 0;
    else
        $ExprScope::in_extract = $ExprScope[-1]::in_extract;
}
        :       alternative (T_OR^ expr)?;

alternative
        :       value (T_CONCAT^ alternative)?;
       
value scope ExprScope;
@init {
    $ExprScope::in_extract = $ExprScope[-1]::in_extract;
}
        :       T_VARIABLE
        |       T_STRING
        |       function_call

        // T_MATCH is only valid inside the 3rd argument to an extract() 
call.
        |       { $ExprScope::in_extract != 0 }?=> T_MATCH
        |       '('! expr ')'!
        ;

function_call
        :       extractFn
        ;

extractFn scope ExprScope;
@init {
    if (exprScopeDepth++ == 0)
        $ExprScope::in_extract = 0;
    else
        $ExprScope::in_extract = $ExprScope[-1]::in_extract;
}
        :       T_EXTRACT^ '('! expr ','! expr ','!
                    {
                        $ExprScope::in_extract = 1;
                    }
                expr ')'!;

WHITESPACE: (' ' | '\t' | '\n')+ { $channel = HIDDEN; };
T_VARIABLE: '$' LETTER (LETTER|DIGIT)*;
T_MATCH: '$' (DIGIT)+;
T_EXTRACT: 'extract';
T_CONCAT: '+';
T_OR: '||';
T_STRING: ('"' NONDQUOTE '"') | ('\'' NONSQUOTE '\'');

fragment NONDQUOTE: (~'"')*;
fragment NONSQUOTE: (~'\'')*;

fragment LETTER: '_' | 'a'..'z' | 'A'..'Z';
fragment DIGIT: '0'..'9';



Thanks for any help!  Suggestions about better ways to do this are also 
welcome.

Daniel


More information about the antlr-interest mailing list