[antlr-interest] Apparent problem with dynamic scopes in C target
Daniel Jensen
daniel.jensen at level3.com
Mon Mar 30 16:44:28 PDT 2009
Hi folks,
I'm trying to write a grammar where one token, the construct $<num>, is
valid only in expressions within a certain context and invalid
everywhere else. For example, in the expression
extract($foo, "(\d+)\s+(\d+)\s+(\d+)", $1 + "/" + $2) + ";" + $3
the parser should accept $1 and $2, but $3 should generate a syntax
error. To do this, I define a dynamic scope with one member, in_extract.
It is initialized to 0 when the first entry on the scope stack is
created; each later entry inherits the value held by the enclosing
entry. When the second comma in the extract(...) argument list is
recognized, in_extract is set to 1 in the current scope entry. Finally,
a gated semantic predicate on the value of in_extract enables or
disables recognition of $<num>.
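To make the intent concrete, here is a minimal C sketch of the scope
behavior I am expecting, written independently of ANTLR. The type and
function names (ExprScopeStack, scope_push, scope_pop,
enter_third_argument, match_allowed) are made up for illustration and
are not part of the ANTLR C runtime or of the generated parser.

```c
#include <assert.h>

#define MAX_DEPTH 64

/* Hypothetical stand-in for the ExprScope stack; NOT the ANTLR runtime. */
typedef struct {
    int in_extract[MAX_DEPTH];
    int depth;                  /* number of live entries on the stack */
} ExprScopeStack;

/* Push a new entry: 0 for the outermost one, otherwise inherit the
 * value held by the enclosing entry. */
void scope_push(ExprScopeStack *s) {
    assert(s->depth < MAX_DEPTH);
    s->in_extract[s->depth] = (s->depth == 0) ? 0 : s->in_extract[s->depth - 1];
    s->depth++;
}

/* Pop the current entry when the rule that pushed it returns. */
void scope_pop(ExprScopeStack *s) {
    assert(s->depth > 0);
    s->depth--;
}

/* Called once the second comma of extract(...) has been recognized. */
void enter_third_argument(ExprScopeStack *s) {
    assert(s->depth > 0);
    s->in_extract[s->depth - 1] = 1;
}

/* The condition the gated predicate is meant to test before T_MATCH. */
int match_allowed(const ExprScopeStack *s) {
    return s->depth > 0 && s->in_extract[s->depth - 1] != 0;
}
```

With this model, an entry pushed after enter_third_argument() sees
in_extract == 1, so $1 and $2 are allowed there, while popping back out
to the outermost entry gives 0 again, so the trailing $3 is rejected.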
The grammar works just as I hoped when I generate a Java parser, but
with the C target the parser refuses to accept $1, $2, or $3.
Any clues as to why this might be? A somewhat stripped down version of
the grammar that illustrates the problem follows.
grammar MinExpr;
options {
output = 'AST';
language = 'C';
ASTLabelType = pANTLR3_BASE_TREE;
}
scope ExprScope {
int in_extract;
}
@members {
int exprScopeDepth = 0;
}
rule : expr EOF!;
expr scope ExprScope;
@init {
if (exprScopeDepth++ == 0)
$ExprScope::in_extract = 0;
else
$ExprScope::in_extract = $ExprScope[-1]::in_extract;
}
: alternative (T_OR^ expr)?;
alternative
: value (T_CONCAT^ alternative)?;
value scope ExprScope;
@init {
$ExprScope::in_extract = $ExprScope[-1]::in_extract;
}
: T_VARIABLE
| T_STRING
| function_call
// T_MATCH is only valid inside the 3rd argument to an extract()
// call.
| { $ExprScope::in_extract != 0 }?=> T_MATCH
| '('! expr ')'!
;
function_call
: extractFn
;
extractFn scope ExprScope;
@init {
if (exprScopeDepth++ == 0)
$ExprScope::in_extract = 0;
else
$ExprScope::in_extract = $ExprScope[-1]::in_extract;
}
: T_EXTRACT^ '('! expr ','! expr ','!
{
$ExprScope::in_extract = 1;
}
expr ')'!;
WHITESPACE: (' ' | '\t' | '\n')+ { $channel = HIDDEN; };
T_VARIABLE: '$' LETTER (LETTER|DIGIT)*;
T_MATCH: '$' (DIGIT)+;
T_EXTRACT: 'extract';
T_CONCAT: '+';
T_OR: '||';
T_STRING: ('"' NONDQUOTE '"') | ('\'' NONSQUOTE '\'');
fragment NONDQUOTE: (~'"')*;
fragment NONSQUOTE: (~'\'')*;
fragment LETTER: '_' | 'a'..'z' | 'A'..'Z';
fragment DIGIT: '0'..'9';
Thanks for any help! Suggestions about better ways to do this are also
welcome.
Daniel