[antlr-interest] Suggestion: syntactic sugar for generateAmbigWarnings = false;

uprightness_of_character andrei at metalanguage.com
Sat Jun 14 00:30:01 PDT 2003


I have a number of rules in which the first rule that matches is the 
one that's needed, and there's a final match for "everything else". In 
a nutshell => generateAmbigWarnings = false. 

For example, suppose you are writing a C-style preprocessor and need 
the code that fetches a macro argument.

The idea is, you need to read up to the first comma, but if there are 
parentheses, curly braces, or square brackets, you will pair them 
properly (commas are allowed inside). For example, "1 (2, a) 3" would 
be a proper argument. 
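
As a rough illustration of that behaviour (this is not ANTLR output), here is a minimal Python sketch that works on characters instead of tokens. The function name, the error handling, and the choice to stop at an unnested closing bracket are assumptions made for the example.

```python
# Hypothetical helper, for illustration only; it is not generated by ANTLR.
OPENERS = {"(": ")", "{": "}", "[": "]"}
CLOSERS = set(OPENERS.values())

def fetch_macro_argument(text, start=0):
    """Return (argument, stop_index), scanning text from start.

    Scanning stops at the first comma that is not nested inside
    (), {} or []. Assumption: an unnested closing bracket also ends
    the argument, as the ')' that closes a macro call would.
    """
    expected = []  # closing brackets still owed, innermost last
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "," and not expected:
            break                      # top-level comma: argument ends
        if ch in OPENERS:
            expected.append(OPENERS[ch])
        elif ch in CLOSERS:
            if not expected:
                break                  # closer with nothing open
            if ch != expected[-1]:
                raise ValueError("mismatched %r at index %d" % (ch, i))
            expected.pop()
        i += 1
    if expected:
        raise ValueError("unbalanced brackets in macro argument")
    return text[start:i], i
```

With this sketch, fetch_macro_argument("1 (2, a) 3, next") returns ("1 (2, a) 3", 10): the comma inside the parentheses is skipped and the scan stops at the top-level comma.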

I came up with the following rule:

fetch_macro_argument
    : 
    (
        balanced_pars 
        | balanced_curlz
        | balanced_squares
        | tok:~(COMMA | LPAR | LCURL | LSQUARE)
    )*
    ;
    
So, a macro argument can consist of a mixture of the following items - 
any set of balanced parens, any set of balanced curly braces, any set 
of balanced square brackets, or anything else that's not a comma.

Now I could have written this as:

fetch_macro_argument
    : 
    (options { generateAmbigWarnings = false; } 
        : balanced_pars 
        | balanced_curlz
        | balanced_squares
        | tok:~COMMA
    )*
    ;

(Let me make a remark en passant: the most efficient code is actually 
generated for the following:

fetch_macro_argument
    : 
    (options { generateAmbigWarnings = false; } 
        : balanced_pars 
        | balanced_curlz
        | balanced_squares
        | { LA(1) != COMMA }? tok:.
    )*
    ;

But that's the subject of another discussion.)

So anyway, I have two variants to choose from, and they are both more 
verbose than I'd like. I'd like to propose defining the operator "||" 
(as opposed to "|") to combine "short-circuit" rules - rules that obey 
the "first wins" policy.

The notation is nicely consistent with the semantics of the "||" 
operator, where the first condition that's true stops evaluation. 
Also, the precedence would be lower than that of "|" - and that makes 
sense for the grammar, because most of the time you want to match some 
discriminating rules, followed by a more general one.
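
To make the "first wins" policy concrete, here is a small Python sketch of how one token of lookahead could select an alternative when the alternatives are tried in order. It is only a model of the idea, not ANTLR's generated code; the table and the predict function are hypothetical, and tokens are represented as plain strings.

```python
# Hypothetical model of "first wins" prediction on one lookahead token.
# Each entry is (alternative name, test on the lookahead token), in the
# order the alternatives appear in the rule.
ALTERNATIVES = [
    ("balanced_pars",    lambda tok: tok == "("),
    ("balanced_curlz",   lambda tok: tok == "{"),
    ("balanced_squares", lambda tok: tok == "["),
    ("tok",              lambda tok: tok != ","),  # general catch-all
]

def predict(lookahead):
    """Return the name of the first alternative that accepts lookahead.

    Returns None when no alternative accepts it, which here means the
    loop around the alternatives should exit.
    """
    for name, accepts in ALTERNATIVES:
        if accepts(lookahead):
            return name
    return None
```

Note that "(" is accepted by both the first and the last entry; because the entries are tried in order, predict("(") yields "balanced_pars" and the catch-all never needs to exclude the opening brackets explicitly.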

With that hypothetical operator I could write my rule simply as:

fetch_macro_argument
    : 
    (
        balanced_pars 
        | balanced_curlz
        | balanced_squares
        || tok:~COMMA
    )*
    ;

Whaddaya think?


Andrei


 
