[antlr-interest] terminology: "protected"

Thu Jan 12 04:53:01 PST 2006

> There's a big difference 
> indeed between expanding a macro and evaluating an expression.  If you 
> splice a rule (expand a macro), you first construct the text by merging 
> (a purely syntactic operation, no matching/eval/etc.), and the result is 
> another piece of grammar to be matched.  But if you delegate, you first 
> do the matches (in a new local scope), then tokenize the result (bind it 
> to a single token) which is inserted.  So the components are *not* 
> inserted into the syntax.
> 
> In a word, you end up with different token structures depending on 
> whether you splice or delegate.  At least that's the idea.

Just an example:
FOO: "abc" BAR;
BAR: "def";

vs.

FOO: "abc" BAR;
protected BAR: "def";

vs.

FOO: "abc" "def";
(protected) BAR: "def";

All of these will currently result in a single FOO token containing
"abcdef" on input "abcdef". There is no observable difference to the
user, except for non-determinism problems if something other than BAR
can match "def".
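
A minimal sketch of the claim above, assuming a toy lexer model written in
Python rather than ANTLR's generated code. The names match, tokenize and the
three grammar tables are made up for illustration, and "longest match wins"
is a simplification of how ANTLR actually predicts which rule to enter:

```python
# Toy model of an ANTLR 2 style lexer, not ANTLR's real implementation.
# A rule is (is_protected, elements); an element is a literal string
# or ("ref", rule_name) for a call to another lexer rule.

def match(rules, name, text, pos):
    """Return the position after rule `name` matches at `pos`, else None."""
    for element in rules[name][1]:
        if isinstance(element, tuple):       # rule reference: call it like a method
            pos = match(rules, element[1], text, pos)
            if pos is None:
                return None
        elif text.startswith(element, pos):  # literal
            pos += len(element)
        else:
            return None
    return pos

def tokenize(rules, text):
    """Emit (rule_name, matched_text) pairs; only non-protected rules emit tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_end = None, pos
        for name, (is_protected, _) in rules.items():
            if is_protected:
                continue
            end = match(rules, name, text, pos)
            if end is not None and end > best_end:  # simplification: longest match wins
                best_name, best_end = name, end
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:best_end]))
        pos = best_end
    return tokens

# The three variants from the example, in the same order.
BAR_PUBLIC    = {"FOO": (False, ["abc", ("ref", "BAR")]), "BAR": (False, ["def"])}
BAR_PROTECTED = {"FOO": (False, ["abc", ("ref", "BAR")]), "BAR": (True,  ["def"])}
BAR_INLINED   = {"FOO": (False, ["abc", "def"]),          "BAR": (True,  ["def"])}

for grammar in (BAR_PUBLIC, BAR_PROTECTED, BAR_INLINED):
    print(tokenize(grammar, "abcdef"))  # each prints [('FOO', 'abcdef')]
```

In this model the reference to BAR only advances the input position inside
FOO; no separate BAR token is ever created, whether BAR is protected or not.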

> But if you delegate, you first 
> do the matches (in a new local scope), then tokenize the result (bind it 
> to a single token) which is inserted.  So the components are *not* 
> inserted into the syntax.

As far as I know, if you "delegate", i.e. call a rule that is not
protected, the call does not return a Token instead of a String or
anything like that. There is also no difference in the (Java)
implementation, but that should not matter to the user anyway.

I'm not sure what you're referring to with local scope, but if you mean
that a "spliced" rule should be able to access stuff from the scope of
the "calling" rule, then this is -sorry- pure madness. Macros are evil! A
single "," operator that changes the semantics of scope access is just
completely confusing and does not have any reasonable use. You actually
have (protected/internal/sub)rules so you can separate your code, not so
you can mix it all together and have scope accesses from the other end
of the source file.

> > I'd propose to call those rules "internal" - as stated, they cannot be
> > directly accessed from outside of the Lexer (in the rule matching
> > meaning) and "internal" also expresses the Java protected behaviour.
> 
> Internal/external, hidden/exposed, etc. - that's all distinct from the 
> core issue of splicing v. tokenizing, no?

Well, I'm arguing that there is no splicing issue, just an
internal/external issue.

>   Which might be construed more 
> usefully as syntactic v. semantic splicing.  I suppose one might argue 
> that the internal/external distinction is itself an irrelevant 
> implementation detail - what counts is tokenizing.

Internal vs. External has a huge impact, e.g.
FOO: "abc" BAR;
BAR: "def";
BAZ: "def";
does not work, as BAR and BAZ are identical. Add "protected" or
"internal" or whatever to BAR, and it will work. The example is a bit
contrived, but you get the idea.
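
A rough sketch of that conflict, again assuming a toy Python model and not
ANTLR's own grammar analysis (ANTLR reports the problem when it analyses the
grammar; this sketch only checks one sample input). The names match and
candidates and both grammar tables are hypothetical:

```python
# Toy model, not ANTLR itself: a rule is (is_protected, elements) where an
# element is a literal string or ("ref", rule_name).

def match(rules, name, text, pos=0):
    """Return the position after rule `name` matches at `pos`, else None."""
    for element in rules[name][1]:
        if isinstance(element, tuple):       # reference to another rule
            pos = match(rules, element[1], text, pos)
            if pos is None:
                return None
        elif text.startswith(element, pos):  # literal
            pos += len(element)
        else:
            return None
    return pos

def candidates(rules, text):
    """Non-protected rules that could produce a token at the start of `text`."""
    return [name for name, (is_protected, _) in rules.items()
            if not is_protected and match(rules, name, text) is not None]

ALL_PUBLIC = {
    "FOO": (False, ["abc", ("ref", "BAR")]),
    "BAR": (False, ["def"]),
    "BAZ": (False, ["def"]),
}
# Same grammar, but BAR is marked protected (internal).
BAR_INTERNAL = dict(ALL_PUBLIC, BAR=(True, ["def"]))

print(candidates(ALL_PUBLIC, "def"))    # two rules compete for the same input
print(candidates(BAR_INTERNAL, "def"))  # only BAZ is left as a token rule
```

Marking BAR as protected removes it from the set of rules that can start a
token, while FOO can still call it as a subrule.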

Regards,
Martin

PS: Can you use another word for what you call "tokenize"? I think it's
reserved for the process of splitting an input up into several Tokens,
i.e. what the Lexer does.