[antlr-interest] The unary not (~) vs. W3C EBNF dash operator

Mon Oct 8 13:36:52 PDT 2007

The W3C spec uses a bastardized extension of BNF that
supports pattern matching and comparison of matched
patterns (the '-' operator that you are concerned
with).  Processing it requires re-lexing the input;
ANTLR does not support that, nor does any other sane
lexer generator.  What you need to do instead is
match a string, then check in action code whether it
satisfies the "A - B" constraint and type the
resulting token accordingly.
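
As a rough illustration of that match-then-check idea, here is a
small Python sketch rather than ANTLR action code.  It uses the XML
spec's PITarget production, Name - (('X'|'x')('M'|'m')('L'|'l')),
as the "A - B" example.  The Name pattern is a simplified ASCII-only
approximation, and classify_name and the token type names PI_TARGET
and RESERVED_NAME are made up for this example.

```python
import re

# A: simplified, ASCII-only stand-in for the XML Name production
# (an assumption; the real production allows many more characters).
NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")

# B: the excluded pattern, (('X'|'x')('M'|'m')('L'|'l')).
RESERVED = re.compile(r"[Xx][Mm][Ll]")


def classify_name(text, pos=0):
    """Match A at pos, then decide the token type by testing B.

    Returns (token_type, lexeme), or None if A does not match.
    The token type names are hypothetical.
    """
    m = NAME.match(text, pos)
    if m is None:
        return None
    lexeme = m.group(0)
    # "A - B": the lexeme matched A; it only counts as a PITarget
    # if the whole lexeme does not also match B.
    if RESERVED.fullmatch(lexeme):
        return ("RESERVED_NAME", lexeme)
    return ("PI_TARGET", lexeme)
```

In an ANTLR lexer the same test would sit in the action attached to
the Name rule and reassign the token type there; the point is only
that B is checked against the text already matched by A, so nothing
has to be re-lexed.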

XML was not constructed to obey principles of formal
language theory and suffers as a result.  Some of us
vacillate between considering it a disease and
thinking that it has become sufficiently widespread to
be useful despite the ugliness of things built on top
of it.

--Loring

--- Andreas Ravnestad <andreas.ravnestad at gmail.com>
wrote:

> The W3C uses an operator in their EBNFs designated by a dash (-),
> and it is defined as follows (see [1]): A - B matches any string
> that matches A but does not match B.
> 
> For now, I have simply replaced the dashes with tilde in the ANTLR
> grammar, however this is not semantically correct. Is there a
> semantically equivalent operator in ANTLR, or is it necessary to
> rewrite grammar rules that use the dash operator?
> 
> - Andreas Ravnestad
> 
> [1] http://www.w3.org/TR/REC-xml/#sec-notation
> 
