[antlr-interest] Parsing with inverse matches

David-Sarah Hopwood david-sarah at jacaranda.org
Sun Nov 22 23:45:04 PST 2009


Vipul Delwadia wrote:
> Hi,
>
> Suppose I have a very simple grammar:
>
> line:	x;
>
> x	:	STRING+;
>
> fragment BACKSLASH
> 	:	'\\';
>
> NOTA:	BACKSLASH A;
>
> A	:	'a';
>
> STRING
> 	:	(~(A)|NOTA)+;
>
> Now I want x to be able to match any sequence which doesn't have "a"
> in it, including sequences which have "\a".

A and NOTA should be fragment rules. There are other problems as
well: for example, (~(A)|NOTA) is ambiguous, because ~(A) also
matches '\\'. This should work (untested):

STRING : (~('a'|'\\') | ('\\' .))+ '\\'?;

Note that this will not accept a double backslash followed by 'a'.
(In most languages the double backslash would be an escaped
backslash, which leaves the 'a' unescaped, so it shouldn't be
allowed.) The trailing '\\'? permits an unterminated backslash at
the end of the token; you may or may not want to allow that.
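
To sanity-check the intended behaviour without running ANTLR, here
is a rough Python sketch that restates the STRING rule as a regular
expression. It is only an approximation: a backtracking regex is not
an ANTLR lexer, and the names STRING_RE, is_string and longest_prefix
are made up for this illustration.

```python
import re

# Regex analogue of the lexer rule
#   STRING : (~('a'|'\\') | ('\\' .))+ '\\'?;
# [^a\\]  any char except 'a' and backslash
# \\.     a backslash followed by any char (an escape such as \a)
# \\?     optional unterminated backslash at the very end
# DOTALL lets "." match newlines too, like ANTLR's wildcard.
STRING_RE = re.compile(r"(?:[^a\\]|\\.)+\\?", re.DOTALL)


def is_string(text):
    """Return True if the whole text would form a single STRING."""
    return STRING_RE.fullmatch(text) is not None


def longest_prefix(text):
    """Return the longest leading part of text that STRING accepts."""
    m = STRING_RE.match(text)
    return m.group(0) if m else ""


if __name__ == "__main__":
    # Raw strings, so each backslash below is a literal backslash.
    for sample in [r"bcd", r"b\ad", r"bad", r"\\a", "b\\"]:
        print(repr(sample), is_string(sample))
```

The case r"\\a" (two backslashes, then 'a') is rejected as a whole:
the first two characters are consumed as an escaped backslash, so
longest_prefix stops before the bare 'a'.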

-- 
David-Sarah Hopwood  ⚥  http://davidsarah.livejournal.com

