[antlr-interest] How to specify ‘any non-control symbol’?

Hendrik Maryns qwizv9b02 at sneakemail.com
Thu Oct 30 10:44:40 PDT 2008


Jim Idle wrote:
> On Thu, 2008-10-30 at 16:10 +0100, Hendrik Maryns wrote:
>> Hendrik Maryns wrote:
>> > Johannes Luber wrote:
> 
>> Why doesn’t
>>
>> LABEL : ~(WHITESPACE | '(' | ')')+ ;
>>
> 
> Your WHITESPACE rule probably defines a sequence rather than a set.

Nope, but it does set $channel = HIDDEN; maybe that is the problem?

WHITESPACE : (' '|'\r'|'\t'|'\u000C'|'\n') {$channel=HIDDEN;} ;

> Create a:
> 
> fragment WSCHARS : (' ' | '\t' | '\f');
> 
> Then:
> 
> LABEL : ( ~ (WSCHARS | '(' | ')' ) )+;
> 
> Will probably work, though you are probably going to create huge lexer
> tables (which we are hoping to do something about shortly).

This does not work either: it parses O)) successfully, but I do not
understand why.
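
As a rough sanity check of what the negated set should match on its own,
here is a minimal sketch that uses a Python regular expression as a stand-in
for the LABEL rule. The names LABEL and label_prefix are hypothetical, and a
regex is of course not ANTLR's lexer; the sketch only shows that the
character set itself cannot consume a ')'. If that is right, the O part
would be one LABEL and the two closing parentheses would have to be matched
by something else in the grammar (an assumption, since the rest of the
grammar is not shown here).

```python
import re

# Hypothetical stand-in for: LABEL : ( ~(WSCHARS | '(' | ')') )+ ;
# with WSCHARS = ' ' | '\t' | '\f'. This is a Python regex, not ANTLR.
LABEL = re.compile(r"[^ \t\f()]+")

def label_prefix(text):
    """Return the longest LABEL-like prefix of text, or '' if there is none."""
    m = LABEL.match(text)
    return m.group(0) if m else ""

# The greedy match stops in front of the first ')'.
print(repr(label_prefix("O))")))
# The whole input is not a single LABEL.
print(LABEL.fullmatch("O))") is None)
```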

Even substituting the whitespace characters explicitly doesn’t work.

Similarly,

LABEL :  ( '!'..'\'' | '*'..'\uffff' )+ ;

accepts the input O)) as well.  This seems like a bug to me.  Oh, and
speaking of bugs: ANTLR doesn’t allow characters like ’ and other
non-ASCII Unicode in a grammar, not even in a comment!
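
To double-check that the two ranges really leave out both parentheses, here
is a small sketch that compares code points directly. RANGES and
in_label_set are hypothetical names that mirror the rule above; nothing here
calls ANTLR. It relies on '!' being U+0021, the apostrophe U+0027,
'(' U+0028, ')' U+0029 and '*' U+002A.

```python
# Hypothetical mirror of: LABEL : ( '!'..'\'' | '*'..'\uffff' )+ ;
RANGES = [("!", "'"), ("*", "\uffff")]

def in_label_set(ch):
    """True if the single character ch falls in one of the two ranges."""
    return any(lo <= ch <= hi for lo, hi in RANGES)

# Characters below U+0080 that the ranges do NOT cover:
# everything up to and including the space, plus '(' and ')'.
excluded = [chr(cp) for cp in range(0x80) if not in_label_set(chr(cp))]

print(in_label_set("O"), in_label_set("("), in_label_set(")"))
print(excluded[-2:])
```

So on paper neither parenthesis belongs to the set, which is why accepting
O)) as a single LABEL looks wrong.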

H.
-- 
Hendrik Maryns
http://tcl.sfs.uni-tuebingen.de/~hendrik/
==================
Ask smart questions, get good answers:
http://www.catb.org/~esr/faqs/smart-questions.html
