[antlr-interest] Match a repetition of characters

Jim Idle jimi at temporal-wave.com
Fri Jun 24 10:40:33 PDT 2011


Don't try to do this in the lexer or parser; you will just get syntax
errors that are difficult to interpret. You want to generate semantic
errors with more context. However, if you must distinguish 4 or more
from singles, you want to do something like this:

fragment UNDERSCORES : ;
UNDERSCORE: '_'
             (    ('___')=> '_'+ {$type = UNDERSCORES;}
                 |
              )
;

But this:

UNDERSCORES: '_'+;

Then

prule: UNDERSCORES { if (countem($UNDERSCORES) < 4) { /* report a semantic error */ } } ;


Is probably a better approach.
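
For illustration only, here is a rough Java sketch of what the helper behind
that action might look like. The names UnderscoreCheck, checkUnderscores and
MIN_RUN are made up for this example, and countem() here takes the token text
(what you would pass as $UNDERSCORES.text) rather than the token itself, so
that the sketch compiles without the ANTLR runtime. How you actually report
the error depends on your own error handling; collecting messages in a list
is just an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class UnderscoreCheck {
    // Hypothetical minimum run length, taken from the "4 or more" in the question.
    public static final int MIN_RUN = 4;

    // Collected semantic errors; a real parser would use its own reporting channel.
    public static final List<String> errors = new ArrayList<>();

    // Stand-in for countem(): counts the '_' characters in the token text.
    public static int countem(String tokenText) {
        int n = 0;
        for (int i = 0; i < tokenText.length(); i++) {
            if (tokenText.charAt(i) == '_') {
                n++;
            }
        }
        return n;
    }

    // Roughly what the action in prule would do: accept long runs,
    // record a semantic error with some context for short ones.
    public static boolean checkUnderscores(String tokenText, int line) {
        int n = countem(tokenText);
        if (n < MIN_RUN) {
            errors.add("line " + line + ": expected at least " + MIN_RUN
                    + " underscores but found " + n);
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        checkUnderscores("______", 1); // long enough, no error recorded
        checkUnderscores("__", 2);     // too short, one error recorded
        for (String e : errors) {
            System.out.println(e);
        }
    }
}
```

The point of doing it this way is that the message can say what was expected
and where, instead of the generic mismatched-token error the lexer or parser
would otherwise give you.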

Jim


> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Douglas Godfrey
> Sent: Friday, June 24, 2011 8:39 AM
> To: Robin
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Match a repetition of characters
>
> underline returns [char symbol]
>  : underlineAtom {$symbol=$underlineAtom.text} {$symbol}+ LINE_BREAK  ;
>
> underlineAtom
>  : ( UNDERSCORE UNDERSCORE UNDERSCORE UNDERSCORE+ )
>  | ( STAR STAR STAR STAR+ )
>  | ( PIPE PIPE PIPE PIPE+ )
>  | ( BACKTICK BACKTICK BACKTICK BACKTICK+ )
>  | ( COLUMN COLUMN COLUMN COLUMN+ )
>  | ( SPECIAL_CHAR SPECIAL_CHAR SPECIAL_CHAR SPECIAL_CHAR+ )
>  ;
>
>
>
> On Fri, Jun 24, 2011 at 6:02 AM, Robin <diabeteman at gmail.com> wrote:
>
> > Hello everyone,
> >
> > I'm trying to write a rule that matches the repetition (4 or more) of
> > the same special character
> >
> > For example:
> >
> > "^^^^^^^^^^^^^^^^^^^^"
> >
> > or
> >
> > "________________"
> >
> > I have these lexer rules :
> >
> > UNDERSCORE : '_';
> > BACKTICK : '`';
> > STAR : '*';
> > PIPE : '|';
> > COLUMN : ':';
> > SPECIAL_CHAR :
> > ('!'|'"'|'#'|'$'|'%'|'&'|'\''|'('|')'|'+'|','|'.'|'/'|';'|'<'|'='|'>'|
> > '?'|'@'|'['|'\\'|']'|'^'|'{'|'}'|'~');
> > LINE_BREAK : '\u000C'?'\r'?'\n';
> >
> > And I'd like to write a parser rule named "underline" that only
> > matches if this is a repetition of *the same character* and that
> > returns this character. So that enclosing rules can use it.
> >
> > For now I wrote this:
> >
> > underline returns [char symbol]
> >  : underlineAtom {$symbol=$underlineAtom.text} {$symbol}+ LINE_BREAK
> > ;
> >
> > underlineAtom
> >  : UNDERSCORE
> >  | STAR
> >  | PIPE
> >  | BACKTICK
> >  | COLUMN
> >  | SPECIAL_CHAR
> >  ;
> >
> > But my grammar does not compile...
> >
> > Can someone help me on this ? :)
> >
> > Thanks
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> >
>

