[antlr-interest] Match anything until a specific phrase

Fri Jan 8 10:09:09 PST 2010

Why don't you just remove the header before sending it to the lexer? Or write a function/method that calls input.consume() until it finds a 'P', then checks for 'Page': stop consuming if it is found, carry on consuming if not. Trigger the method as appropriate in the action code for tokens or at lexer startup.
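
As a rough sketch of the first suggestion (stripping the header before the lexer ever sees it), something like the following plain-Java helper might do. The class and method names (HeaderStripper, skipToPage, stripHeader) are made up for illustration and are not part of ANTLR. It also assumes the word 'Page' never appears inside the header itself. The loop follows the same idea as the consume-until-'P' scan, just on a String instead of the lexer's input stream.

```java
class HeaderStripper {
    // Hypothetical helper, not an ANTLR API: returns the index of the first
    // occurrence of "Page" in the text, or text.length() if there is none.
    static int skipToPage(String text) {
        int i = 0;
        while (i < text.length()) {
            // Skip characters until a 'P' is seen, then check for "Page".
            if (text.charAt(i) == 'P' && text.startsWith("Page", i)) {
                return i; // stop here; "Page" itself is not consumed
            }
            i++; // not the marker, carry on
        }
        return i;
    }

    // Drops everything before the first "Page"; assumes the header
    // never contains that word.
    static String stripHeader(String text) {
        return text.substring(skipToPage(text));
    }

    public static void main(String[] args) {
        String input = "test This is a comment and\n"
                     + "        we are not interested in.\n\n"
                     + "Page 1:\n    123\n    456\n    789\n";
        System.out.print(stripHeader(input));
    }
}
```

The stripped string could then be wrapped in an ANTLRStringStream and handed to the generated lexer, in which case the grammar would no longer need a header rule at all.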

I would remove the 'literals' from your parser and make them real lexer rules. Remember that the lexer runs first and then the parser runs, so you cannot direct the lexer from the parser.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of fridi
> Sent: Friday, January 08, 2010 7:15 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Match anything until a specific phrase
> 
> Hello all,
> maybe someone can help me to get this done with ANTLR 3.2
> 
> My file has a header starting with 'test', some comments and then
> several blocks named 'Page 1', 'Page 2' etc. with integers, e.g.
> 
> test This is a comment and
>         we are not interested in.
>             Today is friday.
> 
> Page 1:
>     123
>     456
>     789
> 
> 
> I want to have a rule that consumes everything in the header up to the
> word 'Page'.
> 'Page' should not be consumed by the header; it should be consumed by
> another rule.
> 
> So I tried the following:
> 
> grammar TestNot;
> 
> options {
>    language = Java;
> }
> 
> rule :
>    file;
> 
> file :
>    header PAGE INT ':' INT+ EOF;
> 
> header :
>    'test' ~PAGE;
> 
> PAGE :
>    'Page';
> 
> INT :
>    DIGIT+;
> 
> fragment
> DIGIT :
>    '0'..'9';
> 
> 
> Any idea? Thanks in advance.
> 
> 