[antlr-interest] Context-sensitive lexer

Fri Jun 17 06:01:16 PDT 2011

I should have said that the infinite loop can be triggered by any input
that doesn't begin with 'TITLE'; for example, the input 'TEST' throws the
parser into an infinite loop. With the "typical" input that I gave in the
original post, the parser instead accepts all input after the title
section, which I also find very strange. I'd be grateful for any help.
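
To make the behaviour I expect concrete, here is a hand-written sketch in
plain Java. It is not ANTLR-generated code and says nothing about how ANTLR
works internally. The class name TitleLexerSketch, the ERROR token type and
the two helper methods are invented for this illustration, and REMARK is left
out for brevity. The part I care about is the fallback at the bottom of the
loop: when nothing matches in the current state, one character is consumed,
so the loop always makes progress.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the lexer in the grammar; all names are illustrative.
class TitleLexerSketch {
    enum State { AT_BEGINNING, IN_TITLE, AFTER_TITLE }

    // NL : '\r'? '\n' ; returns the index just past the newline, or -1 if none.
    static int matchNewline(String s, int pos) {
        if (pos < s.length() && s.charAt(pos) == '\r') pos++;
        return (pos < s.length() && s.charAt(pos) == '\n') ? pos + 1 : -1;
    }

    // WS_NL : (' ' | '\t')* NL ; same return convention as matchNewline.
    static int matchWsNl(String s, int pos) {
        while (pos < s.length() && (s.charAt(pos) == ' ' || s.charAt(pos) == '\t')) pos++;
        return matchNewline(s, pos);
    }

    // Returns the token type names in order; unmatched characters become ERROR.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        State state = State.AT_BEGINNING;
        int pos = 0;
        while (pos < input.length()) {
            int next = -1;
            String type = "ERROR";
            if (state == State.IN_TITLE) {
                next = matchNewline(input, pos);
                if (next >= 0) {
                    type = "END_TITLE";
                    state = State.AFTER_TITLE;
                } else {
                    // TITLE_TEXT: everything up to the next '\r' or '\n'.
                    next = pos;
                    while (next < input.length()
                            && input.charAt(next) != '\r'
                            && input.charAt(next) != '\n') {
                        next++;
                    }
                    if (next > pos) type = "TITLE_TEXT";
                }
            } else {
                if (state == State.AT_BEGINNING && input.startsWith("TITLE", pos)) {
                    next = matchWsNl(input, pos + 5);
                    if (next >= 0) {
                        type = "BEGIN_TITLE";
                        state = State.IN_TITLE;
                    }
                }
                if (next < 0) {
                    // BLANK_ROW is allowed in every state except IN_TITLE.
                    next = matchWsNl(input, pos);
                    if (next >= 0) type = "BLANK_ROW";
                }
            }
            // Fallback: nothing matched (or matched nothing), so skip one character.
            if (next <= pos) next = pos + 1;
            tokens.add(type);
            pos = next;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("TITLE\nsome title text\n\n"));
        System.out.println(tokenize("TEST"));
    }
}
```

For the first input this gives BEGIN_TITLE, TITLE_TEXT, END_TITLE, BLANK_ROW,
and for 'TEST' it gives four ERROR tokens and then stops, which is what I
would have expected from the generated lexer as well.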

Best Regards,
Jonas

On Jun 17, 2011 2:23pm, "Strobl, Robert"  
<Robert.Strobl at student.hpi.uni-potsdam.de> wrote:
> Have you tried enabling backtracking?
>
> Best regards,
> Robert
>
> On 17.06.2011 at 14:15, Jonas wrote:

> > Hi,
> >
> > I'm developing a parser for a file format where context is very
> > important. I'm looking to
> > 1) understand why my ANTLR parser gets into infinite loops
> > 2) find out if there is any better way to implement context
> > sensitivity than what I am doing with semantic predicates.
> >
> > A typical beginning of a file looks like this:
> > TITLE
> > some title text
> >
> > SECTION1
> > a=b*c
> > END
> >
> > SECTION2
> > ...
> >
> > SECTION3
> > ...
> >
> > The syntax differs from section to section: the 'TITLE' section is
> > terminated by the newline after the title text line, while other
> > sections can, e.g., use single-quoted string literals and be
> > terminated by a keyword like 'END'. Here is a sample grammar that
> > gets into an infinite loop:
> >

> > grammar test;
> >
> > options {
> >     output=AST;
> > }
> >
> > @lexer::members {
> >     static final int STATE_AT_BEGINNING = 0;
> >     static final int STATE_IN_TITLE = 1;
> >     static final int STATE_AFTER_TITLE = 2;
> >     int lexerState = STATE_AT_BEGINNING;
> > }
> >
> > file : title;
> >
> > title : BEGIN_TITLE TITLE_TEXT END_TITLE;
> >
> > BEGIN_TITLE
> >     : {(lexerState == STATE_AT_BEGINNING)}? 'TITLE' WS_NL
> >       {lexerState=STATE_IN_TITLE;}
> >     ;
> >
> > TITLE_TEXT
> >     : {lexerState == STATE_IN_TITLE}? TEXT
> >     ;
> >
> > END_TITLE
> >     : {lexerState == STATE_IN_TITLE}? NL {lexerState=STATE_AFTER_TITLE;}
> >     ;
> >
> > BLANK_ROW
> >     : {!(lexerState == STATE_IN_TITLE)}? WS_NL
> >     ;
> >
> > REMARK : {!(lexerState == STATE_IN_TITLE)}? 'REMA' .* NL
> >     ;
> >
> > fragment
> > WS_NL : (' ' | '\t')* NL;
> >
> > fragment
> > NL : '\r'? '\n';
> >
> > fragment
> > TEXT : (~('\r' | '\n'))*;
> >

> > Best Regards,

> > Jonas

> >

> > List: http://www.antlr.org/mailman/listinfo/antlr-interest

> > Unsubscribe:  
> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
