[antlr-interest] Context-sensitive lexer

Strobl, Robert Robert.Strobl at student.hpi.uni-potsdam.de
Fri Jun 17 05:23:05 PDT 2011


Have you tried enabling backtracking?
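
In ANTLR 3, backtracking is switched on as a grammar-level option. A minimal sketch, assuming ANTLR 3.x and the options block from the grammar quoted below; memoize=true is an optional addition of mine that is commonly paired with backtracking, and I have not checked whether either option changes the lexer behaviour you describe:

```antlr
grammar test;

options {
  output=AST;
  backtrack=true;  // let ANTLR try alternatives speculatively when prediction is ambiguous
  memoize=true;    // assumption: cache speculative rule results to limit the cost of backtracking
}
```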

Best regards,
Robert

On 17.06.2011 at 14:15, Jonas wrote:

> Hi,
> 
> I'm developing a parser for a file format where context is very
> important. I'm looking to
> 1) understand why my ANTLR parser gets into infinite loops
> 2) find out if there is any better way to implement context
> sensitivity than what I am doing with semantic predicates.
> 
> A typical beginning of a file looks like this:
> TITLE
> some title text
> 
> SECTION1
> a=b*c
> END
> 
> SECTION2
> ...
> 
> SECTION3
> ...
> 
> The syntax differs from section to section: the 'TITLE' section is
> terminated by the newline after the title text line, while other
> sections can, for example, use single-quoted string literals and be
> terminated by a keyword like 'END'. Here is a sample grammar that gets
> into an infinite loop:
> 
> grammar test;
> 
> options {
>  output=AST;
> }
> 
> @lexer::members {
>  static final int STATE_AT_BEGINNING = 0;
>  static final int STATE_IN_TITLE = 1;
>  static final int STATE_AFTER_TITLE = 2;
>  int lexerState = STATE_AT_BEGINNING;
> }
> 
> file 	:	title;
> 
> title	:	BEGIN_TITLE TITLE_TEXT END_TITLE;
> 
> BEGIN_TITLE
> 	: {(lexerState == STATE_AT_BEGINNING)}? 'TITLE' WS_NL
> 	  {lexerState=STATE_IN_TITLE;}
> 	;
> 	
> TITLE_TEXT
> 	: {lexerState == STATE_IN_TITLE}? TEXT
> 	;
> 	
> END_TITLE
> 	: {lexerState == STATE_IN_TITLE}? NL {lexerState=STATE_AFTER_TITLE;}
> 	;
> 	
> BLANK_ROW
> 	: {!(lexerState == STATE_IN_TITLE)}? WS_NL
> 	;
> 	
> REMARK	: {!(lexerState == STATE_IN_TITLE)}? 'REMA' .* NL
> 	;
> 	
> fragment
> WS_NL	:	(' ' | '\t')* NL;
> 
> fragment
> NL	:	'\r'? '\n';
> 
> fragment
> TEXT	:	(~('\r' | '\n'))*;
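
One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: TEXT uses '*', so TITLE_TEXT can match the empty string. A lexer rule that succeeds without consuming any input is a common cause of non-terminating ANTLR 3 lexers, because the same input position can keep producing zero-length tokens. A sketch of the change, where making TITLE_TEXT optional in the parser rule is my own hypothetical addition so that an empty title line would still parse:

```antlr
// Parser: the title text becomes optional so an empty title line is still accepted.
title	:	BEGIN_TITLE TITLE_TEXT? END_TITLE;

// Lexer: require at least one character so TITLE_TEXT never matches empty input.
fragment
TEXT	:	(~('\r' | '\n'))+;
```
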
> 
> Best Regards,
> Jonas
> 


