[antlr-interest] is it a bug, or am I too stupid?

Jim Idle jimi at temporal-wave.com
Sat Apr 28 15:12:03 PDT 2007


Try this:

 

grammar T;

tokens
{
    BULLET;
}

test : ( s=SEPARATOR { System.out.println("Sep is " + $s.text); }
       | b=BULLET
       )+
     ;

SEPARATOR : '*'
            ( '***' ( ~('\n' | '\r') )*
            | { $type = BULLET; }
            )
          ;

 

WS : ( '\r'? '\n' | ' ' | '\t' ) { $channel = HIDDEN; } ;

 

This assumes that blanks are not allowed between the '*' characters. If
they are, you can easily make a fragment rule for WS and allow WS*
between them, though you might then need a predicate. The code above
will cause ANTLR to falsely warn you that there is no lexer rule for the
token BULLET, but you can ignore that.
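
For readers who want to see the decision the SEPARATOR rule makes without
running ANTLR, here is a minimal hand-written sketch in Java. It is not the
code ANTLR generates. The class name StarLexerSketch, the Token type and the
OTHER token kind are assumptions made up for illustration (the grammar above
has no rule for ordinary characters), and a lone '\r' is skipped here as a
simplification.

```java
import java.util.ArrayList;
import java.util.List;

public class StarLexerSketch {
    // SEPARATOR and BULLET mirror the grammar; OTHER is a hypothetical
    // catch-all for characters the grammar above does not cover.
    enum Kind { SEPARATOR, BULLET, OTHER }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
        @Override public String toString() { return kind + "(" + text + ")"; }
    }

    public static List<Token> tokenize(String input) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '*') {
                if (input.startsWith("***", i + 1)) {
                    // First alternative: '*' '***' then everything up to
                    // (but not including) the end of the line.
                    int end = i + 4;
                    while (end < input.length()
                            && input.charAt(end) != '\n'
                            && input.charAt(end) != '\r') {
                        end++;
                    }
                    out.add(new Token(Kind.SEPARATOR, input.substring(i, end)));
                    i = end;
                } else {
                    // Empty alternative: a single '*' retyped as BULLET.
                    out.add(new Token(Kind.BULLET, "*"));
                    i++;
                }
            } else if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
                i++; // whitespace goes to the hidden channel; dropped here
            } else {
                out.add(new Token(Kind.OTHER, String.valueOf(c)));
                i++;
            }
        }
        return out;
    }
}
```

In this sketch every line that starts with four or more asterisks becomes a
single SEPARATOR token, including a line such as "****      4", so deeper
bullet levels would need extra handling if that is not what you want.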

 

Jim

 

 

From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Markus Kuhla
Sent: Saturday, April 28, 2007 11:16 AM
To: antlr-interest at antlr.org
Subject: Re: [antlr-interest] is it a bug, or am I too stupid?

 

Hi,

thanks for your reply! You're absolutely right concerning the newlines.
I fixed that, but it still doesn't work. The parser recognizes the
separator **** as a bullet_hierarchie, and this fails because there is
no text after the **** (text is required after the bullet *'s). But if
the parser gets an error, why is there no backtracking? I explicitly
told it to backtrack at the decision (separator | bullet_hierarchie). I
cannot enable backtracking for the whole grammar, because it's really
big. Could you please help me: how do I enable backtracking for this
decision, or what am I doing wrong with the grammar?

grammar ambg;

ASTERISK          :    '*';
NEWLINE           :    (('\r')? '\n' | '\r');
BLANKS            :    (' ' | '\t')+;
ELSE              :    .;


page              :    (page_element)+;

page_element options {backtrack=true; memoize=true;}
                  :    (separator | bullet_hierarchie);

separator         :    (BLANKS)? ASTERISK ASTERISK ASTERISK ASTERISK
(BLANKS)? NEWLINE;

bullet_hierarchie :    (bullet_item)+;

bullet_item       :    ASTERISK  content;

content           :    ASTERISK  content
                  |    text;

text              :    ~(NEWLINE | ASTERISK)+ NEWLINE;
    
newline           :    NEWLINE | EOF;



the input is:
*1
**2
*** 3
****      4
*****       5
******6
**7
*8
****



Thank you very much!!!!
Best, Markus

Miguel Ping wrote:

Hi there,

From what I can see, you are requiring a newline in these 3 rules:

page_element      :    (separator | bullet_hierarchie) newline;
separator         :    (BLANKS)? ASTERISK ASTERISK ASTERISK ASTERISK
(BLANKS)? NEWLINE; 
bullet_hierarchie :    (bullet_item  newline)+;

So when bullet_hierarchie ends, you require a newline before exiting
rule page_element. Try removing newline from rules separator and
bullet_hierarchie, so that page_element handles newlines and the other
rules handle only what matters. 

Btw, I take it you are using ANTLR v3. If so, you don't need to specify
the lookahead with the k=6 setting in the options.

Happy parsing,

Miguel Ping

 
