[antlr-interest] finally stumbled on a solution but i still dont grok it !

Ymo ymo.mail at gmail.com
Tue Aug 26 21:25:40 PDT 2008


Hi All.

I was trying to match blocks of delimited text. The CODE token was always
taking precedence and matching all the tokens listed before it, for no
obvious reason. Now that I have put parentheses and an options block around
the alternatives of the LG_BLOCK rule, everything seems to work fine. I even
tried passing different values of k, and the value I choose seems to have no
bearing on the result. Any ideas why?

Thanks Matt for all your help !

input:
<%a>
<@b>
<ccc:d>
<aaaaaaaaa:bbbbbbbbbbbbbbb>
<%-comment-%>

grammar:

all     :    ( t1 | t2 | t3 | code | text )* ;
t1      :    T1 RG;
t2      :    T2 RG;
t3      :    T3 RG;
code    :    CODE;
text    :    TEXT;

//LEXER
LG : '<';
RG : '>';

LG_BLOCK:
    ( options {k=2;} : // <-- why do I need this? and why does k=1 still work?
        (LG '%-') => COMMENT { $type=COMMENT;} |
        (LG 'ccc:d') => T3 { $type=T3;} |
        (LG '%a') => T1 { $type=T1;} |
        (LG '@b') => T2 { $type=T2;} |
        // uncommenting the line below makes the lexer match '<%a' as CODE !!!
        // it basically overrides everything for no reason!
        (LG ) => CODE { $type=CODE;} |
        ( TEXT {$type=TEXT;})
    );

fragment T1 : LG '%a';
fragment T2 : LG '@b';
fragment T3 : LG 'ccc:d';

fragment TEXT: ( ~(LG|RG) )+;

fragment
CODE :
   LG (options {k=2;greedy=false;} : . )+  RG{
   };

fragment COMMENT :
   LG '%-' ( options {k=3;greedy=false;} : . )* '-%' RG {
      $channel=HIDDEN;
   };
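
For comparison, the tokenization this grammar is meant to produce can be
sketched outside ANTLR. The Python below is only an approximation under
stated assumptions: it uses ordered regular-expression alternatives (first
match wins) in the same order as the LG_BLOCK alternatives, which is not how
ANTLR predicts alternatives, and it does not model the HIDDEN channel. The
names TOKEN_SPEC, MASTER and tokenize are hypothetical helpers, not part of
ANTLR.

```python
import re

# Ordered alternatives in the same order as the LG_BLOCK rule.
# Assumption: "first alternative that matches wins", which is how Python's
# regex alternation behaves, not necessarily how ANTLR resolves the rule.
TOKEN_SPEC = [
    ("COMMENT", r"<%-.*?-%>"),  # LG '%-' ... '-%' RG, non-greedy body
    ("T3", r"<ccc:d"),          # LG 'ccc:d'
    ("T1", r"<%a"),             # LG '%a'
    ("T2", r"<@b"),             # LG '@b'
    ("CODE", r"<.+?>"),         # LG (.)+ RG, non-greedy body
    ("RG", r">"),
    ("TEXT", r"[^<>]+"),        # ( ~(LG|RG) )+
]

# DOTALL lets '.' cross newlines, like the '.' wildcard in the grammar.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)


def tokenize(text):
    """Return a list of (token_type, matched_text) pairs for text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"no token matches at offset {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    sample = "<%a>\n<@b>\n<ccc:d>\n<aaaaaaaaa:bbbbbbbbbbbbbbb>\n<%-comment-%>\n"
    for token_type, value in tokenize(sample):
        print(token_type, repr(value))
```

On the sample input this sketch yields T1 RG, T2 RG, T3 RG, CODE and COMMENT,
with a TEXT token for each newline in between, which is the result the
LG_BLOCK rule is expected to give.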