[antlr-interest] please help on lexer rules antlr v3

Michiel Vermandel Michiel_Vermandel at axi.be
Fri Sep 29 00:34:02 PDT 2006


Hi John,

Thanks for your remarks.
I made the changes you suggested, but the "-- axi.reject" string is still 
consumed as an SL_COMMENT...
For the moment I kept the name SL_COMMENT just to make it easier to track 
the changes.
Can you please tell me why this modified grammar behaves in just the same 
way as the previous one?

Thanks!

------------------------------modified grammar----------------------------
grammar TestParser;
options {k=2; backtrack=true; memoize=true;}

statement: (directive )+ EOF;

directive: SL_COMMENT;


LF      :       '\n' { channel=99; };

CRLF    :       '\r' ('\n')? { channel=99; };

TAB     :       '\t' { channel=99; };

SPACE   :       ' ' { channel=99; };

fragment
ANYTHING_2_EOL: (~('\n'|'\r' ))* ('\n'|'\r'('\n')?);

//fragment
//DIRECTIVE: ('axi.locate' | 'axi.reject');

// Single-line comment
SL_COMMENT:  '--' ( 'axi.locate' | 'axi.reject' | ( ANYTHING_2_EOL { channel=99; }) );

WS      :       ( TAB | SPACE | CRLF | LF )+ { channel=99; };
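
One possible reading of the behaviour, offered as an assumption rather than 
something confirmed in this thread: the test string has a space after "--", 
while the modified SL_COMMENT lists 'axi.locate' and 'axi.reject' with no 
leading whitespace allowed, so only the ANYTHING_2_EOL alternative (hidden, 
channel 99) can match. The Python sketch below is a toy model of that choice, 
not ANTLR-generated code; the names ALTERNATIVES and classify are invented 
for the illustration.

```python
import re

# Toy model of the modified SL_COMMENT rule (an illustration, not ANTLR output).
# After the leading '--', the alternatives are tried in the order written.
ALTERNATIVES = [
    # 'axi.locate' | 'axi.reject' : no leading whitespace is allowed here
    ("directive", re.compile(r"axi\.locate|axi\.reject")),
    # ANYTHING_2_EOL : the alternative that is sent to the hidden channel
    ("comment", re.compile(r"[^\r\n]*(?:\n|\r\n?)")),
]

def classify(line):
    """Return which SL_COMMENT alternative matches the text after '--'."""
    if not line.startswith("--"):
        raise ValueError("not an SL_COMMENT")
    rest = line[2:]
    for name, pattern in ALTERNATIVES:
        if pattern.match(rest):
            return name
    raise ValueError("no alternative matches")
```

In this model, classify("-- axi.reject\r\n") returns "comment", while 
classify("--axi.reject\r\n") (no space) returns "directive". If that is what 
happens in the real lexer, allowing optional spaces or tabs before the 
keywords, as the earlier DIRECTIVE fragment did, would be one thing to try.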







Pardon me for butting in...

Your grammar is incorrect but also exposes a bug in ANTLR (IMHO). See below.

>...snipped...
>My test string is:
>
>-- axi.reject
>
>followed by a single CRLF
>
>Any idea why the   -- axi.reject  is still consumed as  a comment (on 
>channel 99) ?

This is not what I see...

>...snipped...
>---------------------------- grammar --------------------------
>
>grammar TestParser;
>options {k=2; backtrack=true; memoize=true;}
>
>statement: (directive )+ EOF;
>
>directive: DIRECTIVE;
>
>
>LF      :       '\n' { channel=99; };
>
>CRLF    :       '\r' ('\n')? { channel=99; };
>
>TAB     :       '\t' { channel=99; };
>
>SPACE   :       ' ' { channel=99; };
>
>fragment
>ANYTHING_2_EOL: (~('\n'|'\r' ))* ('\n'|'\r'('\n')?);
>
>fragment
>DIRECTIVE: (SPACE | TAB )* ('axi.locate' | 'axi.reject');
>
>// Single-line comment
>SL_COMMENT:  '--' ( DIRECTIVE | ( ANYTHING_2_EOL { channel=99;} ) );
>
>WS      :       ( TAB | SPACE | CRLF | LF )+ { channel=99; };
>
>
>---------------------------------------------------------------------------

Actually, when I run your grammar, I get a parsing error indicating that no
match was found for this text: -- axi.reject

In your directive rule you use the token DIRECTIVE, yet the lexer rule
DIRECTIVE is marked as a fragment (i.e. not visible to the parser).

Therefore the directive parser rule will never be matched: a DIRECTIVE
token is never emitted by the lexer, because you have told ANTLR that it is
a fragment of some other token.
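
To make the point about fragments concrete, here is a minimal Python sketch. 
It is a toy model of the quoted lexer rules, not ANTLR-generated code: the 
names TOKEN_RULES, tokenize and matches_directive_rule are made up for the 
illustration, and channels are simplified so that only WS is treated as 
hidden. The fragment patterns are reused inside SL_COMMENT but never get a 
token type of their own.

```python
import re

# Fragments: reusable sub-patterns with no token type of their own.
DIRECTIVE = r"[ \t]*(?:axi\.locate|axi\.reject)"
ANYTHING_2_EOL = r"[^\r\n]*(?:\n|\r\n?)"

# Only non-fragment rules can produce tokens: (token type, pattern).
TOKEN_RULES = [
    ("SL_COMMENT", re.compile(r"--(?:" + DIRECTIVE + "|" + ANYTHING_2_EOL + ")")),
    ("WS", re.compile(r"[ \t\r\n]+")),
]

def tokenize(text):
    """Return the list of token types the toy lexer emits for text."""
    types = []
    pos = 0
    while pos < len(text):
        for name, pattern in TOKEN_RULES:
            m = pattern.match(text, pos)
            if m:
                types.append(name)
                pos = m.end()
                break
        else:
            raise ValueError(f"no lexer rule matches at offset {pos}")
    return types

def matches_directive_rule(text, expected_type):
    """Toy 'statement: (directive)+ EOF' where directive is a single token type.

    WS tokens are skipped, as if they were on a hidden channel.
    """
    visible = [t for t in tokenize(text) if t != "WS"]
    return bool(visible) and all(t == expected_type for t in visible)
```

With the test string "-- axi.reject\r\n", tokenize yields SL_COMMENT followed 
by WS. A rule that expects DIRECTIVE therefore never matches, while one that 
expects SL_COMMENT does.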

Change the directive rule to:

directive: SL_COMMENT; /*and pick better name*/

and things work much better... ;-)



Question for Dr. Parr - shouldn't ANTLR disallow parser references to lexer
fragments?



Hope this helps...
   -jbb
