[antlr-interest] Antlr grammar for xml like grammar
Matt Palmer
mattpalms at gmail.com
Tue Aug 26 11:56:08 PDT 2008
Hi Ymo,
Again, I'm not an expert at this, but this grammar parses your input text:
grammar T;
tokens {
LG='\u00ab';
RG='\u00bb';
}
// parser
all : ( pi | code | text | comment )*;
pi : TOK_PI;
comment : TOK_COMMENT;
code : TOK_CODE;
text : TOK_TEXT;
// LEXER
TOK_PI : LG '@' RG;
TOK_COMMENT
: TOK_LCOMMENT ( options {greedy=false;} : . )* TOK_RCOMMENT;
TOK_TEXT : ( ~(LG|RG) )+;
TOK_CODE : LG ~'@' ( options {greedy=false;} : . )* RG;
fragment TOK_LCOMMENT
: LG '%--';
fragment TOK_RCOMMENT
: '--%' RG;
It's not quite right: the ~'@' in TOK_CODE is only the first (hacky) way I
could make the lexer distinguish between TOK_PI and TOK_CODE. It also means
that no code block can start with '@'. If you take the ~'@' out, the grammar
still works, but it will recognise TOK_PI as TOK_CODE. Adding a syntactic
predicate ( LG '@' RG )=> to TOK_PI did not help. So this isn't a solution,
but I hope it moves you towards one.
I've added some parser rules so you can see the parse tree in ANTLR.
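
For comparison, here is a minimal sketch in Python (not ANTLR) of the token
priority the grammar is aiming for: try the comment form first, then the
exact «@» processing instruction, then any other «...» block as code, and
finally plain text. The function name tokenize and the regular expressions
are illustrative assumptions, not part of the grammar above, and this says
nothing about how ANTLR's own lexer resolves the ambiguity.

```python
import re

LG, RG = "\u00ab", "\u00bb"  # the « and » delimiters from the tokens block

# Alternatives are tried left to right, so the more specific patterns
# (comment, then PI) must come before the catch-all code pattern.
TOKEN_RE = re.compile(
    "|".join([
        f"(?P<TOK_COMMENT>{LG}%--.*?--%{RG})",
        f"(?P<TOK_PI>{LG}@{RG})",
        f"(?P<TOK_CODE>{LG}.*?{RG})",
        f"(?P<TOK_TEXT>[^{LG}{RG}]+)",
    ]),
    re.DOTALL,  # let '.' cross newlines inside comments and code blocks
)


def tokenize(text):
    """Return a list of (token_type, matched_text) pairs (hypothetical helper)."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}: {text[pos]!r}")
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Because the alternatives are ordered, a block such as «@foo» still falls
through to TOK_CODE, so this sketch does not need the ~'@' restriction.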
Matt
On Tue, Aug 26, 2008 at 5:40 PM, Ymo <ymo.mail at gmail.com> wrote:
> Hi Matt, I appreciate you taking a look at this.
>
> I pasted the reduced input & grammar:
>
> The first line is never recognized as TOK_PI. It is always seen as
> TOK_CODE.
>
> Input is:
> «@»
> «fgdsfgs»
> «%-- comment --%»
>
> Then I reduced the grammar to this:
>
> tokens {
> LG='\u00ab';
> RG='\u00bb';
> }
>
>
> //LEXER
> TOK_PI : LG '@';
> TOK_LCOMMENT : '%-';
> TOK_RCOMMENT : '-%';
>
> TOK_BLOCK : { tagMode==false }? =>
> (LG TOK_LCOMMENT) => TOK_COMMENT { $type=TOK_COMMENT; } |
> (TOK_PI) => TOK_PI { $type=TOK_PI; } |
> (LG ) => TOK_CODE { $type=TOK_CODE; } |
> TOK_TEXT { $type=TOK_TEXT; } {
> };
>
> fragment
> TOK_TEXT :
> ( ~(LG|RG) )+ {
> };
>
> fragment
> TOK_CODE :
> LG ( options {k=2;greedy=false;} : . )* RG {
> };
>
> fragment
> TOK_COMMENT :
> LG TOK_LCOMMENT ( options {k=3;greedy=false;} : . )* TOK_RCOMMENT RG {
> $channel=HIDDEN;
> };