[antlr-interest] Get results of multiple tokens

Hugo Picado hugo.pcd at gmail.com
Wed Sep 2 15:05:30 PDT 2009


Hi,

One quick approach is divide and conquer, with a separate rule and action
for each part of the line:

line
 : property subtokenlist DPOINT attribute
 ;
property
 : TOKEN { System.out.println ("Property: " + $TOKEN.text); }
 ;
subtokenlist
 : (SEMI TOKEN { System.out.println("Subtoken: " + $TOKEN.text); } )*
 ;
attribute
 : TOKEN { System.out.println ("Attribute: " + $TOKEN.text); }
 ;

This also eliminates the need for the SUBTOKEN rule and solves the
semicolon problem: SEMI is matched as a separate token, so only the text of
each TOKEN is printed.
I can't test this right now, so I don't know whether it actually works,
but the idea is there :)
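
If you would rather keep everything in one rule, here is a minimal sketch
using a list label (+=), which in ANTLR 3 collects every token matched
inside the loop into a list. The label name "subs" is made up for
illustration, and the sketch assumes the SUBTOKEN lexer rule has been
removed so that ';' arrives as a SEMI token. It is just as untested as the
version above:

```
line
 : property=TOKEN (SEMI subs+=TOKEN)* DPOINT attribute=TOKEN
   {
     System.out.println("Property: " + $property.text);
     // $subs is a raw java.util.List of Token objects; it stays null
     // when the loop matched nothing, hence the null check.
     if ($subs != null) {
       for (Object o : $subs) {
         System.out.println("Subtoken: " + ((Token) o).getText());
       }
     }
     System.out.println("Attribute: " + $attribute.text);
   }
 ;
```

Because SEMI sits outside the label, the ';' never ends up in the list, so
the subtoken text should come out without the separator.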

Good luck,
Hugo.


On Wed, Sep 2, 2009 at 10:13 PM, Andreas Volz <lists at brachttal.net> wrote:

> Hello,
>
> I have this grammar file:
>
> grammar VCard;
>
> @members {
>    public static void main(String[] args) throws Exception {
>        VCardLexer lex = new VCardLexer(new ANTLRFileStream(args[0]));
>        CommonTokenStream tokens = new CommonTokenStream(lex);
>
>        VCardParser parser = new VCardParser(tokens);
>
>        try {
>            parser.line();
>        } catch (RecognitionException e)  {
>            e.printStackTrace();
>        }
>    }
> }
>
> line
>        : property=TOKEN subtoken=SUBTOKEN* DPOINT attribute=TOKEN
>        {
>                System.out.println ("Property: " + $property.text);
>                System.out.println ("Attribute: " + $attribute.text);
>                System.out.println ("Subtoken: " + $subtoken.text);
>
>        }
>        ;
>
> TOKEN
>        : (ALPHA | DIGIT)+
>        ;
>
> SUBTOKEN
>        : SEMI TOKEN
>        ;
>
> WS
>        : ('\n' | ' ' | '\t')+ {$channel=HIDDEN;}
>        ;
>
> fragment DIGIT
>        : '0'..'9'
>        ;
>
> fragment ALPHA
>        : 'a'..'z' | 'A'..'Z'
>        ;
>
> DPOINT
>        : ':'
>        ;
>
> SEMI
>        : ';'
>        ;
>
>
> And this input:
>
> a;b;c;2:3a3bcde
>
> This is the output:
>
> Property: a
> Attribute: 3a3bcde
> Subtoken: ;2
>
> What I would like to get is:
>
> Property: a
> Subtoken: b
> Subtoken: c
> Subtoken: 2
> Attribute: 3a3bcde
>
> I couldn't find in the docs how to access the multiple tokens matched
> by a * or + loop in a parser rule.
>
> A second question is how to exclude the ';' from the match.
>
> I have been trying for some time but can't find a way. Could someone
> give me a hint?
>
> regards
> Andreas