[antlr-interest] Again VCard Parser
Andreas Volz
lists at brachttal.net
Fri Sep 25 14:48:42 PDT 2009
Hello,
I am still working on my vCard parser. This is the input data:
BEGIN:VCARD
VERSION:3.0
N:Mustermann;Max
FN:Max Mustermann
ORG:Wikipedia
URL:http://de.wikipedia.org/
EMAIL;TYPE=INTERNET:max.mustermann at example.org
TEL;TYPE=voice,work,pref:+49 1234 56788
ADR;TYPE=intl,work,postal,parcel:;;Musterstraße 1;Musterstadt;;12345;Germany
END:VCARD
This is my grammar:
vcard
: 'BEGIN:VCARD' vcardcontent 'END:VCARD' ANY?
;
vcardcontent
: (property* attribute)*
;
property
: TOKEN { printf("Property: \%s\n", $TOKEN.text->chars);}
;
attribute
: ATTRIBUTE { printf("Attribute: \%s\n", $ATTRIBUTE.text->chars); }
;
TOKEN
: (ALPHA | DIGIT)+
;
ATTRIBUTE
: ':' ( options {greedy=false;} : . )* NEWLINE
;
fragment DIGIT
: '0'..'9'
;
fragment ALPHA
: 'a'..'z' | 'A'..'Z' |'@'|'.'| ' ' | '='| ','
;
fragment NEWLINE
: '\r' '\n'? | '\n' //{ $channel = HIDDEN; }
;
// Always make this the very last lexer rule
ANY
: . { SKIP(); }
;
This is the output:
Property: VERSION
Attribute: :3.0
Property: N
Attribute: :Mustermann;Max
Property: FN
Attribute: :Max Mustermann
Property: ORG
Attribute: :Wikipedia
Property: URL
Attribute: :http://de.wikipedia.org/
Property: EMAIL
Property: TYPE=INTERNET
Attribute: :max.mustermann at example.org
Property: TEL
Property: TYPE=voice,work,pref
Attribute: :+49 1234 56788
Property: ADR
Property: TYPE=intl,work,postal,parcel
Attribute: :;;Musterstraße 1;Musterstadt;;12345;Germany
This is already quite good, but I would also like to split the Attribute
at each ';'. Everything I have tried so far leads to errors while parsing.
Is it perhaps possible to take the text matched by the greedy=false
ATTRIBUTE rule and split it again with another parser?
Any ideas on how to get this done?
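For illustration, the kind of post-processing asked about here could be
sketched in plain C, outside the grammar. This is only a sketch under
assumptions: split_fields, MAX_FIELDS and MAX_FIELD_LEN are hypothetical
names, not part of the ANTLR C runtime, and the token text is assumed to
still carry the leading ':' and the trailing newline, as in the output above.
Empty fields are kept, since they are significant in ADR.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS    16   /* hypothetical limit on fields per attribute */
#define MAX_FIELD_LEN 128  /* hypothetical limit on one field's length   */

/* Hypothetical helper: split `value` at every `sep`, keeping empty fields.
 * A leading ':' is skipped and the scan stops at '\r', '\n' or end of string.
 * Fields longer than MAX_FIELD_LEN - 1 bytes are truncated.
 * Returns the number of fields stored in `fields`. */
static size_t split_fields(const char *value, char sep,
                           char fields[][MAX_FIELD_LEN], size_t max_fields)
{
    size_t count = 0;
    const char *start;
    char stops[4];

    assert(value != NULL);
    assert(sep != '\0');

    /* Characters that end a field: the separator or a line break. */
    stops[0] = sep;
    stops[1] = '\r';
    stops[2] = '\n';
    stops[3] = '\0';

    start = value;
    if (*start == ':')          /* drop the ':' matched by ATTRIBUTE */
        start++;

    while (count < max_fields) {
        size_t len  = strcspn(start, stops);
        size_t copy = len < MAX_FIELD_LEN - 1 ? len : MAX_FIELD_LEN - 1;

        memcpy(fields[count], start, copy);
        fields[count][copy] = '\0';
        count++;

        if (start[len] != sep)  /* line break or end of string reached */
            break;
        start += len + 1;       /* continue after the separator */
    }
    return count;
}
```

Inside the attribute rule's action, such a helper might be called as
split_fields((const char *)$ATTRIBUTE.text->chars, ';', fields, MAX_FIELDS),
assuming fields is declared as char fields[MAX_FIELDS][MAX_FIELD_LEN].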
regards
Andreas