[antlr-interest] Again VCard Parser

Andreas Volz lists at brachttal.net
Fri Sep 25 14:48:42 PDT 2009


Hello,

I am still working on my VCard parser. This is the input data:

BEGIN:VCARD
VERSION:3.0
N:Mustermann;Max
FN:Max Mustermann
ORG:Wikipedia
URL:http://de.wikipedia.org/
EMAIL;TYPE=INTERNET:max.mustermann@example.org
TEL;TYPE=voice,work,pref:+49 1234 56788
ADR;TYPE=intl,work,postal,parcel:;;Musterstraße 1;Musterstadt;;12345;Germany
END:VCARD

This is my grammar:

vcard
	: 'BEGIN:VCARD' vcardcontent 'END:VCARD' ANY?
	;

vcardcontent
	: (property* attribute)*
	;
	
property
	: TOKEN { printf("Property: \%s\n", $TOKEN.text->chars);}
	;
	
attribute
	: ATTRIBUTE { printf("Attribute: \%s\n", $ATTRIBUTE.text->chars); }
	;
	
TOKEN
	: (ALPHA | DIGIT)+
	;

ATTRIBUTE
	: ':' ( options {greedy=false;} : . )* NEWLINE
	;

fragment DIGIT  	
	: '0'..'9'
	;
	
fragment ALPHA
	: 'a'..'z' | 'A'..'Z' |'@'|'.'| ' ' | '='| ','
	;

fragment NEWLINE
	: '\r' '\n'? | '\n' //{ $channel = HIDDEN; }
	;
// Always make this the very last lexer rule
ANY	
	: . { SKIP(); }
	;

This is the output:

Property: VERSION
Attribute: :3.0

Property: N
Attribute: :Mustermann;Max

Property: FN
Attribute: :Max Mustermann

Property: ORG
Attribute: :Wikipedia

Property: URL
Attribute: :http://de.wikipedia.org/

Property: EMAIL
Property: TYPE=INTERNET
Attribute: :max.mustermann@example.org

Property: TEL
Property: TYPE=voice,work,pref
Attribute: :+49 1234 56788

Property: ADR
Property: TYPE=intl,work,postal,parcel
Attribute: :;;Musterstraße 1;Musterstadt;;12345;Germany


This already works quite well, but I would also like to split the attribute
value at each ';'.

Everything I have tried so far leads to errors while parsing. Is it possible to
take the text matched by the non-greedy ATTRIBUTE rule and split it again with
another parser?
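
As a fallback I could leave the grammar alone and split the matched text by
hand inside the action. Below is a minimal sketch in plain C, not a tested
solution. It assumes the attribute text has first been copied into a writable
buffer, and the helper names attribute_value and split_fields are made up for
illustration; they are not part of ANTLR.

```c
#include <stddef.h>
#include <string.h>

/* Strip the leading ':' and the trailing line break that ATTRIBUTE matches.
 * Modifies the buffer in place and returns a pointer to the bare value. */
static char *attribute_value(char *text)
{
    if (*text == ':')
        ++text;
    text[strcspn(text, "\r\n")] = '\0';  /* cut at the first CR or LF, if any */
    return text;
}

/* Split `text` at every `sep`, overwriting each separator with '\0'.
 * Empty fields are kept; strtok() would silently drop them, which loses
 * the position of values such as ";;Street;City".
 * Stores at most `max_fields` pointers in `fields` and returns the total
 * number of fields found, which may be larger than `max_fields`. */
static size_t split_fields(char *text, char sep, char **fields, size_t max_fields)
{
    size_t count = 0;
    char *start = text;

    for (char *p = text; ; ++p) {
        if (*p == sep || *p == '\0') {
            int at_end = (*p == '\0');
            *p = '\0';                   /* terminate the current field */
            if (count < max_fields)
                fields[count] = start;
            ++count;
            if (at_end)
                break;
            start = p + 1;               /* next field starts after the separator */
        }
    }
    return count;
}
```

Called on a copy of the ADR attribute above, split_fields(attribute_value(buf),
';', fields, 8) should report seven fields, with the empty ones preserved.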

Any ideas on how to get this done?

regards
	Andreas

