[antlr-interest] Problem in grammar (#a, #b, #c, #d, #f are not well recognized)

Jim Idle jimi at temporal-wave.com
Fri Feb 25 12:58:40 PST 2011


Why don't you try the grammar I contributed on the ANTLR download page and see
if that works? The grammar you have here is not going to work because you have
hard-coded the tokens as literals in the grammar, and ANTLR is generating a
lexer that will fall over on keywords etc. CSS is a lot more difficult to parse
than you think.
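
To make the failure concrete, here is a rough Python model of the lexing
decision. It is not ANTLR itself and it is not taken from css21.g. It assumes
the generated lexer commits to COLOR as soon as '#' is followed by a hex
digit, which is how I would expect the rules in the quoted grammar to behave.
The rule tables, the SHARP and DOT token names, and the tokenize() helper are
made up for illustration; the HASH rule follows the CSS 2.1 token of the same
name, restricted to ASCII.

```python
import re

# Simplified stand-ins for the lexer rules in the quoted grammar.
# Rules are tried in order at each position, which approximates a lexer
# that picks COLOR whenever '#' is followed by a hex digit.
ORIGINAL_RULES = [
    ("COLOR", r"#[0-9a-fA-F]+"),
    ("SHARP", r"#"),                       # the literal '#' used in elem
    ("IDENT", r"-?[_a-zA-Z][_a-zA-Z0-9-]*"),
    ("DOT", r"\."),                        # the literal '.' used in elem
    ("WS", r"[ \t\r\n\f]+"),
]

# Assumed alternative: a single HASH token for '#' plus name characters,
# so the lexer never has to choose between a colour and an id selector.
HASH_RULES = [
    ("HASH", r"#[_a-zA-Z0-9-]+"),
    ("IDENT", r"-?[_a-zA-Z][_a-zA-Z0-9-]*"),
    ("DOT", r"\."),
    ("WS", r"[ \t\r\n\f]+"),
]


def tokenize(text, rules):
    """Return (token_type, lexeme) pairs for text, skipping whitespace."""
    compiled = [(name, re.compile(rx)) for name, rx in rules]
    tokens, pos = [], 0
    while pos < len(text):
        for name, rx in compiled:
            m = rx.match(text, pos)
            if m:
                if name != "WS":
                    tokens.append((name, m.group()))
                pos = m.end()
                break
        else:
            # No rule matched at this offset.
            raise ValueError(f"no rule matches at offset {pos}: {text[pos]!r}")
    return tokens
```

With ORIGINAL_RULES, "#formSearch" comes out as COLOR "#f" followed by IDENT
"ormSearch", so the parser never sees the '#' IDENT pair that elem expects,
while "#gormSearch" lexes as SHARP then IDENT. With HASH_RULES both selectors
are a single HASH token, and the parser can decide later whether a HASH is an
id selector or a colour.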

Microsoft were going to hire me to write some parsers for a project but
they cancelled the project before I was even hired. I had jumped the gun
and written the CSS parser to get a head start on the project, so I just
contributed the grammar. I did not do a lot of testing, but it is pretty
accurate I think.

http://www.antlr.org/grammar/1240941192304/css21.g

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Aurélien Baudet
> Sent: Friday, February 25, 2011 11:43 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Problem in grammar (#a, #b, #c, #d, #f are
> not well recognized)
>
>    Hello,
>
> I'm currently writing an Xtext plugin for CSS. I have a problem and I
> can't find a solution. My grammar works quite well for many CSS files
> but it fails on this:
>
> .dj_iPad #header #formSearch.disabled {
> 	opacity: 1;
> }
>
>
>
> However, it works for (g instead of f):
>
> .dj_iPad #header #gormSearch.disabled {
> 	opacity: 1;
> }
>
>
> It works for any first character other than a, b, c, d, e, f...
>
> So I think the parser recognizes that as a hex character.
>
> Can somebody help me fix this bug?
>
> The grammar:
>
> grammar css;
>
> options {
> 	output=AST;
> 	ASTLabelType=CommonTree;
> 	language=Java;
> 	//k=4;
> }
>
> tokens {
> 	IMPORT;
> 	NESTED;
> 	NEST;
> 	RULE;
> 	ATTRIB;
> 	PARENTOF;
> 	PRECEDEDS;
> 	ATTRIBEQUAL;
> 	HASVALUE;
> 	BEGINSWITH;
> 	PSEUDO;
> 	PROPERTY;
> 	FUNCTION;
> 	TAG;
> 	ID;
> 	CLASS;
> 	PERCENTAGE;
> 	UNIT;
> 	PERCENTAGE;
> 	EMS;
> 	EXS;
> 	LENGTH;
> 	ANGLE;
> 	TIME;
> 	FREQ;
> }
>
>
> stylesheet:
> 	charset?
> 	importRule*
> 	namespace*
> 	(ruleset | media | page | font_face | keyframes)+;
>
> charset:
> 	'@charset' STRING ';';
>
> namespace:
> 	'@namespace' IDENT? (STRING|url) ';';
>
> importRule:
> 	'@import' (STRING|url) (medias)? ';';
>
> medias:
> 	IDENT (',' IDENT)*;
>
> keyframes:
> 	'@keyframes' IDENT '{' keyframes_blocks* '}';
>
> keyframes_blocks:
> 	keyframes_selectors block;
>
> keyframes_selectors:
> 	('from' | 'to' | PERCENTAGE) (',' ('from' | 'to' | PERCENTAGE))*;
>
> media:
> 	'@media' medias '{' ruleset* '}';
>
> page:
> 	'@page' IDENT? (':' IDENT)? block;
>
> font_face:
> 	'@font-face' block;
>
> ruleset:
> 	selectors block;
>
> selectors:
> 	selector (',' selector)*;
>
> selector:
> 	simple_selector (selectop? simple_selector)*;
>
> simple_selector:
> 	(elem | '*') (attrib | pseudo)?;
>
> block:
> 	'{' properties* ';'? '}';
>
> properties:
> 	declaration (';' declaration)*;
>
> elem:
> 	IDENT
> 	| '#' IDENT
> 	| '.' IDENT;
>
> pseudo:
> 	(':' | '::') IDENT
> 	| (':' | '::') function;
>
> attrib:
> 	'[' IDENT (attribRelate (STRING | IDENT))? ']';
>
> declaration:
> 	IDENT ':' args '!important'?;
>
> args:
> 	expr (','? expr)*;
>
> expr:
> 	('-' | '+')? (NUM | PERCENTAGE | LENGTH | EMS | EXS | ANGLE | TIME | FREQ)
> 	| IDENT
> 	//| COLOR
> 	| STRING
> 	| URI
> 	| function;
>
> function:
> 	IDENT '(' args ')';
>
> // TODO: allow url(http://...)
> url:
> 	'url(' STRING ')';
>
> attribRelate:
> 	'='
> 	| '~='
> 	| '|=';
>
> selectop:
> 	'>'
> 	| '+';
>
>  URI:
> 	'url(' STRING ')'
> 	| 'url(' ('a'..'~')* ')';
>
>  PERCENTAGE:
> 	NUM '%';
>
>  EMS:
> 	NUM 'em';
>
>  EXS:
> 	NUM 'ex';
>
>  LENGTH:
> 	NUM ('px' | 'cm' | 'mm' | 'in' | 'pt' | 'pc');
>
>  ANGLE:
> 	NUM ('deg' | 'rad' | 'grad');
>
>  TIME:
> 	NUM ('ms' | 's');
>
>  FREQ:
> 	NUM ('khz' | 'hz');
>
>  IDENT:
> 	('_' | 'a'..'z' | 'A'..'Z') ('_' | '-' | 'a'..'z' | 'A'..'Z' | '0'..'9')*
> 	| '-' ('_' | 'a'..'z' | 'A'..'Z') ('_' | '-' | 'a'..'z' | 'A'..'Z' | '0'..'9')*;
>
>  NUM:
> 	(('0'..'9')* '.')? ('0'..'9')+;
>
>  COLOR:
> 	'#' ('0'..'9' | 'a'..'f' | 'A'..'F')+;
>
> STRING	:
> 			'"' ( '\\' ('b'|'t'|'n'|'f'|'r'|'"'|'\''|'\\') | ~('\\'|'"') )* '"' |
> 			'\'' ( '\\' ('b'|'t'|'n'|'f'|'r'|'"'|'\''|'\\') | ~('\\'|'\'') )* '\''
> 		;
>
> // Single-line comments
> SL_COMMENT
> 	:	'//'
> 		(~('\n'|'\r'))* ('\n'|'\r'('\n')?)
> 		{$channel=HIDDEN;}
> 	;
>
> // multiple-line comments
> COMMENT
> 	:	'/*' .* '*/' { $channel = HIDDEN; }
> 	;
>
> // Whitespace -- ignored
> WS	: ( ' ' | '\t' | '\r' | '\n' | '\f' )+ { $channel = HIDDEN; }
> 	;
>

