[antlr-interest] advice on creating a grammar for a given DSL example

Jim Idle jimi at temporal-wave.com
Mon Aug 16 11:13:01 PDT 2010


1) Remove your tokens section and define those keywords as real lexer rules,
   listed before the ID rule.
2) Remove any quoted 'literals' from your parser rules and turn them into real
   lexer rules and tokens as well.
3) Change your ANY rule to just: ANY : . { issue error message; skip(); } ;
4) Use {skip();} rather than HIDDEN unless you need the whitespace later for
   some reason.
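
As an illustration only, here is a rough pure-Python sketch (it is not ANTLR
output) that mimics how a lexer set up this way would behave: keyword rules
are tried before ID, whitespace is skipped, and a catch-all ANY rule reports
an error and skips the character. The token names STRING, NUMBER, LBRACE,
RBRACE and COLON and the function name tokenize are assumptions made for the
sketch; they are not part of the grammar quoted below.

```python
import re

# Token rules in priority order: keywords come before ID, whitespace is
# skipped, and ANY is the last resort. STRING, NUMBER, LBRACE, RBRACE and
# COLON are hypothetical names introduced for this sketch only.
RULES = [
    ("MODULE", r"module\b"),
    ("OBJECT", r"object\b"),
    ("VIEW",   r"view\b"),
    ("ID",     r"[A-Za-z_]+"),
    ("STRING", r'"[^"\r\n]*"'),
    ("NUMBER", r"[0-9]+(?:\.[0-9]+)?"),
    ("LBRACE", r"\{"),
    ("RBRACE", r"\}"),
    ("COLON",  r":"),
    ("WS",     r"[ \t\r\n\f]+"),
    ("ANY",    r"."),
]
MASTER = re.compile("|".join("(?P<%s>%s)" % rule for rule in RULES), re.DOTALL)


def tokenize(text):
    """Return (tokens, errors); tokens is a list of (rule name, text) pairs."""
    tokens, errors = [], []
    for m in MASTER.finditer(text):
        kind = m.lastgroup
        if kind == "WS":
            continue  # same effect as skip(): the parser never sees whitespace
        if kind == "ANY":
            # catch-all: report the stray character, then drop it
            errors.append("unexpected character %r at offset %d"
                          % (m.group(), m.start()))
            continue
        tokens.append((kind, m.group()))
    return tokens, errors
```

Calling tokenize('module { name:"m" }') yields a MODULE token for the keyword
and an ID token for name, which is the effect of listing the keyword rules
before ID. In the real grammar with the Python target, the skip action would
be written self.skip(), as in the commented-out WS rule quoted below.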

Then you have to look at where you are putting your NEWLINEs in the grammar.
It looks like you have too many specified and are expecting NEWLINE NEWLINE
at some point. You should try skipping newlines and see how ambiguous your
grammar is. If this is your own language, don't fall for the "I don't want
delimiters in my language" myth as it makes giving out informative error
messages nigh on impossible.
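
To give a feel for what skipping newlines buys, the following Python sketch
parses input shaped like the example file with no NEWLINE token at all. It is
a hand-written recursive-descent stand-in for the parser rules, not generated
ANTLR code, and it assumes the language is exactly what the example shows:
blocks of the form name { ... } containing key:value pairs and nested blocks.
All function names and the dictionary layout are hypothetical.

```python
import re

# Whitespace, newlines included, is skipped between tokens; '{', '}' and ':'
# are the only delimiters. All names here are hypothetical.
WS = re.compile(r"\s*")
TOKEN = re.compile(r'(?P<ID>[A-Za-z_]+)|(?P<STRING>"[^"\r\n]*")'
                   r'|(?P<NUMBER>[0-9]+(?:\.[0-9]+)?)|(?P<PUNCT>[{}:])')


def tokenize(text):
    out = []
    pos = WS.match(text, 0).end()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError("unexpected character %r at offset %d"
                              % (text[pos], pos))
        out.append((m.lastgroup, m.group()))
        pos = WS.match(text, m.end()).end()
    out.append(("EOF", ""))  # sentinel so the parser never runs off the end
    return out


def parse_block(tokens, i):
    """block : ID '{' ( ID ':' value | block )* '}' ; returns (node, next i)."""
    kind, name = tokens[i]
    if kind != "ID" or tokens[i + 1] != ("PUNCT", "{"):
        raise SyntaxError("expected '<name> {' at token %d" % i)
    i += 2
    fields, children = {}, []
    while tokens[i] != ("PUNCT", "}"):
        if tokens[i][0] != "ID":
            raise SyntaxError("expected a key, a block or '}' at token %d" % i)
        if tokens[i + 1] == ("PUNCT", ":"):
            # key:value pair; one token of lookahead past the ID decides this
            vkind, value = tokens[i + 2]
            if vkind not in ("STRING", "NUMBER", "ID"):
                raise SyntaxError("expected a value at token %d" % (i + 2))
            fields[tokens[i][1]] = value.strip('"')
            i += 3
        else:
            child, i = parse_block(tokens, i)
            children.append(child)
    return {"kind": name, "fields": fields, "children": children}, i + 1


def parse(text):
    tokens = tokenize(text)
    blocks, i = [], 0
    while tokens[i][0] != "EOF":
        block, i = parse_block(tokens, i)
        blocks.append(block)
    return blocks
```

Because the braces and colons already delimit every construct, the same input
parses identically whether it is spread over many lines or written on one
line. A language without such delimiters would not have this property.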

But more than all this, you need to run your grammar in the ANTLRWorks
debugger, which will allow you to trace out what is wrong with your grammar
specification.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of aryoo
> Sent: Monday, August 16, 2010 10:34 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] advice on creating a grammar for a given DSL
> example
> 
> Hello,
> 
> I am new to ANTLR, I own the books and was able to reproduce the hello-
> world and the book examples (Python style) successfully.
> 
> What I would love to do now is to
> 1) create my own grammar that could parse the file below.
> 2) generate some Python code that corresponds to the 'objects' and some
> XML code that corresponds to the 'views'
> 
> 
> My first attempt for part 1 is summarized below (grammar + error
> messages).
> 
> Any advice on what I did wrong or similar examples would be greatly
> appreciated.
> 
> Regards,
> Arye.
> 
> 
> 
> ****************************************example of file to parse********begin
> module {
>     name:"name_of_module"
>     version:1.0
> }
> 
> object {
>     module:"name_of_module"
>     name:"name_of_object1"
>     column {
>         name:"name_of_column1"
>         type:"type_of_col1"
>     }
>     column {
>         name:"name_of_column2"
>         type:"type_of_col2"
>     }
> }
> 
> 
> view {
>     object:"name_of_object1"
>     name:"name_of_view"
>     type:"type_of_view"
>     field {
>         column:"name_of_column1"
>     }
>     field {
>         column:"name_of_column2"
>     }
> }
> ****************************************example of file to parse********end
> 
> 
> 
> 
> 
> *******************************grammar*************begin
> grammar MyGrammar;
> 
> 
> tokens {
> 	MODULE 	= 'module' ;
> 	OBJECT	= 'object' ;
> 	VIEW	= 'view' ;
> 	MENU_ENTRY= 'menu_entry' ;
> }
> 
> /*------------------------------------------------------------------
>  * PARSER RULES
>  *------------------------------------------------------------------*/
> 
> prog	: ( stat {print $stat.text} )+ ;
> 
> 
> stat	:	module NEWLINE
>     |	object NEWLINE
>     |	view NEWLINE
>     |	NEWLINE
>     ;
> 
> module	: MODULE '{' module_expr+ '}';
> 
> module_expr :   'name' ':' ID NEWLINE
>     |   'version' ':' ANY NEWLINE
>     ;
> 
> 
> object	: OBJECT '{' object_expr+ '}';
> 
> object_expr :   'module' ':' ID NEWLINE
>     |   'name' ':' ID NEWLINE
>     ;
> 
> view	: VIEW '{' view_expr+ '}';
> 
> view_expr :   'object' ':' ID NEWLINE
>     |   'name' ':' ID NEWLINE
>     ;
> 
> 
> 
> /*------------------------------------------------------------------
>  * LEXER RULES
>  *------------------------------------------------------------------*/
> 
> //NUMBER	: (DIGIT)+ ;
> 
> //WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+ 	{ $channel = HIDDEN; } ;
> 
> //fragment DIGIT	: '0'..'9' ;
> 
> 
> ID	:	('a'..'z'|'A'..'Z')+ ;
> 
> INT	:	'0'..'9'+ ;
> 
> NEWLINE	:	'\r'? '\n' ;
> 
> //WS	:	(' '|'\t'|'\n'|'\r')+ {self.skip()} ;
> WS : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+ 	{ $channel = HIDDEN; } ;
> 
> ANY	:	('a'..'z'|'A'..'Z'|'0'..'9'|'.'|','|'_')+ ;
> *******************************grammar*************end
> 
> 
> 
> 
> 
> 
> *******************************output*************begin
> line 3:4 missing NEWLINE at u'version'
> line 6:0 missing NEWLINE at u'object'
> module {
>     name:nameofmodule
>     version:1.0
> }
> line 8:4 missing NEWLINE at u'name'
> line 8:9 extraneous input u'nameofobject1' expecting ID
> line 10:13 extraneous input u'nameofcolumn1' expecting ID
> line 13:4 mismatched input u'column' expecting NEWLINE
> object {
>     module:nameofmodule
>     name:nameofobject1
>     column {
>         name:nameofcolumn1
>         type:typeofcol1
>     }
>     column {
>         name:nameofcolumn2
>         type:typeofcol2
>     }
> }
> 
> 
> view {
>     object:nameofobject1
>     name:nameofview
>     type:typeofview
>     field {
>         column:nameofcolumn1
>     }
>     field {
>         column:nameofcolumn2
>     }
> }
> 
> 
> *******************************output*************end
> 