[antlr-interest] Antlr 3.2 vs. Bison 2.4.2+Flex 2.5.35 Speed/Memory
Jim Idle
jimi at temporal-wave.com
Fri May 21 22:00:37 PDT 2010
Well, you're comparing apples to cheese here. Bison/Flex do not create complicated tokens with method calls and so on, so ANTLR is winning here in reality. That said, I am going to implementing some of this stuff with a more fly weight pattern in v4. However, it sounds like you need to implement your own token stream that discards the tokens at certain points.
Jim
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Bob
> Sent: Friday, May 21, 2010 7:47 PM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Antlr 3.2 vs. Bison 2.4.2+Flex 2.5.35
> Speed/Memory
>
> A tiny grammar was implemented in both Antlr and Bison+Flex (shown
> below).
>
> Test files repeating two lines (shown below) were made in 6 different
>
> sizes.
>
> One executable compiled with command line switch choosing either
>
> Antlr or Bison+Flex.
>
> One run with empty actions, one run with actions populated, to compare
>
> pure parsing with some actual work.
>
>
>
> Results:
>
>
> CPU time Peak Memory
>
> File Name File Size # modules #tokens Bison Antlr Bison Antlr
>
> Action bodies empty:
>
> source.v10m 460mb 10m 150m 28s 572k *
>
> source.v5m 230mb 5m 75m 15s 572k
> *
>
> source.v2.5m 115mb 2.5m 37m 7s 572k *
>
> source.v1m 46mb 1m 15m 2s
> 572k *
>
> source.v500k 23mb 500k 7.5m 1s 572k
> *
>
> source.v250k 11mb 250k 3.7m <1s 4s 572k
> 1.7g
> <-----------
>
> Action bodies populated:
>
> source.v250k 11mb 250k 3.7m 9s 13s 477m
> 1.7g
> <-----------
>
>
>
> * Antlr ran out of memory at 2gb
>
>
>
> Comments:
>
>
>
> 1. I expected the requirement that the entire file be resident in
> memory
>
> to be the memory glut. Surprise! Quick inspection suggests an
> initial
>
> tokenizing of the entire in-memory file consumes gobbs of memory,
> pushing
>
> a small footprint up to 1.7gb before releasing it. Only the smallest
>
> test file was under the runable 32 bit 2gb limit. Please fix!!
>
>
>
> 2. Speed is clearly slower than bison+flex, however empty actions don't
> make
>
> interesting programs. The test with actions enabled shows a 9s vs.
> 13s
>
> difference, considerable less than the empty action case.
>
>
>
> 3. If you've never setup bison+flex I have only one comment: !#@%$#.
> Two
>
> thumbs up for Antlr.
>
>
>
> Details:
>
>
>
> Vista 64, AMD opteron 2.4Ghz, 16gb ram
>
> Visual Studio 2008 Sp1
>
> One exe file with both Antlr and Bison+Flex, targeting 32 bit
>
> Full Optimization (/Ox), Inline Any suitable (/Ob2), Favor Small Code
> (/Os)
>
> Versions:
>
> Antlr 3.2
>
> Bison 2.4.2 LR(1)
>
> Flex 2.5.35
>
>
>
>
>
> ------------------- Input file -----------------------------
>
> module tiptop #(int p1=3, p2=4 );
>
> endmodule
>
> ... repeat to the indicated number of modules ...
>
> ------------------- Antlr Grammar --------------------------
>
> source_text : description ( description )*
>
> ;
>
> description : module_declaration
>
> ;
>
> module_declaration : module_ansi_header ENDMODULE ( ':'
> module_identifier )?
>
> { act_module(); }
>
> ;
>
> module_ansi_header : MODULE_KEYWORD module_identifier (
> parameter_port_list
> )? ';'
>
> ;
>
> module_identifier : identifier
>
> ;
>
> parameter_port_list
>
> : '#' '(' parameter_port_declaration ( ','
> parameter_port_declaration )*
> ')'
>
> | '#' '(' ')'
>
> ;
>
> parameter_port_declaration returns [void* node]
>
> scope {
>
> void* type;
>
> void* head;
>
> void* tail;
>
> }
>
> : data_type
>
> { $parameter_port_declaration::type = $data_type.node;
>
> $parameter_port_declaration::head=NULL; }
> list_of_param_assignments
>
> { $node = $parameter_port_declaration::head; }
>
> ;
>
> list_of_param_assignments
>
> : param_assignment ( ',' param_assignment )*
>
> ;
>
> param_assignment
>
> : parameter_identifier '=' constant_param_expression
>
> { act_param_assignment
>
> (
>
> & $parameter_port_declaration::head,
>
> & $parameter_port_declaration::tail,
>
> $parameter_identifier.node,
>
> $parameter_port_declaration::type,
>
> $constant_param_expression.node
>
> );
>
> }
>
> ;
>
> constant_param_expression returns [void* node]
>
> : constant_mintypmax_expression
>
> { $node = $constant_mintypmax_expression.node; }
>
> // | '$'
>
> ;
>
> constant_mintypmax_expression returns [void* node]
>
> : constant_expression
>
> { $node = $constant_expression.node; }
>
> ;
>
> // Deviate from LRM
>
> constant_expression returns [void* node]
>
> : expr { $node = $expr.node; }
>
> ;
>
> parameter_identifier returns [void* node]
>
> : identifier { $node = $identifier.node; }
>
> ;
>
> data_type returns [void* node]
>
> : integer_atom_type signing
>
> {$node=act_type($integer_atom_type.value,$signing.value);}
>
> | integer_atom_type
>
> {$node=act_type($integer_atom_type.value,-1);}
>
> ;
>
>
>
>
>
> expr returns [void* node] : NUMBER
>
> { $node = act_number( $NUMBER.text->chars ); }
>
> ;
>
>
>
> identifier returns [void* node] : SIMPLE_IDENTIFIER
>
> { $node = act_identifier( $SIMPLE_IDENTIFIER.text->chars ); }
>
> ;
>
>
>
> /*------------------------------------------------------------------
>
> * LEXER RULES
>
> *------------------------------------------------------------------*/
>
>
>
> integer_atom_type returns [int value]
>
> : TokByte {$value = TokByte;}
>
> | TokShortint {$value = TokShortint;}
>
> | TokInt {$value = TokInt;}
>
> | TokLongint {$value = TokLongint;}
>
> | TokInteger {$value = TokInteger;}
>
> | TokTime {$value = TokTime;}
>
> ;
>
> signing returns [int value]
>
> : TokSigned {$value= TokSigned;}
>
> | TokUnsigned {$value= TokUnsigned;}
>
> ;
>
> MODULE_KEYWORD : (( 'module' )|('macromodule') )
>
> ;
>
> ENDMODULE : 'endmodule'
>
> ;
>
> SIMPLE_IDENTIFIER : ( 'a'..'z'|'A'..'Z'|'_' ) (
> 'a'..'z'|'A'..'Z'|'_'|'0'..'9'|'$')*
>
> ;
>
>
>
> NUMBER : (DIGIT)+
>
> ;
>
>
>
> WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+
>
> {
>
> $channel = HIDDEN;
>
> }
>
> ;
>
> fragment
>
> DIGIT : '0'..'9'
>
> ;
>
> ------------------- Bison Grammar --------------------------
>
> %%
>
> source_text : description
>
> ;
>
> description
>
> : module_declaration
>
> | description module_declaration
>
> ;
>
> module_declaration
>
> : module_ansi_header TokEndmodule
>
> { act_module(); }
>
> | module_ansi_header TokEndmodule ':' module_identifier
>
> { act_module(); }
>
> ;
>
> module_ansi_header
>
> : TokModule module_identifier ';'
>
> | TokModule module_identifier parameter_port_list ';'
>
> ;
>
> module_identifier : identifier
>
> { $$ = $1; }
>
> ;
>
> parameter_port_list
>
> : '#' '(' parameter_port_list_recur ')'
>
> | '#' '(' ')'
>
> ;
>
> parameter_port_list_recur
>
> : parameter_port_declaration
>
> | parameter_port_list_recur ','
> parameter_port_declaration
>
> ;
>
> parameter_port_declaration
>
> : parameter_port_declaration_scope
>
> data_type { $1.type = $2; $1.head = NULL; }
>
> list_of_param_assignments { $$ = $1.head; }
>
> ;
>
> parameter_port_declaration_scope :
>
> ;
>
> list_of_param_assignments
>
> : nil nil param_assignment
>
> /* FIX:: need LR(2) here */
>
> | list_of_param_assignments ',' param_assignment
>
> ;
>
> param_assignment
>
> : parameter_identifier '=' constant_param_expression
>
> { act_param_assignment
>
> (
>
> & $<scope1>-3.head,
>
> & $<scope1>-3.tail,
>
> $1,
>
> $<scope1>-3.type,
>
> $3
>
> );
>
> }
>
> ;
>
> constant_param_expression
>
> : constant_mintypmax_expression { $$ = $1; }
>
> // | '$'
>
> ;
>
> constant_mintypmax_expression
>
> : constant_expression { $$ = $1; }
>
> ;
>
> // Deviate from LRM
>
> constant_expression : expr { $$ = $1; }
>
> ;
>
> parameter_identifier : identifier
>
> { $$ = $1; }
>
> ;
>
> data_type
>
> : integer_atom_type signing { $$ = act_typeB($1,$2); }
>
> | integer_atom_type { $$ = act_typeB($1,-1); }
>
> ;
>
> expr : TokNumber
>
> { $$ = act_number( $1 ); }
>
> ;
>
>
>
> nil : /* empty */
>
> ;
>
> identifier : TokIdentifier
>
> { $$ = act_identifier( $1 ); }
>
> ;
>
> integer_atom_type
>
> : TokByte { $$ = $1; }
>
> | TokShortint { $$ = $1; }
>
> | TokInt { $$ = $1; }
>
> | TokLongint { $$ = $1; }
>
> | TokInteger { $$ = $1; }
>
> | TokTime { $$ = $1; }
>
> ;
>
> signing : TokSigned { $$ = $1; }
>
> | TokUnsigned { $$ = $1; }
>
> ;
>
> %%
>
> ---------------------------------------------------------------
>
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
> email-address
More information about the antlr-interest
mailing list