[antlr-interest] Antlr 3.2 vs. Bison 2.4.2+Flex 2.5.35 Speed/Memory

Jim Idle jimi at temporal-wave.com
Fri May 21 22:00:37 PDT 2010


Well, you're comparing apples to cheese here. Bison/Flex do not create complicated tokens with method calls and so on, so ANTLR is winning here in reality. That said, I am going to implementing some of this stuff with a more fly weight pattern in v4. However, it sounds like you need to implement your own token stream that discards the tokens at certain points. 

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Bob
> Sent: Friday, May 21, 2010 7:47 PM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Antlr 3.2 vs. Bison 2.4.2+Flex 2.5.35
> Speed/Memory
> 
> A tiny grammar was implemented in both Antlr and Bison+Flex (shown
> below).
> 
> Test files repeating two lines (shown below) were made in 6 different
> 
> sizes.
> 
> One executable compiled with command line switch choosing either
> 
> Antlr or Bison+Flex.
> 
> One run with empty actions, one run with actions populated, to compare
> 
> pure parsing with some actual work.
> 
> 
> 
> Results:
> 
> 
> CPU time    Peak Memory
> 
> File Name     File Size # modules #tokens  Bison Antlr  Bison Antlr
> 
> Action bodies empty:
> 
> source.v10m     460mb      10m      150m         28s          572k  *
> 
> source.v5m       230mb       5m        75m           15s          572k
> *
> 
> source.v2.5m    115mb       2.5m     37m           7s           572k  *
> 
> source.v1m       46mb         1m        15m            2s
> 572k  *
> 
> source.v500k    23mb        500k      7.5m          1s            572k
> *
> 
> source.v250k    11mb        250k      3.7m        <1s   4s     572k
> 1.7g
> <-----------
> 
> Action bodies populated:
> 
> source.v250k    11mb        250k      3.7m         9s   13s    477m
> 1.7g
> <-----------
> 
> 
> 
> * Antlr ran out of memory at 2gb
> 
> 
> 
> Comments:
> 
> 
> 
> 1. I expected the requirement that the entire file be resident in
> memory
> 
>    to be the memory glut. Surprise! Quick inspection suggests an
> initial
> 
>    tokenizing of the entire in-memory file consumes gobbs of memory,
> pushing
> 
>    a small footprint up to 1.7gb before releasing it. Only the smallest
> 
>    test file was under the runable 32 bit 2gb limit.     Please fix!!
> 
> 
> 
> 2. Speed is clearly slower than bison+flex, however empty actions don't
> make
> 
>    interesting programs. The test with actions enabled shows a 9s vs.
> 13s
> 
>    difference, considerable less than the empty action case.
> 
> 
> 
> 3. If you've never setup bison+flex I have only one comment: !#@%$#.
> Two
> 
>    thumbs up for Antlr.
> 
> 
> 
> Details:
> 
> 
> 
>   Vista 64, AMD opteron 2.4Ghz, 16gb ram
> 
>   Visual Studio 2008 Sp1
> 
>   One exe file with both Antlr and Bison+Flex, targeting 32 bit
> 
>   Full Optimization (/Ox), Inline Any suitable (/Ob2), Favor Small Code
> (/Os)
> 
>   Versions:
> 
>     Antlr 3.2
> 
>     Bison 2.4.2 LR(1)
> 
>     Flex  2.5.35
> 
> 
> 
> 
> 
> ------------------- Input file -----------------------------
> 
> module tiptop #(int p1=3, p2=4 );
> 
> endmodule
> 
> ... repeat to the indicated number of modules ...
> 
> ------------------- Antlr Grammar --------------------------
> 
> source_text : description ( description )*
> 
>     ;
> 
> description : module_declaration
> 
>     ;
> 
> module_declaration : module_ansi_header ENDMODULE ( ':'
> module_identifier )?
> 
>         { act_module(); }
> 
>     ;
> 
> module_ansi_header : MODULE_KEYWORD module_identifier (
> parameter_port_list
> )? ';'
> 
>     ;
> 
> module_identifier : identifier
> 
>     ;
> 
> parameter_port_list
> 
>     : '#' '(' parameter_port_declaration ( ','
> parameter_port_declaration )*
> ')'
> 
>     | '#' '(' ')'
> 
>     ;
> 
> parameter_port_declaration returns [void* node]
> 
> scope {
> 
>     void* type;
> 
>     void* head;
> 
>     void* tail;
> 
> }
> 
>     : data_type
> 
>         { $parameter_port_declaration::type = $data_type.node;
> 
>             $parameter_port_declaration::head=NULL; }
> list_of_param_assignments
> 
>         { $node = $parameter_port_declaration::head; }
> 
>     ;
> 
> list_of_param_assignments
> 
>     : param_assignment ( ',' param_assignment )*
> 
>     ;
> 
> param_assignment
> 
>     : parameter_identifier '=' constant_param_expression
> 
>         { act_param_assignment
> 
>             (
> 
>                 & $parameter_port_declaration::head,
> 
>                 & $parameter_port_declaration::tail,
> 
>                 $parameter_identifier.node,
> 
>                 $parameter_port_declaration::type,
> 
>                 $constant_param_expression.node
> 
>             );
> 
>         }
> 
>     ;
> 
> constant_param_expression returns [void* node]
> 
>     : constant_mintypmax_expression
> 
>         { $node = $constant_mintypmax_expression.node; }
> 
> //    | '$'
> 
>     ;
> 
> constant_mintypmax_expression returns [void* node]
> 
>     : constant_expression
> 
>         { $node = $constant_expression.node; }
> 
>     ;
> 
> // Deviate from LRM
> 
> constant_expression returns [void* node]
> 
>     : expr { $node = $expr.node; }
> 
>     ;
> 
> parameter_identifier returns [void* node]
> 
>     : identifier { $node = $identifier.node; }
> 
>     ;
> 
> data_type returns [void* node]
> 
>     : integer_atom_type signing
> 
>        {$node=act_type($integer_atom_type.value,$signing.value);}
> 
>     | integer_atom_type
> 
>        {$node=act_type($integer_atom_type.value,-1);}
> 
>     ;
> 
> 
> 
> 
> 
> expr returns [void* node] : NUMBER
> 
>         { $node = act_number( $NUMBER.text->chars ); }
> 
>         ;
> 
> 
> 
> identifier returns [void* node] : SIMPLE_IDENTIFIER
> 
>         { $node = act_identifier( $SIMPLE_IDENTIFIER.text->chars ); }
> 
>     ;
> 
> 
> 
> /*------------------------------------------------------------------
> 
>  * LEXER RULES
> 
>  *------------------------------------------------------------------*/
> 
> 
> 
> integer_atom_type returns [int value]
> 
>     : TokByte       {$value = TokByte;}
> 
>     | TokShortint   {$value = TokShortint;}
> 
>     | TokInt        {$value = TokInt;}
> 
>     | TokLongint    {$value = TokLongint;}
> 
>     | TokInteger    {$value = TokInteger;}
> 
>     | TokTime       {$value = TokTime;}
> 
>     ;
> 
> signing returns [int value]
> 
>     : TokSigned     {$value= TokSigned;}
> 
>     | TokUnsigned   {$value= TokUnsigned;}
> 
>     ;
> 
> MODULE_KEYWORD  : (( 'module' )|('macromodule') )
> 
>     ;
> 
> ENDMODULE       : 'endmodule'
> 
>     ;
> 
> SIMPLE_IDENTIFIER : ( 'a'..'z'|'A'..'Z'|'_' ) (
> 'a'..'z'|'A'..'Z'|'_'|'0'..'9'|'$')*
> 
>     ;
> 
> 
> 
> NUMBER : (DIGIT)+
> 
>             ;
> 
> 
> 
> WHITESPACE  : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+
> 
>               {
> 
>                  $channel = HIDDEN;
> 
>               }
> 
>             ;
> 
> fragment
> 
> DIGIT         : '0'..'9'
> 
>             ;
> 
> ------------------- Bison Grammar --------------------------
> 
> %%
> 
> source_text : description
> 
>     ;
> 
> description
> 
>                 : module_declaration
> 
>                 | description module_declaration
> 
>     ;
> 
> module_declaration
> 
>                 : module_ansi_header TokEndmodule
> 
>                 { act_module(); }
> 
>                 | module_ansi_header TokEndmodule ':' module_identifier
> 
>                 { act_module(); }
> 
>     ;
> 
> module_ansi_header
> 
>                 : TokModule module_identifier ';'
> 
>                 | TokModule module_identifier parameter_port_list ';'
> 
>     ;
> 
> module_identifier : identifier
> 
>                 { $$ = $1; }
> 
>     ;
> 
> parameter_port_list
> 
>     : '#' '(' parameter_port_list_recur ')'
> 
>     | '#' '(' ')'
> 
>     ;
> 
> parameter_port_list_recur
> 
>                 : parameter_port_declaration
> 
>                 | parameter_port_list_recur ','
> parameter_port_declaration
> 
>                 ;
> 
> parameter_port_declaration
> 
>                 : parameter_port_declaration_scope
> 
>                     data_type { $1.type = $2; $1.head = NULL; }
> 
>                       list_of_param_assignments { $$ = $1.head; }
> 
>     ;
> 
> parameter_port_declaration_scope :
> 
>                 ;
> 
> list_of_param_assignments
> 
>     : nil                       nil param_assignment
> 
> /* FIX:: need LR(2) here */
> 
>     | list_of_param_assignments ',' param_assignment
> 
>     ;
> 
> param_assignment
> 
>     : parameter_identifier '=' constant_param_expression
> 
>     { act_param_assignment
> 
>       (
> 
>        & $<scope1>-3.head,
> 
>        & $<scope1>-3.tail,
> 
>        $1,
> 
>        $<scope1>-3.type,
> 
>        $3
> 
>        );
> 
>     }
> 
>     ;
> 
> constant_param_expression
> 
>                 : constant_mintypmax_expression { $$ = $1; }
> 
> //    | '$'
> 
>     ;
> 
> constant_mintypmax_expression
> 
>     : constant_expression { $$ = $1; }
> 
>     ;
> 
> // Deviate from LRM
> 
> constant_expression : expr { $$ = $1; }
> 
>     ;
> 
> parameter_identifier : identifier
> 
>     { $$ = $1; }
> 
>     ;
> 
> data_type
> 
>                 : integer_atom_type signing { $$ = act_typeB($1,$2); }
> 
>                 | integer_atom_type         { $$ = act_typeB($1,-1); }
> 
>                 ;
> 
> expr       : TokNumber
> 
>                 { $$ = act_number( $1 ); }
> 
>                 ;
> 
> 
> 
> nil           : /* empty */
> 
>     ;
> 
> identifier : TokIdentifier
> 
>                 { $$ = act_identifier( $1 ); }
> 
>     ;
> 
> integer_atom_type
> 
>                 : TokByte     { $$ = $1; }
> 
>                 | TokShortint { $$ = $1; }
> 
>                 | TokInt      { $$ = $1; }
> 
>                 | TokLongint  { $$ = $1; }
> 
>                 | TokInteger  { $$ = $1; }
> 
>                 | TokTime     { $$ = $1; }
> 
>                 ;
> 
> signing  : TokSigned   { $$ = $1; }
> 
>                 | TokUnsigned { $$ = $1; }
> 
>                 ;
> 
> %%
> 
> ---------------------------------------------------------------
> 
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
> email-address





More information about the antlr-interest mailing list