[antlr-interest] Antlr 3.2 vs. Bison 2.4.2+Flex 2.5.35 Speed/Memory

Bob temporaryemail at comcast.net
Fri May 21 19:46:47 PDT 2010


A tiny grammar was implemented in both Antlr and Bison+Flex (shown below).
Test files repeating two lines (shown below) were made in 6 different sizes.
One executable was compiled, with a command line switch choosing either
Antlr or Bison+Flex.
Each parser was run once with empty actions and once with populated actions,
to compare pure parsing with some actual work.

 

Results:

 
                                                CPU time       Peak Memory
File Name      File Size  # modules  # tokens   Bison  Antlr   Bison  Antlr

Action bodies empty:
source.v10m    460mb      10m        150m       28s            572k   *
source.v5m     230mb      5m         75m        15s            572k   *
source.v2.5m   115mb      2.5m       37m        7s             572k   *
source.v1m     46mb       1m         15m        2s             572k   *
source.v500k   23mb       500k       7.5m       1s             572k   *
source.v250k   11mb       250k       3.7m       <1s    4s      572k   1.7g  <-----------

Action bodies populated:
source.v250k   11mb       250k       3.7m       9s     13s     477m   1.7g  <-----------

 

* Antlr ran out of memory at 2gb
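
The "# tokens" column can be cross-checked against the two-line input shown further down. The sketch below is not part of the benchmark; it is a hypothetical hand-written counter (the name count_tokens is made up here) that roughly mimics the lexer rules in this post: identifiers and keywords, digit runs, single punctuation characters, with whitespace skipped.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts tokens roughly the way the small lexer in
   this post would. Identifiers (including keywords) and digit runs count
   as one token each, every other non-space character counts as one token,
   and whitespace is skipped (it goes to the HIDDEN channel in the grammar). */
static size_t count_tokens(const char *text)
{
    const unsigned char *p = (const unsigned char *)text;
    size_t count = 0;

    assert(text != NULL);
    while (*p != '\0') {
        if (isspace(*p)) {
            p++;                               /* not a token */
        } else if (isalpha(*p) || *p == '_') {
            while (isalnum(*p) || *p == '_' || *p == '$')
                p++;                           /* SIMPLE_IDENTIFIER or keyword */
            count++;
        } else if (isdigit(*p)) {
            while (isdigit(*p))
                p++;                           /* NUMBER */
            count++;
        } else {
            p++;                               /* '#', '(', ')', ',', '=', ';', ':' */
            count++;
        }
    }
    return count;
}
```

Called on one copy of the two input lines, this gives 15 tokens per module, which agrees with the table: 250k modules is about 3.7m tokens and 10m modules is 150m tokens.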

 

Comments:

 

1. I expected the requirement that the entire file be resident in memory
   to be the memory hog. Surprise! Quick inspection suggests that an initial
   tokenizing of the entire in-memory file consumes gobs of memory, pushing
   a small footprint up to 1.7gb before releasing it. Only the smallest
   test file stayed under the 2gb limit of a 32 bit process. Please fix!!

 

2. Antlr is clearly slower than bison+flex; however, empty actions don't make
   interesting programs. The test with actions enabled shows a 9s vs. 13s
   difference (roughly 1.4x), considerably less than the empty action case
   (<1s vs. 4s).

 

3. If you've never set up bison+flex I have only one comment: !#@%$#. Two
   thumbs up for Antlr.
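
As a back-of-the-envelope check on comment 1, the peak memory can be divided by the token count. The helpers below are hypothetical and only do the arithmetic. They assume that all of the 1.7gb peak is attributable to tokens and that memory grows linearly with token count, neither of which was measured here.

```c
#include <assert.h>

/* Rough estimate: peak memory divided evenly over all tokens.
   Assumes the whole peak is token storage, which is a simplification. */
static double bytes_per_token(double peak_bytes, double token_count)
{
    assert(token_count > 0.0);
    return peak_bytes / token_count;
}

/* Linear projection of peak memory for a larger token count.
   Linearity is an assumption, not something the measurements prove. */
static double projected_peak_bytes(double per_token, double token_count)
{
    return per_token * token_count;
}
```

With 1.7e9 bytes and 3.75e6 tokens (250k modules of 15 tokens each), bytes_per_token comes out at about 453 bytes per token. Projecting that onto the 7.5m tokens of source.v500k gives about 3.4e9 bytes, which would be consistent with every file except the smallest running out of memory at 2gb.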

 

Details:

 

  Vista 64, AMD Opteron 2.4GHz, 16gb ram
  Visual Studio 2008 SP1
  One exe file with both Antlr and Bison+Flex, targeting 32 bit
  Full Optimization (/Ox), Inline Any Suitable (/Ob2), Favor Small Code (/Os)

  Versions:

    Antlr 3.2

    Bison 2.4.2 LR(1)

    Flex  2.5.35

 

 

------------------- Input file -----------------------------

module tiptop #(int p1=3, p2=4 );

endmodule

... repeat to the indicated number of modules ...
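
A test file of this shape can be produced with a few lines of C. The generator below is a sketch, not the tool actually used for the benchmark, and write_modules is a made-up name. It assumes CRLF line endings, because that gives 46 bytes per module, which matches the listed sizes (10m modules = 460mb); the original line endings are not stated in the post.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical generator: writes module_count copies of the two-line
   module to out and returns the number of bytes written.
   CRLF line endings are an assumption (46 bytes per module). */
static unsigned long write_modules(FILE *out, unsigned long module_count)
{
    static const char module_text[] =
        "module tiptop #(int p1=3, p2=4 );\r\n"
        "endmodule\r\n";
    const size_t len = sizeof module_text - 1;   /* drop the trailing NUL */
    unsigned long written = 0;
    unsigned long i;

    assert(out != NULL);
    for (i = 0; i < module_count; i++) {
        if (fwrite(module_text, 1, len, out) != len)
            break;                               /* stop on a write error */
        written += (unsigned long)len;
    }
    return written;
}
```

The output file should be opened in binary mode ("wb"), otherwise a Windows C runtime would expand each "\n" again and the sizes would no longer match. For example, write_modules(f, 250000UL) would produce a file of 11,500,000 bytes, comparable to source.v250k.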

------------------- Antlr Grammar --------------------------

source_text : description ( description )*

    ;

description : module_declaration 

    ;

module_declaration : module_ansi_header ENDMODULE ( ':' module_identifier )?

        { act_module(); }

    ;

module_ansi_header : MODULE_KEYWORD module_identifier ( parameter_port_list )? ';'

    ;

module_identifier : identifier

    ;

parameter_port_list 

    : '#' '(' parameter_port_declaration ( ',' parameter_port_declaration )* ')'

    | '#' '(' ')'

    ;

parameter_port_declaration returns [void* node]

scope {

    void* type;

    void* head;

    void* tail;

}

    : data_type

        { $parameter_port_declaration::type = $data_type.node;
          $parameter_port_declaration::head = NULL; }
      list_of_param_assignments
        { $node = $parameter_port_declaration::head; }

    ;

list_of_param_assignments

    : param_assignment ( ',' param_assignment )*

    ;

param_assignment

    : parameter_identifier '=' constant_param_expression

        { act_param_assignment

            (

                & $parameter_port_declaration::head,

                & $parameter_port_declaration::tail,

                $parameter_identifier.node,

                $parameter_port_declaration::type,

                $constant_param_expression.node

            );

        }

    ;

constant_param_expression returns [void* node]

    : constant_mintypmax_expression

        { $node = $constant_mintypmax_expression.node; }

//    | '$'

    ;

constant_mintypmax_expression returns [void* node]

    : constant_expression

        { $node = $constant_expression.node; }

    ;

// Deviate from LRM

constant_expression returns [void* node]

    : expr { $node = $expr.node; }

    ;

parameter_identifier returns [void* node]

    : identifier { $node = $identifier.node; }

    ;

data_type returns [void* node]

    : integer_atom_type signing

       {$node=act_type($integer_atom_type.value,$signing.value);}

    | integer_atom_type

       {$node=act_type($integer_atom_type.value,-1);}

    ;

 

 

expr returns [void* node] : NUMBER

        { $node = act_number( $NUMBER.text->chars ); }

        ;

 

identifier returns [void* node] : SIMPLE_IDENTIFIER

        { $node = act_identifier( $SIMPLE_IDENTIFIER.text->chars ); }

    ;

 

integer_atom_type returns [int value]
    : TokByte       {$value = TokByte;}
    | TokShortint   {$value = TokShortint;}
    | TokInt        {$value = TokInt;}
    | TokLongint    {$value = TokLongint;}
    | TokInteger    {$value = TokInteger;}
    | TokTime       {$value = TokTime;}
    ;

signing returns [int value]
    : TokSigned     {$value = TokSigned;}
    | TokUnsigned   {$value = TokUnsigned;}
    ;

/*------------------------------------------------------------------
 * LEXER RULES
 *------------------------------------------------------------------*/

MODULE_KEYWORD  : (( 'module' )|('macromodule') )

    ;

ENDMODULE       : 'endmodule'

    ;

SIMPLE_IDENTIFIER : ( 'a'..'z'|'A'..'Z'|'_' ) ( 'a'..'z'|'A'..'Z'|'_'|'0'..'9'|'$' )*

    ;

 

NUMBER : (DIGIT)+ 

            ;

 

WHITESPACE  : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+

              {

                 $channel = HIDDEN;

              }

            ;

fragment

DIGIT         : '0'..'9'

            ;

------------------- Bison Grammar --------------------------

%%

source_text : description

    ;

description

                : module_declaration

                | description module_declaration

    ;

module_declaration

                : module_ansi_header TokEndmodule

                { act_module(); }

                | module_ansi_header TokEndmodule ':' module_identifier

                { act_module(); }

    ;

module_ansi_header

                : TokModule module_identifier ';'

                | TokModule module_identifier parameter_port_list ';'

    ;

module_identifier : identifier

                { $$ = $1; }

    ;

parameter_port_list 

    : '#' '(' parameter_port_list_recur ')'

    | '#' '(' ')'

    ;

parameter_port_list_recur

                : parameter_port_declaration

                | parameter_port_list_recur ',' parameter_port_declaration

                ;

parameter_port_declaration

                : parameter_port_declaration_scope

                    data_type { $1.type = $2; $1.head = NULL; }

                      list_of_param_assignments { $$ = $1.head; }

    ;

parameter_port_declaration_scope :

                ;

list_of_param_assignments

    : nil                       nil param_assignment

/* FIX:: need LR(2) here */

    | list_of_param_assignments ',' param_assignment

    ;

param_assignment

    : parameter_identifier '=' constant_param_expression

    { act_param_assignment

      (

       & $<scope1>-3.head,

       & $<scope1>-3.tail,

       $1,

       $<scope1>-3.type,

       $3

       );

    }

    ;

constant_param_expression

                : constant_mintypmax_expression { $$ = $1; }

//    | '$'

    ;

constant_mintypmax_expression

    : constant_expression { $$ = $1; }

    ;

// Deviate from LRM

constant_expression : expr { $$ = $1; }

    ;

parameter_identifier : identifier

    { $$ = $1; }

    ;

data_type

                : integer_atom_type signing { $$ = act_typeB($1,$2); }

                | integer_atom_type         { $$ = act_typeB($1,-1); }

                ;

expr       : TokNumber

                { $$ = act_number( $1 ); }

                ;

 

nil           : /* empty */

    ;

identifier : TokIdentifier

                { $$ = act_identifier( $1 ); }

    ;

integer_atom_type

                : TokByte     { $$ = $1; }

                | TokShortint { $$ = $1; }

                | TokInt      { $$ = $1; }

                | TokLongint  { $$ = $1; }

                | TokInteger  { $$ = $1; }

                | TokTime     { $$ = $1; }

                ;

signing  : TokSigned   { $$ = $1; }

                | TokUnsigned { $$ = $1; }

                ;

%%

---------------------------------------------------------------
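
The action bodies themselves are not shown in this post. Both grammars pass the addresses of head and tail into act_param_assignment, which suggests a linked-list append. The following is a guess at what such an action could look like. The param_node layout is entirely hypothetical and only the argument order is taken from the grammars.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout; the real one is not shown in the post. */
typedef struct param_node {
    void *ident;               /* from parameter_identifier */
    void *type;                /* from data_type */
    void *value;               /* from constant_param_expression */
    struct param_node *next;
} param_node;

/* Appends one parameter assignment to the list held in *head / *tail.
   Both grammars reset only head to NULL before the list starts, so the
   empty-list test looks at *head and never reads an uninitialized *tail. */
static void act_param_assignment(void **head, void **tail,
                                 void *ident, void *type, void *value)
{
    param_node *node;

    assert(head != NULL && tail != NULL);
    node = malloc(sizeof *node);
    if (node == NULL)
        return;                /* out of memory: this sketch just drops it */

    node->ident = ident;
    node->type = type;
    node->value = value;
    node->next = NULL;

    if (*head == NULL)
        *head = node;                          /* first element */
    else
        ((param_node *)*tail)->next = node;    /* append after current tail */
    *tail = node;
}
```

Keeping a tail pointer makes each append constant time, so the list stays in source order without being walked for every parameter.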


