[antlr-interest] Antlr 3.2 vs. Bison 2.4.2+Flex 2.5.35 Speed/Memory
Bob
temporaryemail at comcast.net
Sat May 22 11:33:13 PDT 2010
Perhaps you are correct regarding the strings. Looking at the antlr
generated "identifier" rule code, I see
"SIMPLE_IDENTIFIER12" is not freed. Is there a mechanism whereby I can free
the string when the rule is complete?
After reading the file into memory, the process size is ~11meg (size of
input file), then it rises to 853Meg before the 1st "module" .. "endmodule"
from the input is recognized. As subsequent "module".."endmodule"'s are
recognized memory rises to 1.2+Gb.
If you have a solution to free the strings when a rule is complete, I'll try
that and see where it stands. The expansion from 11Meg to over 1.2Gb is
quite a bit for strings!
Rule:
identifier returns [void* node] : SIMPLE_IDENTIFIER
{ $node = act_identifier( $SIMPLE_IDENTIFIER.text->chars ); }
;
Code:
static void*
identifier(pNuVParser ctx)
{
void* node = NULL;
pANTLR3_COMMON_TOKEN SIMPLE_IDENTIFIER12;
/* Initialize rule variables
*/
SIMPLE_IDENTIFIER12 = NULL;
{
// NuV.g:102:33: ( SIMPLE_IDENTIFIER )
// NuV.g:102:35: SIMPLE_IDENTIFIER
{
SIMPLE_IDENTIFIER12 = (pANTLR3_COMMON_TOKEN)
MATCHT(SIMPLE_IDENTIFIER, &FOLLOW_SIMPLE_IDENTIFIER_in_identifier608);
if (HASEXCEPTION())
{
goto ruleidentifierEx;
}
{
node= act_identifier(
(SIMPLE_IDENTIFIER12->getText(SIMPLE_IDENTIFIER12))->chars );
}
}
}
// This is where rules clean up and exit
//
goto ruleidentifierEx; /* Prevent compiler warnings */
ruleidentifierEx: ;
if (HASEXCEPTION())
{
PREPORTERROR();
PRECOVER();
}
return node;
}
-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jim Idle
Sent: Saturday, May 22, 2010 10:37 AM
To: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Antlr 3.2 vs. Bison 2.4.2+Flex 2.5.35
Speed/Memory
The other thing to add here is that you are using the $xxx.text references,
and these do not free up the string memory until you free the parser. With
this many inputs, the memory usage you are seeing is probably this first and
not the tokens.
Jim
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Bob
> Sent: Friday, May 21, 2010 7:47 PM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Antlr 3.2 vs. Bison 2.4.2+Flex 2.5.35
> Speed/Memory
>
> A tiny grammar was implemented in both Antlr and Bison+Flex (shown
> below).
>
> Test files repeating two lines (shown below) were made in 6 different
>
> sizes.
>
> One executable compiled with command line switch choosing either
>
> Antlr or Bison+Flex.
>
> One run with empty actions, one run with actions populated, to compare
>
> pure parsing with some actual work.
>
>
>
> Results:
>
>
> CPU time Peak Memory
>
> File Name File Size # modules #tokens Bison Antlr Bison Antlr
>
> Action bodies empty:
>
> source.v10m 460mb 10m 150m 28s 572k *
>
> source.v5m 230mb 5m 75m 15s 572k
> *
>
> source.v2.5m 115mb 2.5m 37m 7s 572k *
>
> source.v1m 46mb 1m 15m 2s
> 572k *
>
> source.v500k 23mb 500k 7.5m 1s 572k
> *
>
> source.v250k 11mb 250k 3.7m <1s 4s 572k
> 1.7g
> <-----------
>
> Action bodies populated:
>
> source.v250k 11mb 250k 3.7m 9s 13s 477m
> 1.7g
> <-----------
>
>
>
> * Antlr ran out of memory at 2gb
>
>
>
> Comments:
>
>
>
> 1. I expected the requirement that the entire file be resident in
> memory
>
> to be the memory glut. Surprise! Quick inspection suggests an
> initial
>
> tokenizing of the entire in-memory file consumes gobbs of memory,
> pushing
>
> a small footprint up to 1.7gb before releasing it. Only the smallest
>
> test file was under the runable 32 bit 2gb limit. Please fix!!
>
>
>
> 2. Speed is clearly slower than bison+flex, however empty actions don't
> make
>
> interesting programs. The test with actions enabled shows a 9s vs.
> 13s
>
> difference, considerable less than the empty action case.
>
>
>
> 3. If you've never setup bison+flex I have only one comment: !#@%$#.
> Two
>
> thumbs up for Antlr.
>
>
>
> Details:
>
>
>
> Vista 64, AMD opteron 2.4Ghz, 16gb ram
>
> Visual Studio 2008 Sp1
>
> One exe file with both Antlr and Bison+Flex, targeting 32 bit
>
> Full Optimization (/Ox), Inline Any suitable (/Ob2), Favor Small Code
> (/Os)
>
> Versions:
>
> Antlr 3.2
>
> Bison 2.4.2 LR(1)
>
> Flex 2.5.35
>
>
>
>
>
> ------------------- Input file -----------------------------
>
> module tiptop #(int p1=3, p2=4 );
>
> endmodule
>
> ... repeat to the indicated number of modules ...
>
> ------------------- Antlr Grammar --------------------------
>
> source_text : description ( description )*
>
> ;
>
> description : module_declaration
>
> ;
>
> module_declaration : module_ansi_header ENDMODULE ( ':'
> module_identifier )?
>
> { act_module(); }
>
> ;
>
> module_ansi_header : MODULE_KEYWORD module_identifier (
> parameter_port_list
> )? ';'
>
> ;
>
> module_identifier : identifier
>
> ;
>
> parameter_port_list
>
> : '#' '(' parameter_port_declaration ( ','
> parameter_port_declaration )*
> ')'
>
> | '#' '(' ')'
>
> ;
>
> parameter_port_declaration returns [void* node]
>
> scope {
>
> void* type;
>
> void* head;
>
> void* tail;
>
> }
>
> : data_type
>
> { $parameter_port_declaration::type = $data_type.node;
>
> $parameter_port_declaration::head=NULL; }
> list_of_param_assignments
>
> { $node = $parameter_port_declaration::head; }
>
> ;
>
> list_of_param_assignments
>
> : param_assignment ( ',' param_assignment )*
>
> ;
>
> param_assignment
>
> : parameter_identifier '=' constant_param_expression
>
> { act_param_assignment
>
> (
>
> & $parameter_port_declaration::head,
>
> & $parameter_port_declaration::tail,
>
> $parameter_identifier.node,
>
> $parameter_port_declaration::type,
>
> $constant_param_expression.node
>
> );
>
> }
>
> ;
>
> constant_param_expression returns [void* node]
>
> : constant_mintypmax_expression
>
> { $node = $constant_mintypmax_expression.node; }
>
> // | '$'
>
> ;
>
> constant_mintypmax_expression returns [void* node]
>
> : constant_expression
>
> { $node = $constant_expression.node; }
>
> ;
>
> // Deviate from LRM
>
> constant_expression returns [void* node]
>
> : expr { $node = $expr.node; }
>
> ;
>
> parameter_identifier returns [void* node]
>
> : identifier { $node = $identifier.node; }
>
> ;
>
> data_type returns [void* node]
>
> : integer_atom_type signing
>
> {$node=act_type($integer_atom_type.value,$signing.value);}
>
> | integer_atom_type
>
> {$node=act_type($integer_atom_type.value,-1);}
>
> ;
>
>
>
>
>
> expr returns [void* node] : NUMBER
>
> { $node = act_number( $NUMBER.text->chars ); }
>
> ;
>
>
>
> identifier returns [void* node] : SIMPLE_IDENTIFIER
>
> { $node = act_identifier( $SIMPLE_IDENTIFIER.text->chars ); }
>
> ;
>
>
>
> /*------------------------------------------------------------------
>
> * LEXER RULES
>
> *------------------------------------------------------------------*/
>
>
>
> integer_atom_type returns [int value]
>
> : TokByte {$value = TokByte;}
>
> | TokShortint {$value = TokShortint;}
>
> | TokInt {$value = TokInt;}
>
> | TokLongint {$value = TokLongint;}
>
> | TokInteger {$value = TokInteger;}
>
> | TokTime {$value = TokTime;}
>
> ;
>
> signing returns [int value]
>
> : TokSigned {$value= TokSigned;}
>
> | TokUnsigned {$value= TokUnsigned;}
>
> ;
>
> MODULE_KEYWORD : (( 'module' )|('macromodule') )
>
> ;
>
> ENDMODULE : 'endmodule'
>
> ;
>
> SIMPLE_IDENTIFIER : ( 'a'..'z'|'A'..'Z'|'_' ) (
> 'a'..'z'|'A'..'Z'|'_'|'0'..'9'|'$')*
>
> ;
>
>
>
> NUMBER : (DIGIT)+
>
> ;
>
>
>
> WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+
>
> {
>
> $channel = HIDDEN;
>
> }
>
> ;
>
> fragment
>
> DIGIT : '0'..'9'
>
> ;
>
> ------------------- Bison Grammar --------------------------
>
> %%
>
> source_text : description
>
> ;
>
> description
>
> : module_declaration
>
> | description module_declaration
>
> ;
>
> module_declaration
>
> : module_ansi_header TokEndmodule
>
> { act_module(); }
>
> | module_ansi_header TokEndmodule ':' module_identifier
>
> { act_module(); }
>
> ;
>
> module_ansi_header
>
> : TokModule module_identifier ';'
>
> | TokModule module_identifier parameter_port_list ';'
>
> ;
>
> module_identifier : identifier
>
> { $$ = $1; }
>
> ;
>
> parameter_port_list
>
> : '#' '(' parameter_port_list_recur ')'
>
> | '#' '(' ')'
>
> ;
>
> parameter_port_list_recur
>
> : parameter_port_declaration
>
> | parameter_port_list_recur ','
> parameter_port_declaration
>
> ;
>
> parameter_port_declaration
>
> : parameter_port_declaration_scope
>
> data_type { $1.type = $2; $1.head = NULL; }
>
> list_of_param_assignments { $$ = $1.head; }
>
> ;
>
> parameter_port_declaration_scope :
>
> ;
>
> list_of_param_assignments
>
> : nil nil param_assignment
>
> /* FIX:: need LR(2) here */
>
> | list_of_param_assignments ',' param_assignment
>
> ;
>
> param_assignment
>
> : parameter_identifier '=' constant_param_expression
>
> { act_param_assignment
>
> (
>
> & $<scope1>-3.head,
>
> & $<scope1>-3.tail,
>
> $1,
>
> $<scope1>-3.type,
>
> $3
>
> );
>
> }
>
> ;
>
> constant_param_expression
>
> : constant_mintypmax_expression { $$ = $1; }
>
> // | '$'
>
> ;
>
> constant_mintypmax_expression
>
> : constant_expression { $$ = $1; }
>
> ;
>
> // Deviate from LRM
>
> constant_expression : expr { $$ = $1; }
>
> ;
>
> parameter_identifier : identifier
>
> { $$ = $1; }
>
> ;
>
> data_type
>
> : integer_atom_type signing { $$ = act_typeB($1,$2); }
>
> | integer_atom_type { $$ = act_typeB($1,-1); }
>
> ;
>
> expr : TokNumber
>
> { $$ = act_number( $1 ); }
>
> ;
>
>
>
> nil : /* empty */
>
> ;
>
> identifier : TokIdentifier
>
> { $$ = act_identifier( $1 ); }
>
> ;
>
> integer_atom_type
>
> : TokByte { $$ = $1; }
>
> | TokShortint { $$ = $1; }
>
> | TokInt { $$ = $1; }
>
> | TokLongint { $$ = $1; }
>
> | TokInteger { $$ = $1; }
>
> | TokTime { $$ = $1; }
>
> ;
>
> signing : TokSigned { $$ = $1; }
>
> | TokUnsigned { $$ = $1; }
>
> ;
>
> %%
>
> ---------------------------------------------------------------
>
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
> email-address
List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe:
http://www.antlr.org/mailman/options/antlr-interest/your-email-address
More information about the antlr-interest
mailing list