[antlr-interest] Re: Short circuit of the lexer

Terence Parr parrt at jguru.com
Sat Jan 18 11:03:01 PST 2003


On Saturday, January 18, 2003, at 05:06 AM, xadeck 
<decoret at graphics.lcs.mit.edu> wrote:

> --- In antlr-interest at yahoogroups.com, Terence Parr <parrt at j...> wrote:
>>
>>
>>
>> Hi.  First, can you give us the simple rules that matched the array?
>> There may be a more efficient way to specify the language.
>
> The grammar is complex, so I extracted the part for arrays.
> I guess the values rule could be written as (INT)*, but then I don't know how
> to push the integers read into my result vector. I don't want to build
> an AST unless really necessary (some files I parse have 180 000 lines
> and the AST may result in high memory cost).

Are you using the latest 2.7.2 stuff or 2.7.1?  I think 2.7.2 is faster 
:)

Also, (INT)* is definitely more efficient than the tail recursion you 
are using.  Just add the action within the loop:

( i:INT {result.push_back(atoi(i->getText().c_str()));} )*

Put that loop in rule decl in place of the reference to values and you 
should be good to go.  Let me know if this works.  The tail recursion 
pushes one method invocation record per INT, so it will build a HUGE 
stack if you have 180k lines...very, very inefficient.  Try the loop :)

Ter

> options {
>     language="Cpp";
> }
> {
> #include <cstdlib>  // atoi
> #include <vector>   // std::vector, used for result below
>
> using namespace std;
>
> static vector<int> result;
>
> }
> class MyParser extends Parser;
> options
> {
>     k=2;
> }
> file:  (decl)*
>     ;
>
> decl : Id LBRACKET values RBRACKET
>     ;
>
> values: i1:INT
>         {
>             result.push_back(atoi(i1->getText().c_str()));
>         }
>     | i2:INT
>         {
>             result.push_back(atoi(i2->getText().c_str()));
>         }
>         values
>     ;
> class MyLexer extends Lexer;
> options
> {
>     k=2;
> }
> {
>     protected:
>     bool parsingList;
> }
> LBRACKET : '[';
> RBRACKET : ']';
>
> WS  :   (' '
>     |   '\t'
>     |   '\n'
>     |   '\r')
>         { $setType(ANTLR_USE_NAMESPACE(antlr)Token::SKIP); }
>     ;
>
> INT :   ('0'..'9')+
>     ;
>
> Id : ('a'..'z')+
>     ;
>
>
>
>> Terence
>> --
>> Co-founder, http://www.jguru.com
>> Creator, ANTLR Parser Generator: http://www.antlr.org
>> Lecturer in Comp. Sci., University of San Francisco
>
>
>
>
> Your use of Yahoo! Groups is subject to 
> http://docs.yahoo.com/info/terms/
>
>
--
Co-founder, http://www.jguru.com
Creator, ANTLR Parser Generator: http://www.antlr.org
Lecturer in Comp. Sci., University of San Francisco


 
