[antlr-interest] Parsing Large Files

Kumar, Amitesh Amitesh.Kumar at standardbank.com
Thu Apr 1 09:07:16 PDT 2010


I've got ANTLR version 3.2, which doesn't seem to have UnbufferedTokenStream, and there doesn't seem to be a newer version on the site.

Amitesh Kumar |CIB Integration | Business Infrastructure Technology | Standard Bank CIB International | Ground Floor, 20 Gresham Street, London, EC2V 7JE 
T: +44 [0]203 145 5575 | E: amitesh.kumar at standardbank.com

-----Original Message-----
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jim Idle
Sent: 01 April 2010 16:55
To: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Parsing Large Files

Just replace new CommonTokenStream(...) with new UnbufferedTokenStream(...).

But you will need to fix your grammar before it will all work, of course.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest- 
> bounces at antlr.org] On Behalf Of Kumar, Amitesh
> Sent: Thursday, April 01, 2010 8:52 AM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Parsing Large Files
> 
> Great, where is this UnbufferedTokenStream?
> 
> Cheers Jim
> 
> 
> 
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest- 
> bounces at antlr.org] On Behalf Of Jim Idle
> Sent: 01 April 2010 16:27
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Parsing Large Files
> 
> Actually, I think that if you use UnbufferedTokenStream, it will
> pretty much do what you want already. It is also easy to derive
> from one of the token streams and add methods that discard
> buffered tokens once you know you have dealt with them.
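The idea of discarding tokens once they have been dealt with can be sketched without any ANTLR classes. The following is a minimal illustration only: DiscardingTokenWindow, lookahead, consume and buffered are hypothetical names, not part of the ANTLR API, and a plain Iterator of strings stands in for the lexer. A real implementation would derive from or implement ANTLR's own token stream types instead.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;

/** Hypothetical sketch (not an ANTLR class): keeps only unconsumed lookahead tokens. */
class DiscardingTokenWindow {
    private final Iterator<String> source;                    // stands in for the lexer
    private final Deque<String> window = new ArrayDeque<>();  // lookahead tokens only

    DiscardingTokenWindow(Iterator<String> source) {
        this.source = source;
    }

    /** Returns the k-th token of lookahead (k >= 1), or null past end of input. */
    String lookahead(int k) {
        // Pull from the source only as far as the requested lookahead depth.
        while (window.size() < k && source.hasNext()) {
            window.addLast(source.next());
        }
        if (window.size() < k) {
            return null;
        }
        Iterator<String> it = window.iterator();
        String tok = null;
        for (int i = 0; i < k; i++) {
            tok = it.next();
        }
        return tok;
    }

    /** Returns the current token and drops it from the buffer, so memory stays bounded. */
    String consume() {
        String tok = lookahead(1);
        if (tok != null) {
            window.removeFirst();
        }
        return tok;
    }

    /** Number of tokens currently held in memory. */
    int buffered() {
        return window.size();
    }

    public static void main(String[] args) {
        DiscardingTokenWindow w =
            new DiscardingTokenWindow(Arrays.asList("a", ",", "b").iterator());
        w.lookahead(2);                     // buffers "a" and ","
        w.consume();                        // drops "a"
        System.out.println(w.buffered());   // prints 1
    }
}
```

The buffer never grows beyond the deepest lookahead requested, which is the property that matters for very large inputs. The trade-off is that consumed tokens cannot be revisited, so this only suits grammars that do not need to rewind far.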
> 
> 
> Also, if you have comma-separated files, it is usually easier to
> use awk. Finally, your grammar has myriad lexical ambiguities and I am
> afraid it is not going to work as you have written it. You cannot have
> more than one lexer rule that matches the same text, because the lexer
> is not syntax directed; it just tokenizes what it sees.
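To illustrate why two lexer rules matching the same text cannot both work, here is a toy tokenizer, not ANTLR itself. ToyLexer and the rule names ACCOUNT, AMOUNT and COMMA are hypothetical, and the tie-break (longest match, then first rule declared) is an assumption of this sketch. The point is only that a lexer with no knowledge of parser context picks the same rule every time.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical toy lexer: longest match wins, ties go to the rule declared first. */
class ToyLexer {
    private final Map<String, Pattern> rules = new LinkedHashMap<>(); // keeps declaration order

    ToyLexer rule(String name, String regex) {
        rules.put(name, Pattern.compile(regex));
        return this;
    }

    List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> e : rules.entrySet()) {
                Matcher m = e.getValue().matcher(input);
                m.region(pos, input.length());
                // Strict '>' means a later rule matching the same text never wins.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = e.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no rule matches at offset " + pos);
            }
            out.add(bestName + ":" + input.substring(pos, bestEnd));
            pos = bestEnd;
        }
        return out;
    }

    public static void main(String[] args) {
        ToyLexer lexer = new ToyLexer()
            .rule("ACCOUNT", "[0-9]+")
            .rule("AMOUNT", "[0-9]+")   // same text as ACCOUNT, so never produced
            .rule("COMMA", ",");
        System.out.println(lexer.tokenize("12,34"));
        // prints [ACCOUNT:12, COMMA:,, ACCOUNT:34]
    }
}
```

A parser rule that expects an AMOUNT token after the comma would never see one here. The usual fix is a single lexer rule for the shared text, leaving the parser to decide what the token means from its position.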
> 
> Jim
> 
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest- 
> > bounces at antlr.org] On Behalf Of Marcin Rzeznicki
> > Sent: Thursday, April 01, 2010 8:02 AM
> > To: Kumar, Amitesh
> > Cc: antlr-interest at antlr.org
> > Subject: Re: [antlr-interest] Parsing Large Files
> >
> > On Thu, Apr 1, 2010 at 4:26 PM, Kumar, Amitesh 
> > <Amitesh.Kumar at standardbank.com> wrote:
> > >
> >
> > >
> > > But my general issue is that not all my data is a simple CSV file;
> > > some will be multi-line records. Hence I didn't want to keep a
> > > record of the tokens.
> > > Any ideas? By the way, thanks for your reply.
> > >
> >
> > Hi
> > You can easily implement your own TokenStream that is optimized for
> > your use case, e.g. one that does not try to keep everything in one
> > big array. If you explore this possibility, you will quickly discover
> > that it is a very easy thing to do and test. Hope it helps.
> >
> >
> > > Cheers
> > > Kumaap0
> >
> >
> > --
> > Greetings
> > Marcin Rzeźnicki
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> 
> 
> 
> 





