[antlr-interest] Parsing Large Files

Nikolay Ognyanov nikolay.ognyanov at travelstoremaker.com
Thu Apr 1 10:16:18 PDT 2010


For different reasons, I had the same problem with CommonTokenStream
and ended up implementing my own stream. It is available under the name
XQTokenStream in my open source project xqgrammar at:

http://code.google.com/p/xqgrammar/.
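
The general idea behind such a stream can be sketched roughly as below.
This is not the XQTokenStream code and it does not implement ANTLR's
TokenStream interface; the class name WindowedTokenBuffer and its methods
are hypothetical and only illustrate keeping the unconsumed lookahead
instead of the whole token list.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.NoSuchElementException;

/**
 * Hypothetical sliding-window token buffer (an illustration only, not part
 * of ANTLR). Tokens are pulled lazily from a source and dropped as soon as
 * they are consumed, so memory is bounded by lookahead depth, not file size.
 * Assumes the source never yields null tokens.
 */
final class WindowedTokenBuffer<T> {
    private final Iterator<T> source;                         // lazy producer, e.g. a lexer
    private final ArrayDeque<T> window = new ArrayDeque<>();  // unconsumed tokens only
    private long consumed = 0;                                // tokens already discarded

    WindowedTokenBuffer(Iterator<T> source) {
        this.source = source;
    }

    /** Returns the k-th token of lookahead (k >= 1), or null past end of input. */
    T lookahead(int k) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be >= 1");
        }
        // Fill the window only as far as the requested lookahead depth.
        while (window.size() < k && source.hasNext()) {
            window.addLast(source.next());
        }
        if (window.size() < k) {
            return null;
        }
        // ArrayDeque has no indexed access, so walk to the k-th element.
        Iterator<T> it = window.iterator();
        T token = null;
        for (int i = 0; i < k; i++) {
            token = it.next();
        }
        return token;
    }

    /** Returns the current token and removes it from the buffer. */
    T consume() {
        T token = lookahead(1);
        if (token == null) {
            throw new NoSuchElementException("end of input");
        }
        window.removeFirst();
        consumed++;
        return token;
    }

    /** Number of tokens consumed (and discarded) so far. */
    long consumedCount() {
        return consumed;
    }

    /** Number of tokens currently held in memory. */
    int bufferedCount() {
        return window.size();
    }
}
```

For example, wrapping an iterator over "a", "b", "c" and calling
lookahead(2) buffers two tokens; after one consume() only one token stays
in memory. A parser that backtracks would also need some form of
mark/rewind, which this sketch leaves out.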

Regards
Nikolay

On 04/01/2010 07:07 PM, Kumar, Amitesh wrote:
> I've got ANTLR version 3.2. It doesn't seem to have UnbufferedTokenStream, and there doesn't seem to be a newer version on the site.
>
> Amitesh Kumar |CIB Integration | Business Infrastructure Technology | Standard Bank CIB International | Ground Floor, 20 Gresham Street, London, EC2V 7JE
> T: +44 [0]203 145 5575 | E: amitesh.kumar at standardbank.com
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jim Idle
> Sent: 01 April 2010 16:55
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Parsing Large Files
>
> Just replace new CommonTokenStream(...) with new UnbufferedTokenStream(...).
>
> But you will need to fix your grammar before it will all work, of course.
>
> Jim
>
>
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>> bounces at antlr.org] On Behalf Of Kumar, Amitesh
>> Sent: Thursday, April 01, 2010 8:52 AM
>> To: antlr-interest at antlr.org
>> Subject: Re: [antlr-interest] Parsing Large Files
>>
>> Great, where is this UnbufferedTokenStream?
>>
>> Cheers Jim
>>
>>
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>> bounces at antlr.org] On Behalf Of Jim Idle
>> Sent: 01 April 2010 16:27
>> Cc: antlr-interest at antlr.org
>> Subject: Re: [antlr-interest] Parsing Large Files
>>
>> Actually, I think that if you use UnbufferedTokenStream(), it will
>> pretty much do what you want already, but it is easy to derive from one
>> of the token streams and add methods that can discard buffered tokens
>> once you know you have dealt with them.
>>
>>
>> Also, if you have comma-separated files, then it is usually easier to
>> use awk. Finally, your grammar has myriad lexical ambiguities and I am
>> afraid it is not going to work as you have written it. You cannot have
>> more than one lexer rule that matches the same text, because the lexer
>> is not syntax-directed; it just tokenizes what it sees.
>>
>> Jim
>>
>>
>>> -----Original Message-----
>>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>>> bounces at antlr.org] On Behalf Of Marcin Rzeznicki
>>> Sent: Thursday, April 01, 2010 8:02 AM
>>> To: Kumar, Amitesh
>>> Cc: antlr-interest at antlr.org
>>> Subject: Re: [antlr-interest] Parsing Large Files
>>>
>>> On Thu, Apr 1, 2010 at 4:26 PM, Kumar, Amitesh
>>> <Amitesh.Kumar at standardbank.com>  wrote:
>>>> But my general issue is that not all my data is a simple CSV file; some
>>>> will be multi-line records. Hence I didn't want to keep a record of the
>>>> tokens.
>>>> Any ideas? By the way, thanks for your reply.
>>>>
>>> Hi
>>> You can easily implement your own TokenStream that is optimized for
>>> your use case, e.g. one that does not try to keep everything in one big
>>> array. If you explore this possibility, you will quickly discover that
>>> it is a very easy thing to do and test. Hope it helps.
>>>
>>>
>>>> Cheers
>>>> Kumaap0
>>>
>>> --
>>> Greetings
>>> Marcin Rzeźnicki
>>>
>>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>>
