[antlr-interest] Unexpected CommonTokenStream.Size() result in CSharp runtime

Sam Harwell sharwell at pixelminegames.com
Fri Apr 17 16:17:27 PDT 2009


A few things to note:

The new unbuffered token stream class (coming in version 3.2, I believe) throws a NotSupportedException if you try to get its size.

ITokenStream implements IIntStream, and you can't implement IEnumerable<IToken> and IEnumerable<int> on the same object with reliable results. I looked to the System.IO stream classes for reference and found that streams are not enumerable. You should create a wrapper if you want IEnumerable access to the tokens in the stream.
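
For illustration, here is a minimal sketch of what such a wrapper might look like. It assumes a buffered stream and the ITokenStream members mentioned in this thread (LT, Size, Get); the class name TokenStreamEnumerable is made up for this example and is not part of the runtime. It will not work with the unbuffered stream, since that one throws when asked for its size.

```csharp
using System.Collections;
using System.Collections.Generic;
using Antlr.Runtime;

// Hypothetical helper, not part of the ANTLR runtime: exposes the tokens
// of a buffered ITokenStream through IEnumerable<IToken>.
public sealed class TokenStreamEnumerable : IEnumerable<IToken>
{
    private readonly ITokenStream _stream;

    public TokenStreamEnumerable(ITokenStream stream)
    {
        _stream = stream;
    }

    public IEnumerator<IToken> GetEnumerator()
    {
        // Assumption: a lookahead call forces the lazily populated buffer
        // to fill, so that Size() reports the real token count afterwards.
        _stream.LT(1);

        for (int i = 0; i < _stream.Size(); i++)
        {
            yield return _stream.Get(i);
        }
    }

    // Non-generic enumerator required by IEnumerable.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

With something like this, the loop becomes foreach (IToken token in new TokenStreamEnumerable(tokenStream)) { ... }, and ITokenStream itself stays untouched.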

Sam

-----Original Message-----
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Johannes Luber
Sent: Friday, April 17, 2009 6:06 PM
To: Chris Lambrou; antlr-interest at antlr.org
Subject: Re: [antlr-interest] Unexpected CommonTokenStream.Size() result in CSharp runtime

> Yesterday I was stung by some odd behaviour in CommonTokenStream, whereby I
> was trying to iterate over the token stream looking for tokens of a specific
> type. Since ITokenStream doesn't implement IEnumerable, it appears that the
> way to do this is as follows:
> 
> for (int i = 0; i < tokenStream.Size(); i++)
> {
>     IToken token = tokenStream.Get(i);
> 
>     //... do stuff with token...
> }
> 
> However, I was finding that tokenStream.Size() returned 0, despite my token
> stream being non-empty. It seems that the underlying stream is lazily
> populated internally, and CommonTokenStream.Size() doesn't trigger a load. I
> had to invoke tokenStream.LT(0) to trigger the lazy load prior to looping
> through the tokens. Is this intended behaviour? Does it happen in all of the
> different runtimes? If so, it's very counterintuitive.

I believe that this is the general behavior. While triggering a load seems sensible here, please be aware that in ANTLR 3.2 the current CommonTokenStream will be renamed to BufferedTokenStream, and a new CommonTokenStream implementation without any buffering will be provided. Size() would then return only the number of tokens already scanned (if the class counts them at all; I haven't checked yet).

Quoting your other email: "FWIW, as a relative newcomer to ANTLR, it seems to me that either CommonTokenStream isn't correctly honouring the ITokenStream interface, or else perhaps ITokenStream ought to be updated to formalise CommonTokenStream's behaviour (though I'm not familiar enough with the other ITokenStream sub-classes to be sure about this)."

Actually, Size() (which you should change to Count anyway for the newest runtimes) is defined in IIntStream. Looking at the comment I think a change in all runtimes is required.
> 
> Incidentally, could ITokenStream be updated to implement IEnumerable<IToken>
> please? Would others find this useful?

I don't know whether there are classes (like the one for ASTs) which wouldn't work with IToken. Because my development machine is being repaired, my ability to check things is severely limited. It may be more sensible to make ITokenStream itself generic. But changing an interface breaks code that depends on it directly, and not everyone uses the provided implementing classes. In any case, using generics requires preprocessor symbols, as the runtime has to remain .NET 1.1 compatible until 3.2.

> I don't mind doing the work, but to
> whom should I submit a patch?

As the C# target maintainer, I'm the right man for discussing anything pertaining to it, as well as for patches, but my internet presence will be spotty for now because I don't have a computer of my own with internet access.

Johannes
> 
> Chris

