[antlr-interest] Re: Help with Java grammar

cliftonccraig ccc at icsaward.com
Tue Mar 9 08:25:17 PST 2004


Thanks Ric,

Still no luck. I tried your suggestion, and while it went through the
ANTLR generator OK, it didn't stop the OutOfMemoryError. It appears my
parser is still getting hung up when the last line of the file is a
single-line comment. I'm trying another, uglier workaround that is not
ANTLR related, and maybe someone can help me here. I am trying to
append a newline to each stream fed into my parser, but I'm not
satisfied with what I have. I created an AppendInputStream that
extends FilterInputStream. I need this to perform as fast as possible,
and I didn't find anything in the JDK that looked like it would help.
Basically, what I want to do is wrap the FileInputStream that I give my
parser in my custom AppendInputStream and hand it off like so:

    Reader r = new InputStreamReader(
        new AppendInputStream(new FileInputStream(javaFile), "\r\n"));

It sounds like overkill, but I just don't see an easier way to do
this. I thought I saw a way to chain or concatenate two InputStreams
together as one in some book or article, but I can't remember where.
Could someone please help me with either solution? I feel so stumped.
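
For what it's worth, the JDK class java.io.SequenceInputStream reads one
stream to its end and then continues with a second, which looks like the
chaining described above. Below is a minimal sketch, assuming the suffix
is plain ASCII and that mark/reset is not needed (SequenceInputStream
does not support it). The class AppendNewlineSketch and the helpers
withSuffix and drain are made-up names for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;

public class AppendNewlineSketch {
    // Hypothetical helper: a stream that yields all of `in`, then the bytes of `suffix`.
    static InputStream withSuffix(InputStream in, String suffix) throws IOException {
        byte[] extra = suffix.getBytes("ISO-8859-1");
        return new SequenceInputStream(in, new ByteArrayInputStream(extra));
    }

    // Demonstration only: drains a stream into a String using a small buffer.
    static String drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4];
        int n;
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toString("ISO-8859-1");
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a source file whose last line is a comment without a newline.
        InputStream src = new ByteArrayInputStream(
            "int x; // last line".getBytes("ISO-8859-1"));
        String result = drain(withSuffix(src, "\r\n"));
        System.out.println(result.endsWith("\r\n"));
    }
}
```

In the parser setup, the FileInputStream would take the place of the
ByteArrayInputStream, and the result of withSuffix would be wrapped in
the InputStreamReader as before.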

Cliff

The source to my AppendInputStream is below:
    private static class AppendInputStream extends FilterInputStream
    {
        InputStream additional;
        private static final int EOF = -1;
        private boolean endOfFirst = false, firstMarked = false, secondMarked = false;
        private int markCount = 0, markLimit = 0;

        public AppendInputStream(InputStream in, String additional)
        {
            this(in,additional.getBytes());
        }

        protected AppendInputStream(InputStream in, byte[] add)
        {
            this(in, new ByteArrayInputStream(add));
        }

        protected AppendInputStream(InputStream in, InputStream add)
        {
            super(in);
            additional = add;
        }

        public int read() throws IOException
        {
            int val = EOF;
            if (false==endOfFirst)
            {
                val = super.read();
            }
            if (EOF != val)
            {
                trackMarking(1);
                return val;
            }
            else
            {
                endOfFirst = true;
                conditionallyMark();
                return additional.read();
            }
        }

        public int read(byte b[]) throws IOException
        {
            return read(b, 0, b.length);
        }

        public int read(byte b[], int off, int len) throws IOException
        {
            if (off < 0 || len < 0 || len > b.length - off)
                throw new IndexOutOfBoundsException("Cannot read " + len + " bytes from offset " + off + " in array of length " + b.length);
            if (len == 0)
                return 0;
            if (false==endOfFirst)
            {
                int val = super.read(b, off, len);
                if (EOF != val)
                {
                    //A short read does not mean the first stream is exhausted;
                    //only EOF does, so return what we have and let the caller ask again.
                    trackMarking(val);
                    return val;
                }
                endOfFirst = true;
            }
            //Fill the caller's window (off, len), not the whole array.
            return readAdditional(b, off, len);
        }

        /**
         * Keep track of the # of bytes read into our marking.
         * @param val
         */
        private void trackMarking(int val)
        {
            if(firstMarked) markCount += val;
        }

        private int readAdditional(byte[] b, int off, int len) throws IOException
        {
            conditionallyMark();
            return additional.read(b, off, len);
        }

        private void conditionallyMark()
        {
            //Just-in-time marking. If the 1st is marked but the 2nd hasn't been marked
            //and we haven't read past our mark limit we mark it right before our 1st
            //attempt to read into it.
            if(firstMarked && false==secondMarked && markLimit - markCount > 0)
            {
                additional.mark(markLimit - markCount);
                secondMarked = true;
            }
        }

        public int available() throws IOException
        {
            if (endOfFirst)
            {
                return additional.available();
            }
            else
            {
                return super.available();
            }
        }

        public void close() throws IOException
        {
            super.close();
            additional.close();
        }

        public synchronized void mark(int readlimit)
        {
            markLimit = readlimit;
            if(endOfFirst && false==firstMarked && false==secondMarked)
            {
                additional.mark(readlimit);
                secondMarked = true;
            }
            else
            {
                super.mark(readlimit);
                firstMarked = true;
            }
        }

        public boolean markSupported()
        {
            return super.markSupported() && additional.markSupported();
        }

        public synchronized void reset() throws IOException
        {
            if(firstMarked)
            {
                super.reset();
                firstMarked = false;
                //The first stream is rewound to its mark, so resume reading from it.
                endOfFirst = false;
            }
            markCount = 0; markLimit = 0;

            if (secondMarked)
            {
                additional.reset();
            }
        }

        public long skip(long n) throws IOException
        {
            if(endOfFirst)
                return additional.skip(n);
            else
                return super.skip(n);
        }
    }


--- In antlr-interest at yahoogroups.com, Ric Klaren <klaren at c...> wrote:
> On Tue, Mar 09, 2004 at 02:08:09PM -0000, cliftonccraig wrote:
> > I just tried this:
> > SL_COMMENT
> > 	:	"//"
> > 		(~('\n'|'\r'))* ('\n'|'\r'('\n')?)
> > 		{
> > //*CCC- Allow comments to flow through to the rewrite engine
> > //		    $setType(Token.SKIP);
> > 		    newline();
> > 		}
> > 		|
> > 		"//" (~('\n'|'\r'))*
> > 	;
> 
> How about this? 
> 
> SL_COMMENT
>    :  "//" 
>       ( ~('\n'|'\r') )*                         // not a newline part...
>       ( ('\n'|'\r'('\n')? { newline(); } ) )?   // optional newline
>    ;
> 
> If this gives trouble generate the lexer with -traceLexer and see
> where it gets stuck. (or check with debugger)
> 
> There's a few dirty tricks you can do with EOF checks that work by
> checking LA(i) for EOF in the init action of a closure rule, but I
> don't think these should be necessary for this. (unless I'm missing
> the point somewhere)
> 
> > And I got a warning saying:
> > D:\scm\tools\parsers\grammar\ANTLR\java.g:1235: warning:lexical nondeterminism between alts 1 and 2 of block upon
> > D:\scm\tools\parsers\grammar\ANTLR\java.g:1235:     k==1:'/'
> > D:\scm\tools\parsers\grammar\ANTLR\java.g:1235:     k==2:'/'
> > D:\scm\tools\parsers\grammar\ANTLR\java.g:1235:     k==3:'\u0003'..'\t','\u000b','\u000c','\u000e'..'\uffff'
> > D:\scm\tools\parsers\grammar\ANTLR\java.g:1235:     k==4:<end-of-token>,'\u0003'..'\t','\u000b','\u000c','\u000e'..'\uffff'
> 
> Don't worry too much about warnings like these ;) Read the source for
> what antlr generated for the rule and it often becomes obvious if the
> parser/lexer will do the right thing. (and it helps in getting a feel
> for things)
> 
> Cheers,
> 
> Ric
> -- 
> -----+++++*****************************************************+++++++++-------
>     ---- Ric Klaren ----- j.klaren at u... ----- +31 53 4893722  ----
> -----+++++*****************************************************+++++++++-------
>  Time what is time - I wish I knew how to tell You why - It hurts to know -
>           Aren't we machines - Time what is time - Unlock the door
>                - And see the truth - Then time is time again
>                 From: 'Time what is Time' by Blind Guardian



 