[antlr-interest] gUnit StringIndexOutOfBoundsException, getText() called where state.tokenStartCharIndex is -1

Fri Aug 13 07:47:58 PDT 2010

As far as I can tell, Lexer.nextToken is never called before gUnit
parses the string, so the lexer state is never set and
state.tokenStartCharIndex stays at -1. If, however, I use the string
in a higher-level rule, then the lexer itself calls nextToken and
everything is fine.
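
To make the suspected failure mode concrete, here is a minimal,
self-contained Java sketch. It does not use the ANTLR runtime:
MiniLexer, nextToken(), matchRuleDirectly() and getText() are
hypothetical stand-ins that only mimic the behaviour described above
(a token start index that stays at -1 until nextToken() runs), so
treat it as an illustration of the mechanism rather than as what
gUnit or ANTLR actually do internally.

```java
class LexerStateSketch {

    /** Hypothetical stand-in for a lexer; NOT the ANTLR runtime class. */
    static class MiniLexer {
        private final String input;
        private int tokenStartCharIndex = -1; // assumed "unset" marker
        private int charIndex = 0;

        MiniLexer(String input) {
            this.input = input;
        }

        /** Records where the token starts, then consumes the whole input. */
        void nextToken() {
            tokenStartCharIndex = charIndex;
            charIndex = input.length();
        }

        /** Consumes the input without recording a token start. */
        void matchRuleDirectly() {
            charIndex = input.length();
        }

        /** Returns the text between the token start and the current index. */
        String getText() {
            return input.substring(tokenStartCharIndex, charIndex);
        }
    }

    /** Normal path: nextToken() sets the start index before getText(). */
    static String viaNextToken(String s) {
        MiniLexer lexer = new MiniLexer(s);
        lexer.nextToken();
        return lexer.getText();
    }

    /** Direct path: the start index is still -1 when getText() runs. */
    static boolean directCallThrows(String s) {
        MiniLexer lexer = new MiniLexer(s);
        lexer.matchRuleDirectly();
        try {
            lexer.getText();
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(viaNextToken("abc"));     // prints "abc"
        System.out.println(directCallThrows("abc")); // prints "true"
    }
}
```

If this is what happens, either the test harness would need to go
through nextToken(), or the rule action would need to avoid getText()
while the start index is still unset.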

I am mostly convinced that this is a problem in gUnit, but I'm not
sure. I would also really like to have my tests working rather than
throwing an exception. I'm just weird that way ;)

Kind regards,
Jasper

On Wed, Aug 11, 2010 at 5:59 PM, Jasper Floor <jasperfloor at gmail.com> wrote:
> I have a DSL which is already in production but I am working on an
> updated version.
>
> We already have quite a few tests which call the parser but I never
> added gUnit tests until this version.
>
> Actually, everything was working fine until my latest change, where I
> started messing with the token stream. I don't think I am doing
> anything complicated. Our other tests also work fine, so I think the
> problem is in gUnit.
>
> Basically I have a rule where I want to remove some elements from the
> input string. This is done with setText(unescape(getText())).
>
> The problem occurs in getText(). state.tokenStartCharIndex is -1,
> and this value is passed to substring, which is called in getText().
> This leads to a StringIndexOutOfBoundsException.
>
> I've included the grammar. There is also a tree grammar after this, but
> that isn't relevant. There are probably more problems with the grammar,
> but at this point I'm only interested in the gUnit errors.
> I've changed some names to remove any company-specific information, but
> the grammar itself is unchanged.
>
> Kind regards,
> Jasper
>
>
> P.S. Is it bad etiquette to post the whole grammar and tree grammar to
> ask for comments?
>