[antlr-interest] ArrayIndexOutOfBoundsException

Steve Bennett stevagewp at gmail.com
Sat Feb 2 02:45:11 PST 2008


On 2/2/08, Gavin Lambert <antlr at mirality.co.nz> wrote:
> I thought you said that when you removed the predicate the problem
> went away?

Right:

With predicate that returns true: succeeds
With no predicate at all: succeeds
With predicate that fails (including a literal "false"): fails

>
> Anyway, to requote the original full exception you posted earlier:
>  >Exception in thread "main"
>  >java.lang.ArrayIndexOutOfBoundsException: -1
>  >      at org.antlr.runtime.DFA.predict(DFA.java:44)
>  >      at mediawiki1Parser.inline_text(mediawiki1Parser.java:13618)
>  >      at mediawiki1Parser.header_simple_text(mediawiki1Parser.java:16669)
>  >      at mediawiki1Parser.header3(mediawiki1Parser.java:5423)
>  >      at mediawiki1Parser.synpred19_fragment(mediawiki1Parser.java:19872)
>  >      at mediawiki1Parser.synpred19(mediawiki1Parser.java:20998)
>  >      at mediawiki1Parser.headerline(mediawiki1Parser.java:4238)
>  >      at mediawiki1Parser.synpred3_fragment(mediawiki1Parser.java:19604)
>  >      at mediawiki1Parser.synpred3(mediawiki1Parser.java:21158)
>  >      at mediawiki1Parser.line(mediawiki1Parser.java:1295)
>  >      at mediawiki1Parser.article(mediawiki1Parser.java:915)
>  >      at mediawiki1Parser.start(mediawiki1Parser.java:299)
>  >      at __Test__.main(__Test__.java:14)
>
> So, it's trying to parse a "line".  In doing that, it's calling
> ahead through a syntactic predicate which ends up involving
> "headerline", which in turn has another syntactic predicate
> leading to "header3", then "header_simple_text", and finally
> "inline_text".  Presumably the problem lies somewhere in that
> chain of rules -- and not necessarily at the end.  (I know you've
> changed things since this particular exception, so you're getting
> it somewhere else now, but you can follow the same line of
> reasoning on the full trace of the exception you're getting
> now.)  The main thing is that you can tell from the trace which
> rules are "locked" and which ones are "speculative" based on
> whether they appear before or after a synpred call, and from that
> you might be able to tell whether it's going down the wrong path.

Hmm. From the parse tree I can see that it's still in speculative mode
(blue). Here's the simplest sentence which exhibits the problem:

[[a|b]]

The point is that ] by itself is a literal square bracket, but in
certain contexts, like the b here, a ] isn't allowed. I use a flag to
rule it out. This sentence generates the following traceback:

Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -1
	at org.antlr.runtime.DFA.predict(DFA.java:44)
	at mediawiki3Parser.simple_inline_elem(mediawiki3Parser.java:15484)
	at mediawiki3Parser.synpred38_fragment(mediawiki3Parser.java:21941)
	at mediawiki3Parser.synpred38(mediawiki3Parser.java:22519)
	at mediawiki3Parser.simple_text(mediawiki3Parser.java:15077)
	at mediawiki3Parser.link_caption(mediawiki3Parser.java:7210)
	at mediawiki3Parser.internal_link(mediawiki3Parser.java:6964)
	at mediawiki3Parser.synpred33_fragment(mediawiki3Parser.java:21856)
	at mediawiki3Parser.synpred33(mediawiki3Parser.java:22455)
	at mediawiki3Parser.inline_text(mediawiki3Parser.java:14386)
	at mediawiki3Parser.paragraph(mediawiki3Parser.java:14168)
	at mediawiki3Parser.line(mediawiki3Parser.java:1565)
	at mediawiki3Parser.article(mediawiki3Parser.java:947)
	at mediawiki3Parser.start(mediawiki3Parser.java:330)
	at __Test__.main(__Test__.java:14)
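
The reading suggested in the quoted advice (frames above a synpred call
are speculative, frames below it are committed) can be applied to this
trace mechanically. The following is only a rough sketch of that idea:
the class SynpredTrace and its methods are made up for illustration and
are not part of ANTLR, and it assumes that every generated predicate
method name starts with "synpred".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SynpredTrace {
    // Extracts "simple_text" from a line such as
    // "at mediawiki3Parser.simple_text(mediawiki3Parser.java:15077)".
    static String methodName(String traceLine) {
        String s = traceLine.trim();
        if (s.startsWith("at ")) s = s.substring(3);
        int paren = s.indexOf('(');
        if (paren >= 0) s = s.substring(0, paren);
        return s.substring(s.lastIndexOf('.') + 1);
    }

    // Trace lines are innermost first, as Java prints them. Any rule that
    // sits above the outermost synpredN frame was entered while backtracking.
    static List<String> speculativeRules(List<String> traceLines, String parserClass) {
        List<String> methods = new ArrayList<>();
        int outermostSynpred = -1;
        for (String line : traceLines) {
            if (!line.contains(parserClass + ".")) continue; // skip runtime and test frames
            String m = methodName(line);
            if (m.startsWith("synpred")) outermostSynpred = methods.size();
            methods.add(m);
        }
        List<String> speculative = new ArrayList<>();
        for (int i = 0; i < outermostSynpred; i++) {
            if (!methods.get(i).startsWith("synpred")) speculative.add(methods.get(i));
        }
        return speculative;
    }

    public static void main(String[] args) {
        List<String> trace = Arrays.asList(
            "at org.antlr.runtime.DFA.predict(DFA.java:44)",
            "at mediawiki3Parser.simple_inline_elem(mediawiki3Parser.java:15484)",
            "at mediawiki3Parser.synpred38_fragment(mediawiki3Parser.java:21941)",
            "at mediawiki3Parser.synpred38(mediawiki3Parser.java:22519)",
            "at mediawiki3Parser.simple_text(mediawiki3Parser.java:15077)",
            "at mediawiki3Parser.link_caption(mediawiki3Parser.java:7210)",
            "at mediawiki3Parser.internal_link(mediawiki3Parser.java:6964)",
            "at mediawiki3Parser.synpred33_fragment(mediawiki3Parser.java:21856)",
            "at mediawiki3Parser.synpred33(mediawiki3Parser.java:22455)",
            "at mediawiki3Parser.inline_text(mediawiki3Parser.java:14386)",
            "at mediawiki3Parser.paragraph(mediawiki3Parser.java:14168)",
            "at mediawiki3Parser.line(mediawiki3Parser.java:1565)",
            "at mediawiki3Parser.article(mediawiki3Parser.java:947)",
            "at mediawiki3Parser.start(mediawiki3Parser.java:330)",
            "at __Test__.main(__Test__.java:14)");
        // prints [simple_inline_elem, simple_text, link_caption, internal_link]
        System.out.println(speculativeRules(trace, "mediawiki3Parser"));
    }
}
```

By this reading, everything from internal_link inwards is speculative
(inside synpred33), and simple_inline_elem is speculative twice over
(inside synpred38 as well), while inline_text and the rules below it
are committed.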

So it's normal that it should enter simple_inline_elem
speculatively for the ] it encounters, and then it should fail, as ] is
not a valid "simple element" in this context. To do this, it calls a
few more rules: really_basic_elem, punctuation, and finally
literal_right_bracket.

In other words, the traceback seems reasonable, and corresponds to the
graphical parse tree. But instead of then rejecting the
'simple_inline_elem' rule, colouring the node red and moving on, it
dies.

Frustratingly, when I try to look up the traceback in the generated
code, the line numbers don't seem to match. For example, synpred38 is
labelled as being on line 23214, but it should be 18939-18951. In the
generated code I do see the failed call to predict. Is there some
debugging code I could put in that might shed light? The generated
code around that call looks like this:

...
            // C:\\antlr\\mediawiki3.g:493:5: ( ( accidental_magic_link )=> accidental_magic_link
            //   | ( ( punctuation_before_nbsp )=> punctuation_before_nbsp )
            //   | ( APOSTROPHES )=> bold_and_italics
            //   | ( ( nbsp_before_punctuation )=> nbsp_before_punctuation )+
            //   | angle_tag
            //   | ( ( html_entity )=> html_entity )
            //   | really_basic_elem )
            int alt69=7;
            alt69 = dfa69.predict(input);
            switch (alt69) {
                case 1 :
...
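
One kind of debugging code that might help here is a small wrapper
around the predict call that reports where the parser was when the
exception was thrown, and then rethrows it. This is only a sketch:
PredictProbe and probe are hypothetical names, not part of the ANTLR
runtime, and the block below uses a stand-in for the real prediction
so that it runs without ANTLR on the classpath.

```java
import java.util.function.IntSupplier;

class PredictProbe {
    // Runs one prediction. If it dies with ArrayIndexOutOfBoundsException,
    // prints the decision and the position in the input, then rethrows so
    // the original stack trace is still available.
    static int probe(String decision, int tokenIndex, String lookahead, IntSupplier predict) {
        try {
            return predict.getAsInt();
        } catch (ArrayIndexOutOfBoundsException e) {
            System.err.println("decision " + decision
                + " failed at token index " + tokenIndex
                + ", lookahead " + lookahead
                + ": " + e.getMessage());
            throw e;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a prediction that succeeds and picks alternative 7.
        System.out.println(probe("69", 3, "]", () -> 7));

        // Stand-in for a prediction that indexes a table with -1.
        try {
            probe("69", 3, "]", () -> { int[] table = new int[1]; return table[-1]; });
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("rethrown as expected");
        }
    }
}
```

In the generated parser the call would then look something like
alt69 = PredictProbe.probe("69", input.index(), String.valueOf(input.LT(1)), () -> dfa69.predict(input));
assuming the ANTLR 3 token stream's index() and LT(1) methods and a
Java version with lambdas. Any such edit is lost the next time the
parser is regenerated from the grammar.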



> You actually made two changes here -- you changed the character it
> was matching and you changed it to a gated predicate.  Have you
> tried the original version but as a gated predicate instead?  (It
> looks like it ought to be a gated predicate anyway.)

Whoops, that was a typo in my email. I always use the gated
"{pred}?=> tokens" form, as the plain "{pred}? tokens" form tends to
abort the pass if the predicate fails.
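
The difference between the two forms can be modelled roughly as
follows. This is a simplified illustration and not the code ANTLR
generates: the class, the method names and the bracketAllowed flag are
all hypothetical, and it assumes the alternatives can be told apart on
syntax alone, so that the plain predicate is only checked after an
alternative has been chosen.

```java
class PredicateKinds {
    // Stand-in for the exception raised when a plain predicate fails.
    static class FailedPredicate extends RuntimeException {
        FailedPredicate(String predicate) {
            super("failed predicate: " + predicate);
        }
    }

    // Gated form, {bracketAllowed}?=> ']' : when the flag is false the
    // alternative is hidden from prediction. The result 0 stands for
    // "no viable alternative", which the caller can treat as an
    // ordinary mismatch.
    static int predictGated(char lookahead, boolean bracketAllowed) {
        if (lookahead == ']') return bracketAllowed ? 1 : 0;
        return 2; // any other character: ordinary text
    }

    // Plain form, {bracketAllowed}? ']' : the alternative is chosen on
    // syntax alone and the flag is checked afterwards, so a false flag
    // surfaces as an exception rather than as a mismatch.
    static int matchValidating(char lookahead, boolean bracketAllowed) {
        int alt = (lookahead == ']') ? 1 : 2;
        if (alt == 1 && !bracketAllowed) throw new FailedPredicate("bracketAllowed");
        return alt;
    }

    public static void main(String[] args) {
        System.out.println(predictGated(']', true));   // prints 1
        System.out.println(predictGated(']', false));  // prints 0
        try {
            matchValidating(']', false);
        } catch (FailedPredicate e) {
            System.out.println(e.getMessage());        // prints failed predicate: bracketAllowed
        }
    }
}
```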

Steve

