[antlr-interest] Debugging doesn't work with grammar

Fri Jul 6 07:28:52 PDT 2007

On 7/6/07, Johannes Luber <jaluber at gmx.de> wrote:
> Thomas Brandon wrote:
> > On 7/6/07, Johannes Luber <jaluber at gmx.de> wrote:
> >> Hi!
> >>
> > Without the input you used I can't be sure, but it looks like a
> > problem with non-matched "s in actions. To avoid non-LL* issues the
> > grammar uses a fixed lookahead in the NESTED_ACTION rule, so upon
> > seeing a " it decides it must be an ACTION_STRING_LITERAL, if there is
> > no closing " it will swallow input until the end of the file. Looks
> > like it's not finding that matching " for some reason.
>
> I don't want to post the grammar with the ACTION_STRING_LITERAL problem
> here. If you need it, I can send it off-list. I've attached my other test
> grammar, though.
>
> >> Testing a Java version of the file with the same input that caused the
> >> exception, the remote debugger doesn't connect to the parser
> >> (mind you, I've used the normal debug menu here).
> > I was able to debug the attached grammar so not sure what's going on
> > there. Did you get any errors in the console? Is the code generated
> > correctly? Did you try remote debugging? Are you able to debug other
> > grammars?
>
> I didn't get any errors in the console. While looking again to be sure
> that I hadn't overlooked them, I somehow managed to start debugging
> the Java version. Not sure why it didn't work yesterday. But the
> attached grammar still isn't recognized correctly. I end up with "root ->
> action -> MismatchedTokenException" and an entirely red input pane. What
> goes wrong here?
I was not able to replicate this error using your previously attached
grammar. I found one grammar error and a few tree building errors, but
after fixing them it parsed the attached grammar without error.
The grammar error was in the 'rule' rule, where I needed to
change 'ruleAction+' to 'ruleAction*'. The tree building errors were
in ruleScopeSpec, where it didn't like the "'scope'" literal reference
in the rewrite; changing it to "SCOPE" (the token name) fixed that. I
also had to change the 'id+' in the rewrite to 'id*', as the ids are
optional in the rule. Finally, I was getting errors in the
'-> ^(atom ebnfSuffix?)' rewrite in element because atom was returning
empty trees; making the empty rewrites in atom return a PLACEHOLDER
node instead fixed those. Then the attached grammar parsed OK, so I'm
not sure what's going on at your end. That was with ANTLRWorks 1.0.2,
and therefore ANTLR 3.0, so if you are using a newer build you may want
to test with the 3.0 release.
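
As a rough illustration of the four changes described above, here is a
sketch in ANTLR 3 notation. The grammar being fixed is not reproduced in
this message, so every rule body below is an assumption made up for
illustration; only the names rule, ruleAction, ruleScopeSpec, id, atom,
ebnfSuffix, element, SCOPE and PLACEHOLDER come from the text. These are
fragments, not a complete grammar.

```antlr
// Hypothetical fragments only; rule bodies are assumed, not the real grammar.

tokens {
    SCOPE = 'scope';   // named token, so rewrites can say SCOPE
    PLACEHOLDER;       // imaginary token used in place of an empty tree
}

// Change 1: a rule may have no actions at all, so '*' instead of '+'.
rule
    :   id ruleScopeSpec? ruleAction* ':' altList ';'
    ;

// Changes 2 and 3: refer to the token by name in the rewrite, and use
// id* because the id list is optional on the left-hand side.
ruleScopeSpec
    :   SCOPE ACTION? ( id (',' id)* ';' )?
        -> ^(SCOPE ACTION? id*)
    ;

// Change 4: an alternative that used to have an empty rewrite ("->")
// now yields a PLACEHOLDER node, so atom never returns an empty tree.
atom
    :   TOKEN_REF   -> TOKEN_REF
    |   '.'         -> PLACEHOLDER
    ;

// With atom always producing a node, it is safe to use as a tree root.
element
    :   atom ebnfSuffix?
        -> ^(atom ebnfSuffix?)
    ;
```

The point of the last change is that '^(atom ...)' needs a root node;
if atom can rewrite to nothing, there is nothing to hang ebnfSuffix on.
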
> Regarding remote debugging: I tried it with the C# version without much
> success, but not with the Java version, as I don't have a parser for that.
> I've tested Java.g with the same result as with the attached grammar.
> But using the debug option caused it to be truncated. My non-attached
> grammar was also truncated, but resulted in a different parse:
>
> "               -> MismatchedTokenException
>  root -> action -> actionScopeName -> parser
>                 -> MismatchedTokenException"
>
> Looking at this I'm not sure if the culprit is the syntactic predicate
> code, as you suggested.
Sorry, not sure I'm following you here. You mean parsing Java.g from
the ANTLR distribution with your ANTLR3ToRelaxNG grammar resulted in
the same error as parsing BackslashBugTest below? And to test that you
used the ANTLRWorks debug option, which had to truncate the input?
>
> Best regards,
> Johannes Luber
>

Tom.
>
> grammar BackslashBugTest;
>
> data:   CHARACTER*;
>
> CHARACTER
>         :       SINGLE_CHARACTER
>         |       SIMPLE_ESCAPE_SEQUENCE
>         ;
>
> fragment SINGLE_CHARACTER
>         :       ~('\'' | '\\' | NEW_LINE_CHARACTER )
>         ;
>
> fragment SIMPLE_ESCAPE_SEQUENCE
>         :       '\\\''
>         |       '\\\"'
>         |       '\\\\'
>         |       '\\0'
>         |       '\\a'
>         |       '\\b'
>         |       '\\f'
>         |       '\\n'
>         |       '\\r'
>         |       '\\t'
>         |       '\\v'
>         ;
>
> NEW_LINE_CHARACTER
>         :       '\u000D' // Carriage return character
>         |       '\u000A' // Line feed character
>         |       '\u0085' // Next line character
>         |       '\u2028' // Line separator character
>         |       '\u2029' // Paragraph separator character
>         ;
>
>