[antlr-interest] merry xmas and more

Terence Parr parrt at cs.usfca.edu
Sat Dec 25 13:43:47 PST 2004


Howdy and happy holidays to y'all. :)

Just a note to say I have finished the basic automatic error recovery 
stuff for v3.0.  It generates code like what I'm doing now for 2.7.5:

         catch (RecognitionException re) {
             reportError(re);
             recover(FOLLOW_atom);
         }

and then with support sets (with better names, as you can see) at the 
bottom of the file:

     public static final BitSet FOLLOW_stat = new BitSet(new long[]{41346L});
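
For anyone wondering how a follow set packed into a long drives the recovery step, here is a minimal sketch. It assumes bit i of the long stands for token type i and that recovery simply discards tokens until one is in the follow set. It uses java.util.BitSet as a stand-in for ANTLR's own BitSet class, and RecoverySketch, followSet, and this version of recover are made-up names, not the generated code or the runtime.

```java
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-in for a generated parser's recovery step; not ANTLR's runtime code.
class RecoverySketch {
    /** Build a follow set from the packed long[] form seen in the generated code. */
    static BitSet followSet(long... words) {
        return BitSet.valueOf(words); // assumption: bit i set <=> token type i is in the set
    }

    /** Skip tokens from pos until one is in follow (or input ends); return the new position. */
    static int recover(List<Integer> tokenTypes, int pos, BitSet follow) {
        while (pos < tokenTypes.size() && !follow.get(tokenTypes.get(pos))) {
            pos++; // discard a token that cannot follow the rule
        }
        return pos;
    }

    public static void main(String[] args) {
        BitSet followStat = followSet(41346L);
        System.out.println(followStat); // {1, 7, 8, 13, 15}

        List<Integer> input = List.of(4, 5, 9, 7, 4); // made-up token types
        System.out.println(recover(input, 0, followStat)); // 3
    }
}
```

The parser would resume at the returned position, where the next token is one that can legally follow the rule that failed.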

Note that the rule invocation stack is available to you if you want it 
for error messages or context-sensitive error recovery (thanks to Java 
1.4.2's stack trace access).  For example, here is the crappy but 
info-laden error reporting:

[stat, stat, expr, mexpr, atom]: line 5  atom : ( INT | ID | '(' expr 
')' | "maxint" ); state 0 (decision=5) no viable alt; token=[;/<5>,4:8]
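
A rough sketch of how a rule stack like [stat, expr, atom] can be pulled out of a Java stack trace, assuming each rule is a method on the parser class. RuleStackSketch and ToyParser are hypothetical names for illustration and are not part of the ANTLR runtime.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class RuleStackSketch {
    /** Collect method names of frames that belong to parserClassName, outermost rule first. */
    static List<String> ruleStack(Throwable t, String parserClassName) {
        List<String> rules = new ArrayList<>();
        for (StackTraceElement frame : t.getStackTrace()) {
            if (frame.getClassName().equals(parserClassName)) {
                rules.add(frame.getMethodName());
            }
        }
        Collections.reverse(rules); // stack traces list the innermost frame first
        return rules;
    }

    public static void main(String[] args) {
        try {
            new ToyParser().stat();
        } catch (Exception e) {
            System.out.println(ruleStack(e, ToyParser.class.getName())); // [stat, expr, atom]
        }
    }
}

// Hypothetical hand-written parser: one method per rule, as in recursive descent.
class ToyParser {
    void stat() throws Exception { expr(); }
    void expr() throws Exception { atom(); }
    void atom() throws Exception { throw new Exception("no viable alt"); }
}
```

Filtering on the class name keeps the caller's own frames (such as main) out of the rule list.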

It prints stack trace, the grammar location, state/decision information 
and current token.  I have isolated the error reporting mechanism from 
the parser so that strings are not computed and passed around.  Now, 
the exception objects just track info and reportError can compute error 
messages in whatever language. [ANTLR itself will automatically be 
generating localized error messages if an error template file exists 
for your language].
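
To make the separation concrete, here is a small sketch of an exception that only carries data, with a reporter that builds the message from a per-language template. The class names, fields, and templates are all made up for illustration; ANTLR's actual error template files may look nothing like this.

```java
import java.text.MessageFormat;
import java.util.Map;

// Hypothetical exception: carries data only, never a preformatted message.
class MismatchedTokenSketch extends Exception {
    final int line;
    final String expected;
    final String found;

    MismatchedTokenSketch(int line, String expected, String found) {
        this.line = line;
        this.expected = expected;
        this.found = found;
    }
}

class ReporterSketch {
    // Made-up templates keyed by language code; a real tool would load them from a file.
    static final Map<String, String> TEMPLATES = Map.of(
        "en", "line {0}: expected {1} but found {2}",
        "de", "Zeile {0}: {1} erwartet, aber {2} gefunden");

    /** Build the message only at report time, in the requested language. */
    static String report(MismatchedTokenSketch e, String lang) {
        return MessageFormat.format(TEMPLATES.get(lang),
                String.valueOf(e.line), e.expected, e.found);
    }

    public static void main(String[] args) {
        MismatchedTokenSketch e = new MismatchedTokenSketch(5, "INT", ";");
        System.out.println(report(e, "en")); // line 5: expected INT but found ;
        System.out.println(report(e, "de")); // Zeile 5: INT erwartet, aber ; gefunden
    }
}
```

Nothing is formatted until report is called, so the same exception object can be rendered in any language that has a template.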

Next step is to make the tool a little more robust.  Then DFA 
optimization.  Then tree construction.   Then tree parsers. :)

Hooray!  I'll be damned if this sucker isn't starting to look like a 
parser generator. :)

Ter
PS  runtime package has only about 1000 lines of Java code :)
PPS for those interested, I include the comment / class def for 
RecognitionException here:

/** The root of the ANTLR exception hierarchy.
  *
  *  To avoid English-only error messages and to generally make things
  *  as flexible as possible, these exceptions are not created with
  *  strings, but rather the information necessary to generate an error.
  *  Then the various reporting methods in Parser and Lexer can be
  *  overridden to generate a localized error message.  For example,
  *  MismatchedToken exceptions are built with the expected token types.
  *  So, don't expect getMessage() to return anything.
  *
  *  Note that as of Java 1.4, you can access the stack trace, which
  *  means that you can compute the complete trace of rules from the
  *  start symbol.  This gives you considerable context information with
  *  which to generate useful error messages.
  *
  *  ANTLR generates code that throws exceptions upon recognition error
  *  and also generates code to catch these exceptions in each rule.  If
  *  you want to quit upon first error, you can turn off the automatic
  *  error handling mechanism.  If you want an ANTLR-generated recognizer
  *  to bail out after another kind of error, then just throw an
  *  exception not under this hierarchy.
  *
  *  In general, the recognition exceptions track where in a grammar a
  *  problem occurred and/or what was the expected input.  The parser
  *  knows its state (such as current input symbol and line info) so
  *  this information is left out of the exception and left to the
  *  reporting method(s) to fill in.  You might want to have an error
  *  report an entire line of input not just a single token, for example.
  *  Better to just say the recognizer had a problem and then let the
  *  parser report where the heck it was.
  */
public class RecognitionException extends Exception {
}
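
The "throw an exception not under this hierarchy" point can be sketched as follows. This is only an illustration with invented names (RecognitionErrorSketch, BailOutSketch, RuleSketch). It assumes the per-rule catch blocks catch only the recognition hierarchy, so anything else, here an unchecked exception, propagates straight out of the parser.

```java
// Hypothetical stand-in for the recognition exception hierarchy.
class RecognitionErrorSketch extends Exception {}

// Deliberately outside that hierarchy, so per-rule catch blocks never see it.
class BailOutSketch extends RuntimeException {}

class RuleSketch {
    int reported = 0;

    /** Mimics a generated rule method: recognition errors are caught, others escape. */
    void rule(boolean fatal) {
        try {
            if (fatal) {
                throw new BailOutSketch();
            }
            throw new RecognitionErrorSketch();
        } catch (RecognitionErrorSketch re) {
            reported++; // report and recover, then keep parsing
        }
    }

    public static void main(String[] args) {
        RuleSketch parser = new RuleSketch();
        parser.rule(false);                  // handled inside the rule
        System.out.println(parser.reported); // 1
        try {
            parser.rule(true);               // escapes the rule's catch block
        } catch (BailOutSketch b) {
            System.out.println("bailed out");
        }
    }
}
```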
--
CS Professor & Grad Director, University of San Francisco
Creator, ANTLR Parser Generator, http://www.antlr.org
Cofounder, http://www.jguru.com
Cofounder, http://www.knowspam.net enjoy email again!




