[antlr-interest] Stopping parser and lexer at first error

Corrado Campisano corrado.campisano at gmail.com
Fri Apr 2 08:47:44 PDT 2010


Hi,

I think this could apply to lexer-level errors caused by unexpected characters,
but not to unexpected character sequences.

I mean (this is not the case in my grammar, but it could happen): what if I
want to accept tokens like these:
 - MyClass
 - MYCONSTANT
 - myVariable
and treat the following ones as errors?
 - MyCLass
 - MYCONstant
 - myVAriable
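
To make the question concrete, here is a minimal sketch in plain Java (no
ANTLR involved) of the check I have in mind. The class name TokenShapes, the
Kind enum and the three regular expressions are only assumptions for
illustration; they are not taken from my grammar. A word that fits none of
the three shapes is classified as ERROR instead of being silently accepted.

```java
import java.util.regex.Pattern;

public class TokenShapes {
    /** Hypothetical token kinds, named only for this example. */
    public enum Kind { CLASS_NAME, CONSTANT, VARIABLE, ERROR }

    // One or more capitalized chunks: MyClass, Parser
    private static final Pattern CLASS_NAME =
            Pattern.compile("([A-Z][a-z0-9]+)+");
    // Upper case only: MYCONSTANT, MAX_CARD
    private static final Pattern CONSTANT =
            Pattern.compile("[A-Z][A-Z0-9_]*");
    // Lower-case start, then optional capitalized chunks: myVariable
    private static final Pattern VARIABLE =
            Pattern.compile("[a-z][a-z0-9]*([A-Z][a-z0-9]+)*");

    /** Classifies a whole word; anything that fits no shape is an ERROR. */
    public static Kind classify(String word) {
        if (CLASS_NAME.matcher(word).matches()) return Kind.CLASS_NAME;
        if (CONSTANT.matcher(word).matches()) return Kind.CONSTANT;
        if (VARIABLE.matcher(word).matches()) return Kind.VARIABLE;
        return Kind.ERROR; // wrong character sequence, not a wrong character
    }

    public static void main(String[] args) {
        String[] words = {
            "MyClass", "MYCONSTANT", "myVariable",
            "MyCLass", "MYCONstant", "myVAriable"
        };
        for (String w : words) {
            System.out.println(w + " -> " + classify(w));
        }
    }
}
```

Every character in the three rejected words is legal on its own; only the
sequence is wrong. Whether such a check belongs in the lexer rules
themselves or in a later stage is exactly what I am asking.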

Does the "you should" come from some best practice?

I believe the lexer should raise exceptions for errors found during lexical
analysis, and the parser for errors found during syntactic analysis. Am I wrong?
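
Just to illustrate the separation I mean, here is a toy sketch in plain
Java. It is hand-written and does not use the ANTLR runtime; the names
FailFastPhases, LexicalException and SyntaxException, and the toy rule
'{' WORD+ '}', are assumptions made up for the example. Each phase stops at
its first error and raises its own exception type.

```java
import java.util.ArrayList;
import java.util.List;

public class FailFastPhases {
    /** Raised by the lexical phase only (hypothetical type). */
    public static class LexicalException extends RuntimeException {
        public final int position;
        public LexicalException(int position, char c) {
            super("unexpected character '" + c + "' at " + position);
            this.position = position;
        }
    }

    /** Raised by the syntactic phase only (hypothetical type). */
    public static class SyntaxException extends RuntimeException {
        public SyntaxException(String message) { super(message); }
    }

    /** Lexical analysis: braces and alphabetic words; whitespace is skipped. */
    public static List<String> lex(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '{' || c == '}') {
                tokens.add(String.valueOf(c));
                i++;
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < input.length() && Character.isLetter(input.charAt(i))) i++;
                tokens.add(input.substring(start, i));
            } else {
                throw new LexicalException(i, c); // stop at the first bad character
            }
        }
        return tokens;
    }

    /** Syntactic analysis for the toy rule '{' WORD+ '}' EOF; returns the word count. */
    public static int parse(List<String> tokens) {
        int i = 0;
        if (i >= tokens.size() || !tokens.get(i).equals("{"))
            throw new SyntaxException("expected '{'");
        i++;
        int words = 0;
        while (i < tokens.size() && Character.isLetter(tokens.get(i).charAt(0))) {
            words++;
            i++;
        }
        if (words == 0) throw new SyntaxException("expected at least one word");
        if (i >= tokens.size() || !tokens.get(i).equals("}"))
            throw new SyntaxException("expected '}'");
        i++;
        if (i < tokens.size())
            throw new SyntaxException("unexpected token '" + tokens.get(i) + "' after '}'");
        return words;
    }

    /** Runs both phases and reports which one failed, if any. */
    public static String check(String input) {
        try {
            return "OK, " + parse(lex(input)) + " words";
        } catch (LexicalException e) {
            return "LEXER: " + e.getMessage();
        } catch (SyntaxException e) {
            return "PARSER: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(check("{CORRADO PIPPO ;feee}"));
        System.out.println(check("{CORRADO PIPPO feee} dhert"));
        System.out.println(check("{CORRADO PIPPO feee}"));
    }
}
```

With the two test strings from my earlier message, the first one is
reported by the lexical phase (the ';') and the second by the syntactic
phase (the trailing 'dhert'), so the caller can tell the two kinds of error
apart by exception type.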


[image: http://wiki.codeblocks.org/images/a/a9/Parser_Flow.gif]


Regards,
Corrado


2010/4/2 Jim Idle <jimi at temporal-wave.com>

> You should program your lexer such that it does not throw any errors.
> Program for the common mistakes (such as un-terminated "string) and have a
> catch all rule for unknown characters.
>
> Jim
>
>
>
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org]
> > On Behalf Of Corrado Campisano
> > Sent: Friday, April 02, 2010 7:59 AM
> > To: antlr-interest at antlr.org
> > Subject: Re: [antlr-interest] Stopping parser and lexer at first error
> >
> > Hi all,
> >
> > I set up an ANTLR-maven archetype with a grammar providing the override
> > for the 'always resume' behaviour.
> >
> > You can find details on my website
> > <http://www.servicemix.eu/index.php?option=com_content&view=article&id=14>;
> > maybe it's worth checking it and adding a notice about this archetype to
> > this ANTLR wiki page
> > <http://www.antlr.org/wiki/display/ANTLR3/Building+ANTLR+Projects+with+Maven>
> > and/or to the ANTLR v3 Maven plugin page
> > <http://www.antlr.org/antlr3-maven-plugin/index.html>.
> >
> >
> > Regards,
> > Corrado.
> >
> >
> > 2010/3/10 Corrado Campisano <corrado.campisano at gmail.com>
> >
> > > Hi all,
> > >
> > > I needed to catch any syntax error (letting the lexer insert/delete
> > > chars, or letting the parser keep parsing with only a System.err
> > > message, could be very dangerous for my application), so I took a look
> > > at the reference (which reports information that is no longer valid)
> > > and on the internet, and I found several hints and articles:
> > >
> > > Why the generated parser code tolerates illegal expression?
> > > <http://www.antlr.org/wiki/pages/viewpage.action?pageId=4554943>
> > > How can I make the lexer exit upon first lexical error?
> > > <http://www.antlr.org/wiki/pages/viewpage.action?pageId=5341217>
> > > http://www.antlr.org/wiki/display/ANTLR3/Custom+Syntax+Error+Recovery
> > > [antlr-interest] I want to throw an exception and stop parse, please!
> > > <http://www.antlr.org/pipermail/antlr-interest/2009-May/034605.html>
> > >
> > > It looks like I found a way to do this; maybe it's worth publishing it
> > > on the wiki, once validated.
> > >
> > >
> > > I just added the following overrides to my grammar (attached):
> > >
> > > @parser::members
> > > {
> > >     public class ParserException extends RuntimeException {
> > >         Object objCurrentInputSymbol = null;
> > >
> > >         public ParserException(Object oCurrentInputSymbol) {
> > >             this.objCurrentInputSymbol = oCurrentInputSymbol;
> > >         }
> > >     }
> > >
> > >     protected Object recoverFromMismatchedToken(IntStream input, int ttype, BitSet follow)
> > >             throws RecognitionException {
> > >         System.out.println("PARSER : this.getCurrentInputSymbol(input).toString() : "
> > >                 + this.getCurrentInputSymbol(input).toString());
> > >         System.out.println("PARSER : this.failed() : " + this.failed());
> > >         System.out.println("PARSER : this.getNumberOfSyntaxErrors() : "
> > >                 + this.getNumberOfSyntaxErrors());
> > >         throw new ParserException(this.getCurrentInputSymbol(input));
> > >     }
> > > }
> > >
> > > @lexer::members
> > > {
> > >     public class LexerException extends RuntimeException {
> > >         RecognitionException recognitionException = null;
> > >         String strErrorHeader = null;
> > >         String strErrorMessage = null;
> > >
> > >         public LexerException(RecognitionException recExc, String sHead, String sMsg) {
> > >             this.recognitionException = recExc;
> > >             this.strErrorHeader = sHead;
> > >             this.strErrorMessage = sMsg;
> > >
> > >             System.out.println("LEXER : ErrorHeader : " + sHead);
> > >             System.out.println("LEXER : ErrorMessage : " + sMsg);
> > >             System.out.println("LEXER : RecognitionException : "
> > >                     + this.recognitionException.toString());
> > >         }
> > >     }
> > >
> > >     public void reportError(RecognitionException recExc) {
> > >         throw new LexerException(recExc, this.getErrorHeader(recExc),
> > >                 getErrorMessage(recExc, this.getTokenNames()));
> > >     }
> > > }
> > >
> > >
> > > Then I tested it with a simple class:
> > >     public static void main(String[] args) {
> > >         testLexerError();
> > >         testParserError();
> > >     }
> > >     private static void testLexerError() {
> > >         String strDlToParse = "{CORRADO PIPPO ;feee}";
> > >         System.out.println("TESTING LEXER with : " + strDlToParse);
> > >         testError(strDlToParse);
> > >     }
> > >     private static void testParserError() {
> > >         String strDlToParse = "{CORRADO PIPPO feee} dhert";
> > >         System.out.println("TESTING PARSER with : " + strDlToParse);
> > >         testError(strDlToParse);
> > >     }
> > >     private static void testError(String strDlToParse) {
> > >         CommonTree tree=null;
> > >         String strError = null;
> > >
> > >         ANTLRStringStream input = new org.antlr.runtime.ANTLRStringStream(strDlToParse);
> > >         Dl2OwlJavaBLexer lexer = new Dl2OwlJavaBLexer(input);
> > >         TokenStream tokens = new org.antlr.runtime.CommonTokenStream(lexer);
> > >         Dl2OwlJavaBParser parser = new Dl2OwlJavaBParser(tokens);
> > >
> > >         try {
> > >             // this may raise an exception
> > >             // TODO : check why NO EXCEPTION is raised with error
> > >             // "line 1:9 no viable alternative at character ';'"
> > >             // on inputs like "{CORRADO ;}"
> > >             eu.servicemix.dl2owl.Dl2OwlJavaBParser.axiom_return ret = parser.axiom();
> > >
> > >             // TODO : check if this will be executed if no exception is raised
> > >             tree = (CommonTree) ret.getTree();
> > >
> > >             printTreeHelper(tree);
> > >
> > >         } catch (RecognitionException e) {
> > >
> > >             System.out.println(e.toString());
> > >             e.printStackTrace();
> > >
> > >         } catch (RuntimeException e) {
> > >
> > >             System.out.println(e.toString());
> > >             e.printStackTrace();
> > >         }
> > >     }
> > >
> > >
> > > The output looks ok, I wonder whether the whole 'trick' is too...
> > >
> > > TESTING LEXER with : {CORRADO PIPPO *;*feee}
> > > LEXER : ErrorHeader : line 1:15
> > > LEXER : ErrorMessage : no viable alternative at character ';'
> > > LEXER : RecognitionException : NoViableAltException(';'@[1:1: Tokens : ( T__37 | T__38 | T__39 | T__40 | HAS_VALUE | ALL_VALUES | SOME_VALUES | DOT | HAS_CARD | MIN_CARD | MAX_CARD | NOT | AND | OR | URI_REF | INT_VALUE | WS | CTRL_CHAR );])
> > > eu.servicemix.dl2owl.Dl2OwlJavaBLexer$LexerException
> > > eu.servicemix.dl2owl.Dl2OwlJavaBLexer$LexerException
> > >     at eu.servicemix.dl2owl.Dl2OwlJavaBLexer.reportError(Dl2OwlJavaBLexer.java:69)
> > >     at org.antlr.runtime.Lexer.nextToken(Lexer.java:94)
> > >     at org.antlr.runtime.CommonTokenStream.fillBuffer(CommonTokenStream.java:119)
> > >     at org.antlr.runtime.CommonTokenStream.LT(CommonTokenStream.java:238)
> > >     at eu.servicemix.dl2owl.Dl2OwlJavaBParser.axiom(Dl2OwlJavaBParser.java:110)
> > >     at eu.servicemix.dl2owl.CommonTreeHelper.testError(CommonTreeHelper.java:140)
> > >     at eu.servicemix.dl2owl.CommonTreeHelper.testLexerError(CommonTreeHelper.java:121)
> > >     at eu.servicemix.dl2owl.CommonTreeHelper.main(CommonTreeHelper.java:113)
> > >
> > > TESTING PARSER with : {CORRADO PIPPO feee} *dhert*
> > > PARSER : this.getCurrentInputSymbol(input).toString() : [@8,21:25='dhert',<7>,1:21]
> > > PARSER : this.failed() : false
> > > PARSER : this.getNumberOfSyntaxErrors() : 0
> > > eu.servicemix.dl2owl.Dl2OwlJavaBParser$ParserException
> > > eu.servicemix.dl2owl.Dl2OwlJavaBParser$ParserException
> > >     at eu.servicemix.dl2owl.Dl2OwlJavaBParser.recoverFromMismatchedToken(Dl2OwlJavaBParser.java:97)
> > >     at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
> > >     at eu.servicemix.dl2owl.Dl2OwlJavaBParser.axiom(Dl2OwlJavaBParser.java:232)
> > >     at eu.servicemix.dl2owl.CommonTreeHelper.testError(CommonTreeHelper.java:140)
> > >     at eu.servicemix.dl2owl.CommonTreeHelper.testParserError(CommonTreeHelper.java:126)
> > >     at eu.servicemix.dl2owl.CommonTreeHelper.main(CommonTreeHelper.java:114)
> > >
> > >
> > > Any comment really appreciated!!
> > >
> > > Corrado
> > >
> > >
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
> > email-address
>

