[antlr-interest] Tell ANTLR to ignore parsing errors?

Daniels, Troy (US SSA) troy.daniels at baesystems.com
Thu Sep 9 14:18:25 PDT 2010


The naïve way to do that, a single lexer token with a space in it, creates problems:

CREATE_USER: 'CREATE USER';

Only one of the next three lines will match that, but you almost certainly want all of them to match.

CREATE USER
CREATE  USER
CREATE<tab>USER

You could write it as 

CREATE_USER: 'CREATE' WS+ 'USER';
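
To see the difference outside ANTLR, here is a minimal Python sketch that uses regular expressions as a stand-in for the two lexer rules. It only approximates them (\s+ plays the role of WS+), and the names literal and flexible are made up for the illustration.

```python
import re

# The three inputs from the message; "\t" stands in for <tab>.
inputs = ["CREATE USER", "CREATE  USER", "CREATE\tUSER"]

# Fixed literal: exactly one space is allowed between the two words.
literal = re.compile(r"CREATE USER")
# Rough equivalent of CREATE WS+ USER: one or more whitespace characters.
flexible = re.compile(r"CREATE\s+USER")

literal_hits = [bool(literal.fullmatch(s)) for s in inputs]
flexible_hits = [bool(flexible.fullmatch(s)) for s in inputs]

print(literal_hits)   # [True, False, False]
print(flexible_hits)  # [True, True, True]
```

Only the first input matches the fixed literal; all three match once the whitespace between the words is allowed to vary.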

Even so, when the user types "CREATE UZER" instead, this will give a lexer error rather than a parser error.  It's much more difficult to provide a meaningful error message from the lexer, since it does not have the context that the parser does.
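
As a rough illustration of that difference in context, the Python sketch below compares the two designs on the input CREATE UZER. It is not ANTLR code: tokenize, parse_create and lex_composite are hypothetical helpers, and the message texts are invented for the example rather than what ANTLR actually prints.

```python
import re

KEYWORDS = {"CREATE", "USER", "TABLE", "TRIGGER"}

def tokenize(text):
    """Split text into (type, text) pairs; whitespace is simply skipped."""
    tokens = []
    for word in re.findall(r"[A-Za-z_][A-Za-z_0-9]*", text):
        upper = word.upper()  # keywords are matched case-insensitively
        tokens.append((upper if upper in KEYWORDS else "ID", word))
    return tokens

def parse_create(tokens):
    """Parser-side check: it knows which tokens may follow CREATE."""
    if not tokens or tokens[0][0] != "CREATE":
        return "expected CREATE"
    if len(tokens) < 2:
        return "after CREATE: expected USER, TABLE or TRIGGER, found end of input"
    kind, text = tokens[1]
    if kind not in ("USER", "TABLE", "TRIGGER"):
        return f"after CREATE: expected USER, TABLE or TRIGGER, found '{text}'"
    return None  # no error

def lex_composite(text):
    """Lexer-side check for a single CREATE_USER token: no context to report."""
    if re.match(r"CREATE\s+USER\b", text, re.IGNORECASE):
        return None
    return "no viable token at offset 0"

parser_msg = parse_create(tokenize("CREATE UZER u1"))
lexer_msg = lex_composite("CREATE UZER u1")
print(parser_msg)
print(lexer_msg)
```

With separate CREATE and USER tokens, the check that fails is the one that knows what may follow CREATE, so it can name both the expected keywords and the offending word. The composite-token check can only say that nothing matched.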

If there are situations where CREATE is a valid user-specified identifier, you can't handle that case if you have CREATE_USER as a lexer token.
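
A small Python sketch of that last point, assuming a hypothetical dialect in which create may be a column name and user its alias. The lex function and its rule table are invented for the illustration and only mimic a lexer that tries its rules in order.

```python
import re

def lex(text, composite):
    """Return token names for text; `composite` adds a single CREATE_USER token."""
    rules = [("CREATE_USER", r"CREATE\s+USER\b")] if composite else []
    rules += [
        ("SELECT", r"SELECT\b"),
        ("FROM", r"FROM\b"),
        ("CREATE", r"CREATE\b"),
        ("USER", r"USER\b"),
        ("ID", r"[A-Za-z_][A-Za-z_0-9]*"),  # any other word
    ]
    # Alternatives are tried in order, so keywords win over ID.
    token_re = re.compile("|".join(f"(?P<{n}>{p})" for n, p in rules), re.IGNORECASE)
    space_re = re.compile(r"\s+")
    names, pos = [], 0
    while pos < len(text):
        gap = space_re.match(text, pos)
        if gap:  # whitespace is skipped, never turned into a token
            pos = gap.end()
            continue
        m = token_re.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        names.append(m.lastgroup)
        pos = m.end()
    return names

query = "SELECT create user FROM t"
with_composite = lex(query, composite=True)
with_separate = lex(query, composite=False)
print(with_composite)  # ['SELECT', 'CREATE_USER', 'FROM', 'ID']
print(with_separate)   # ['SELECT', 'CREATE', 'USER', 'FROM', 'ID']
```

With the composite rule the lexer has already committed to CREATE_USER before the parser sees anything. With separate tokens the parser still has the chance to accept CREATE and USER where identifiers are allowed.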

Troy

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org 
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jim Idle
> Sent: Thursday, September 09, 2010 1:06 PM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Tell ANTLR to ignore parsing errors?
> 
> No - don't make whitespace significant unless the language 
> absolutely makes you do so.
> 
> What you have to do is left factor:
> 
> create
>     : CREATE
>       (
>           cr_table
>         | cr_user
>         | cr_trigger
>       )
>     ;
> 
> cr_table
>     : TABLE .....
> 
> Jim
> 
> > -----Original Message-----
> > From: Andi Clemens [mailto:Andi.Clemens at gmx.net]
> > Sent: Thursday, September 09, 2010 9:57 AM
> > To: Jim Idle; antlr-interest at antlr.org
> > Subject: Re: [antlr-interest] Tell ANTLR to ignore parsing errors?
> > 
> > Ok, thanks for the answers.
> > 
> > One final question: Would it be better to have tokens like "CREATE USER"
> > and "CREATE TABLE" in the lexer or doesn't this work anyway because of
> > the whitespace?
> > 
> > Andi
> > 
> > -------- Original Message --------
> > > Date: Thu, 9 Sep 2010 08:26:59 -0700
> > > From: "Jim Idle" <jimi at temporal-wave.com>
> > > To: antlr-interest at antlr.org
> > > Subject: Re: [antlr-interest] Tell ANTLR to ignore parsing errors?
> > 
> > > When you put literals in the parser, you do not have enough control over
> > > the tokens in terms of what they are named at code generation time
> > > (hence error messages are difficult, and producing a tree parser is
> > > difficult), and you cannot see the potential ambiguities in your
> > > lexer. It just makes things more difficult for no (IMO) advantage.
> > >
> > > If you have told the input stream to be case insensitive, then I am
> > > afraid that the problem is going to be with your grammar. You will
> > > have to single-step through the code to find out why.
> > >
> > > Jim
> > >
> > > > -----Original Message-----
> > > > From: Andi Clemens [mailto:Andi.Clemens at gmx.net]
> > > > Sent: Thursday, September 09, 2010 7:32 AM
> > > > To: Jim Idle; antlr-interest at antlr.org
> > > > Subject: Re: [antlr-interest] Tell ANTLR to ignore parsing errors?
> > > >
> > > > Yes it is case insensitive. What is the difference if I add "CREATE"
> > > > or similar to the lexer?
> > > > Is it more reliable in detecting the right tokens?
> > > >
> > > > Andi
> > > >
> > > > -------- Original Message --------
> > > > > Date: Thu, 9 Sep 2010 07:21:45 -0700
> > > > > From: "Jim Idle" <jimi at temporal-wave.com>
> > > > > To: antlr-interest at antlr.org
> > > > > Subject: Re: [antlr-interest] Tell ANTLR to ignore parsing errors?
> > > >
> > > > > If you are getting errors, it is because your grammar is incorrect.
> > > > > Oracle SQL is a huge grammar to undertake and you cannot 'hack' it.
> > > > > Your token in the parser (which you should move to the lexer anyway,
> > > > > rather than using 'LITERAL' in your parser code) is 'CREATE' but your
> > > > > input is create. Did you tell the runtime to be case insensitive?
> > > > >
> > > > > Read the API or use antlr.markmail.org to see how to override
> > > > > displayRecognitionError(). You cannot just ignore errors though,
> > > > > because somehow you have to recover. You could just make them
> > > > > silent and, when the parser returns, ignore that source if the
> > > > > error count is > 0, or something like that.
> > > > >
> > > > > I will have a commercial version of Oracle SQL and PL/SQL
> > > > > available before too long too.
> > > > >
> > > > > Jim
> > > > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > > From: antlr-interest-bounces at antlr.org
> > > > > > > [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Andi Clemens
> > > > > > Sent: Thursday, September 09, 2010 5:45 AM
> > > > > > To: antlr-interest at antlr.org
> > > > > > > Subject: [antlr-interest] Tell ANTLR to ignore parsing errors?
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > > I use this ANTLR grammar file to parse Oracle PL/SQL statements:
> > > > > > http://pastebin.com/uy0wZTax
> > > > > >
> > > > > > > But some of the statements produce errors when I try to parse
> > > > > > > them, for example:
> > > > > >
> > > > > > "create user u1 identified by p1 account unlock;"
> > > > > >
> > > > > > I get the following error message:
> > > > > >
> > > > > > ==============================
> > > > > > statementString(1)  : error 3 : , at offset -1
> > > > > >     near [Index: 0 (Start: 141054912-Stop: 141054917) 
> > > > > > ='create',
> > > > > type<50> Line:
> > > > > > 1 LinePos:-1]
> > > > > >      : cannot match to any predicted input...
> > > > > > ==============================
> > > > > >
> > > > > > But why? The rule looks like this:
> > > > > > ============================== create_user_statement
> > > > > > 	:	'CREATE' 'USER' identifier 'INDENTIFIED' .*
> > > > > > 	;
> > > > > > ==============================
> > > > > >
> > > > > > Could the wildcard character be the problem?
> > > > > > > Actually I just want to parse known statements with my
> > > > > > > grammar; all unknown statements (parsing errors) could be ignored.
> > > > > >
> > > > > > > Can I tell ANTLR (for the C target) to ignore those error
> > > > > > > messages and just return FALSE or something like that, so that
> > > > > > > I can decide whether to take an appropriate action?
> > > > > >
> > > > > > > I get a lot of those error messages, and to be honest, the
> > > > > > > error messages are not helping me here. I cannot see problems
> > > > > > > with the grammar. Unfortunately I'm not able to debug the
> > > > > > > grammar with ANTLRworks.
> > > > > >
> > > > > > Can someone show me the error or tell me a way to disable 
> > > > > > those error messages in the ANTLR C target?
> > > > > >
> > > > > > Andi
> > > > > >
> > > > > > --
> > > > > > GMX DSL SOMMER-SPECIAL: Surf & Phone Flat 16.000 
> für nur 19,99
> > > > > > Euro/mtl.!* http://portal.gmx.net/de/go/dsl
> > > > > >
> > > > > > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > > > > > Unsubscribe:
> > > > > > > http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> > > > >
> > > > >
> > > > > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > > > > Unsubscribe:
> > > > > 
> http://www.antlr.org/mailman/options/antlr-interest/your-email-a
> > > > > dd
> > > > > ress
> > > >
> > > > --
> > > > GMX DSL SOMMER-SPECIAL: Surf & Phone Flat 16.000 für nur 19,99
> > > > Euro/mtl.!* http://portal.gmx.net/de/go/dsl
> > >
> > >
> > > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > > Unsubscribe:
> > > 
> http://www.antlr.org/mailman/options/antlr-interest/your-email-addre
> > > ss
> > 
> 
> 
> 

