[antlr-interest] Tell ANTLR to ignore parsing errors?
Daniels, Troy (US SSA)
troy.daniels at baesystems.com
Thu Sep 9 14:18:25 PDT 2010
The naïve way to do that creates problems:
CREATE_USER: 'CREATE USER';
Only one of the next three lines will match that, but you almost certainly want all of them to match.
CREATE USER
CREATE  USER
CREATE<tab>USER
You could write it as
CREATE_USER: 'CREATE' WS+ 'USER';
Even so, when the user types "CREATE UZER" instead, this will give a lexer error rather than a parser error. It's much more difficult to provide a meaningful error message from the lexer, since it does not have the context that the parser does.
If there are situations where CREATE is a valid user-specified identifier, you can't handle that case if you have CREATE_USER as a lexer token.
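To make the difference concrete, here is a rough sketch in Python rather than ANTLR. It is only a model of the idea, not ANTLR-generated code: the names tokenize, parse_create, ParseError and CREATE_TARGETS are made up for illustration, and upper-casing every token stands in for a case-insensitive input stream. Whitespace is dropped before parsing, CREATE and USER are separate tokens, and the misspelled keyword is reported by the parser, which knows what may follow CREATE.

```python
import re


class ParseError(Exception):
    """Raised by the parser, which knows what it expected to see next."""


# Hypothetical set of keywords that may follow CREATE (the left-factored rule).
CREATE_TARGETS = ("TABLE", "USER", "TRIGGER")


def tokenize(text):
    """Split text into tokens; runs of spaces and tabs are simply discarded.

    Upper-casing models a case-insensitive input stream (an assumption).
    """
    return [tok.upper() for tok in re.findall(r"\w+|\S", text)]


def parse_create(text):
    """Parse 'CREATE <target> ...' and return the target keyword."""
    tokens = tokenize(text)
    if not tokens or tokens[0] != "CREATE":
        raise ParseError("expected CREATE at start of statement")
    found = tokens[1] if len(tokens) > 1 else "end of input"
    if found not in CREATE_TARGETS:
        expected = ", ".join(CREATE_TARGETS)
        raise ParseError(
            "after CREATE, expected one of %s but found %s" % (expected, found)
        )
    return found


if __name__ == "__main__":
    # One space, two spaces and a tab are all accepted.
    for stmt in ("CREATE USER", "CREATE  USER", "CREATE\tUSER"):
        print(repr(stmt), "->", parse_create(stmt))
    try:
        parse_create("CREATE UZER")
    except ParseError as err:
        print("parser error:", err)
```

Because the check for USER happens after tokenizing, the error message can name both the context (after CREATE) and the alternatives, which a single combined lexer token cannot do.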
Troy
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jim Idle
> Sent: Thursday, September 09, 2010 1:06 PM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Tell ANTLR to ignore parsing errors?
>
> No - don't make whitespace significant unless the language
> absolutely makes you do so.
>
> What you have to do is left factor:
>
> create
>     : CREATE
>       (
>           cr_table
>         | cr_user
>         | cr_trigger
>       )
>     ;
>
> cr_table
>     : TABLE .....
>
> Jim
>
> > -----Original Message-----
> > From: Andi Clemens [mailto:Andi.Clemens at gmx.net]
> > Sent: Thursday, September 09, 2010 9:57 AM
> > To: Jim Idle; antlr-interest at antlr.org
> > Subject: Re: [antlr-interest] Tell ANTLR to ignore parsing errors?
> >
> > Ok, thanks for the answers.
> >
> > One final question: Would it be better to have tokens like "CREATE USER"
> > and "CREATE TABLE" in the lexer or doesn't this work anyway because of
> > the whitespace?
> >
> > Andi
> >
> > -------- Original Message --------
> > > Date: Thu, 9 Sep 2010 08:26:59 -0700
> > > From: "Jim Idle" <jimi at temporal-wave.com>
> > > To: antlr-interest at antlr.org
> > > Subject: Re: [antlr-interest] Tell ANTLR to ignore parsing errors?
> >
> > > When putting things in the parser, you do not have enough control over
> > > the tokens, both in terms of what they are named at code generation
> > > time (hence error messages are difficult, and producing a tree
> > > parser is difficult), and you cannot see the potential ambiguities
> > > in your lexer. It just makes things more difficult for no (IMO) advantage.
> > >
> > > If you have told the input stream to be case insensitive, then I am
> > > afraid that the problem is going to be with your grammar. You will
> > > have to single-step through the code to find out why.
> > >
> > > Jim
> > >
> > > > -----Original Message-----
> > > > From: Andi Clemens [mailto:Andi.Clemens at gmx.net]
> > > > Sent: Thursday, September 09, 2010 7:32 AM
> > > > To: Jim Idle; antlr-interest at antlr.org
> > > > Subject: Re: [antlr-interest] Tell ANTLR to ignore parsing errors?
> > > >
> > > > Yes, it is case insensitive. What is the difference if I add "CREATE"
> > > > or similar to the lexer?
> > > > Is it more reliable in detecting the right tokens?
> > > >
> > > > Andi
> > > >
> > > > -------- Original Message --------
> > > > > Date: Thu, 9 Sep 2010 07:21:45 -0700
> > > > > From: "Jim Idle" <jimi at temporal-wave.com>
> > > > > To: antlr-interest at antlr.org
> > > > > Subject: Re: [antlr-interest] Tell ANTLR to ignore parsing errors?
> > > >
> > > > > If you are getting errors it is because your grammar is incorrect.
> > > > > Oracle SQL is a huge grammar to undertake and you cannot 'hack' it.
> > > > > Your token in the parser (which you should move to the lexer anyway
> > > > > and not use 'LITERAL' in your parser code) is CREATE but your input
> > > > > is create. Did you tell the runtime to be case insensitive?
> > > > >
> > > > > Read the API or use antlr.markmail.org to see how to override
> > > > > displayRecognitionError(). You cannot just ignore errors though,
> > > > > because somehow you have to recover. You could just make them
> > > > > silent and, when the parser returns, if the error count is > 0 then
> > > > > ignore that source or something.
> > > > >
> > > > > I will have a commercial version of Oracle SQL and PLSQL
> > > > > available before too long too.
> > > > >
> > > > > Jim
> > > > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: antlr-interest-bounces at antlr.org
> > > > > > [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Andi Clemens
> > > > > > Sent: Thursday, September 09, 2010 5:45 AM
> > > > > > To: antlr-interest at antlr.org
> > > > > > Subject: [antlr-interest] Tell ANTLR to ignore parsing errors?
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I use this ANTLR grammar file to parse Oracle PL/SQL statements:
> > > > > > http://pastebin.com/uy0wZTax
> > > > > >
> > > > > > But some of the statements produce errors when I try to parse
> > > > > > them, for example:
> > > > > >
> > > > > > "create user u1 identified by p1 account unlock;"
> > > > > >
> > > > > > I get the following error message:
> > > > > >
> > > > > > ==============================
> > > > > > statementString(1) : error 3 : , at offset -1
> > > > > > near [Index: 0 (Start: 141054912-Stop: 141054917) ='create', type<50> Line: 1 LinePos:-1]
> > > > > > : cannot match to any predicted input...
> > > > > > ==============================
> > > > > >
> > > > > > But why? The rule looks like this:
> > > > > > ==============================
> > > > > > create_user_statement
> > > > > >     : 'CREATE' 'USER' identifier 'INDENTIFIED' .*
> > > > > >     ;
> > > > > > ==============================
> > > > > >
> > > > > > Could the wildcard character be the problem?
> > > > > > Actually I just want to parse known statements with my
> > > > > > grammar; all unknown statements (parsing errors) could be ignored.
> > > > > >
> > > > > > Can I tell ANTLR (for the C target) to ignore those error
> > > > > > messages and just return FALSE or something like that, so that I
> > > > > > can decide whether to take an appropriate action?
> > > > > >
> > > > > > I get a lot of those error messages, and to be honest, the
> > > > > > error messages are not helping me here. I cannot see problems with
> > > > > > the grammar. Unfortunately I'm not able to debug the grammar with
> > > > > > ANTLRWorks.
> > > > > >
> > > > > > Can someone show me the error or tell me a way to disable
> > > > > > those error messages in the ANTLR C target?
> > > > > >
> > > > > > Andi
> > > > > >
> > > > > >
> > > > > > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > > > > > Unsubscribe:
> > > > > > http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> > > > >
> > > > >
> > > >
> > >
> > >
> >
>
>
>