[antlr-interest] Problem parsing double quotes

Haralambi Haralambiev hharalambiev at gmail.com
Fri Apr 18 00:28:32 PDT 2008


Hi Mike,

If I have understood correctly, the input text that you are giving to
the lexer contains the '"', ';', '{' and '}' characters, but the lexer
does not know how to recognize them (as you do not have a lexer rule
for them).

If you do not want the parser to take those tokens into account, you
could add the following lexer rule. The parentheses matter: without
them the action would attach only to the last alternative.
NotUsed : ( '"' | ';' | '{' | '}' ) {$channel=HIDDEN;};
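
To illustrate what sending a token to the hidden channel means in
practice, here is a small self-contained Java sketch. It is a toy model
and not ANTLR's own code: the names HiddenChannelSketch, Tok, lex and
visibleToParser are made up for this example, and the only detail
borrowed from ANTLR 3 is the idea of a default channel (0) and a hidden
channel (99). The lexer still produces a token for each '"', ';', '{'
and '}', but the parser's view of the stream skips them.

```java
import java.util.ArrayList;
import java.util.List;

public class HiddenChannelSketch {
    static final int DEFAULT_CHANNEL = 0;
    static final int HIDDEN_CHANNEL = 99; // same value as ANTLR 3's Token.HIDDEN_CHANNEL

    // Hypothetical minimal token: just its text and the channel it was put on.
    static final class Tok {
        final String text;
        final int channel;

        Tok(String text, int channel) {
            this.text = text;
            this.channel = channel;
        }
    }

    // Toy lexer: identifiers go on the default channel, the four punctuation
    // characters become one-character tokens on the hidden channel, and
    // whitespace is dropped. Anything else has no rule and is an error.
    static List<Tok> lex(String input) {
        List<Tok> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '"' || c == ';' || c == '{' || c == '}') {
                out.add(new Tok(String.valueOf(c), HIDDEN_CHANNEL));
                i++;
            } else if (Character.isLetterOrDigit(c) || c == '_') {
                int start = i;
                while (i < input.length()
                        && (Character.isLetterOrDigit(input.charAt(i)) || input.charAt(i) == '_')) {
                    i++;
                }
                out.add(new Tok(input.substring(start, i), DEFAULT_CHANNEL));
            } else {
                throw new IllegalArgumentException(
                        "no rule for character '" + c + "' at offset " + i);
            }
        }
        return out;
    }

    // What a parser reading only the default channel would see.
    static List<String> visibleToParser(List<Tok> tokens) {
        List<String> out = new ArrayList<>();
        for (Tok t : tokens) {
            if (t.channel == DEFAULT_CHANNEL) {
                out.add(t.text);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> all = lex("AS { x \"y\"; } ENDTRIGGER");
        System.out.println(all.size() + " tokens lexed");
        System.out.println("parser sees: " + visibleToParser(all));
    }
}
```

The hidden tokens stay in the full token list, so the punctuation is
not lost; it is simply not offered to the parser rules.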

Another way is to add the tokens to the specific parser rules. For example
(adding the token ';'):

createTrigger
    :   ('CREATE' | 'REPLACE' ) 'TRIGGER' 'ON'
            'TABLE' objectName ('BEFORE' | 'AFTER') tableOperation
            'AS' triggerBody 'ENDTRIGGER' ';'
    ;

Best Regards,
Hari

On 4/18/08, Mike Arace <mikearace at hotmail.com> wrote:
>
> Hi Gavin,
>
> Thanks for the response.
>
> I do not specify any lexer rules in my grammar.  When I reactivated the
> console (I had turned it off a while back... thanks for the reminder) I see
> the following output:
>
> line 2:20 mismatched character '"' expecting '<EOF>'
> line 2:45 mismatched character '"' expecting '<EOF>'
> line 2:47 mismatched character ';' expecting '<EOF>'
> line 3:18 mismatched character '{' expecting '<EOF>'
> line 4:17 mismatched character '"' expecting '<EOF>'
> line 4:81 mismatched character '"' expecting '<EOF>'
> line 4:83 mismatched character ';' expecting '<EOF>'
> line 5:0 mismatched character '}' expecting '<EOF>'
>
> Given that those are the exact characters that seem to be missing, that is
> a positive development, although I still don't know why.
>
> When I step through the program, the dropped tokens seem to occur after
> the Lexer has been initialized correctly, when the CommonTokenStream
> attempts to create its "tokens" ArrayList.
>
> Here is the guts of the grammar to get the output:
>
> -- Grammar below --
>
> createTrigger
>     :   ('CREATE' | 'REPLACE' ) 'TRIGGER' 'ON'
>             'TABLE' objectName ('BEFORE' | 'AFTER') tableOperation
>             'AS' triggerBody 'ENDTRIGGER';
>
> objectName
>     :    ObjectName;
>
> tableOperation
>     :    'INSERT' | 'UPDATE' | 'DELETE';
>
> triggerBody
>     :    (~('ENDTRIGGER'))*;
>
> ObjectName
>     :    ('a'..'z' | 'A'..'Z') ('a'..'z' | 'A'..'Z' | '0'..'9' | '_' )*;
>
> WS
>     :    ( ' ' | '\t' | '\n' | '\r' )+;
>
> -- Grammar above --
>
> Thanks for the assistance,
> Mike
>
> > Date: Thu, 17 Apr 2008 21:10:26 +1200
> > To: mikearace at hotmail.com; antlr-interest at antlr.org
> > From: antlr at mirality.co.nz
> > Subject: Re: [antlr-interest] Problem parsing double quotes
> >
> > At 18:39 17/04/2008, Mike Arace wrote:
> > >I am working on a grammar that seems to have run into a snag. It
> > >seems as if the Antlr lexer or parser is eating my double quotes
> > >and adjacent characters.
> > [...]
> > >Does anyone have any idea as to what can be happening?
> >
> > Do you have any lexer rules that refer to quote marks?
> >
> > Do you see any error output in the console when you run it?
> >
> > Can you post a complete minimal-reproduction grammar?
> >
>