[antlr-interest] Q: move from v2 to v3 parser grammar. Rewrite tree rule
Justin Murray
jmurray at aerotech.com
Wed Mar 23 10:26:15 PDT 2011
Jim,
I have a question regarding your comment on case insensitivity. I have
been using the "slowest" case-insensitive lexer technique, because the
page you linked to is the first place I have seen a viable alternative.
The grammar I am working with is a bit strange in that all of the
keywords in the language are case insensitive, but some rules, such as
variable names, are case sensitive. My question is, how far reaching is
the setUcaseLA() function (I am using the C target)? My variable name
rule accepts both uppercase and lowercase letters, and when I do
$tok.text->chars, I need to get the string in the original case that was
entered. So long as that is unaffected, I will be happy to get rid of
all of my "fragment A : ('A'|'a');" rules.
Thanks,
- Justin
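
[Note: the question above comes down to whether an upper-casing lookahead changes the stored input. The following is a minimal Java sketch of the input stream override idea, not the ANTLR runtime itself: the class CaseInsensitiveLookahead and its matchKeyword() helper are hypothetical names made up for illustration. It assumes, as the wiki technique describes, that only the lookahead value is upper-cased while the underlying buffer stays as typed; whether setUcaseLA() in the C target behaves the same way should be checked against the runtime source.]

```java
// Hypothetical stand-in for a character stream; this is NOT the ANTLR API.
public class CaseInsensitiveLookahead {
    public static final int EOF = -1;

    private final char[] data; // original input, never modified
    private int p = 0;         // index of the next character to consume

    public CaseInsensitiveLookahead(String input) {
        this.data = input.toCharArray();
    }

    /** Lookahead used for matching: the upper-cased character i positions ahead (i >= 1). */
    public int LA(int i) {
        int idx = p + i - 1;
        if (i < 1 || idx >= data.length) {
            return EOF;
        }
        return Character.toUpperCase(data[idx]);
    }

    public void consume() {
        if (p < data.length) {
            p++;
        }
    }

    /** Text is sliced from the untouched buffer, so it keeps the case as entered. */
    public String substring(int start, int stop) {
        return new String(data, start, stop - start + 1);
    }

    /** Matches an UPPER-case keyword at the current position; returns the original text, or null. */
    public String matchKeyword(String upperKeyword) {
        for (int i = 0; i < upperKeyword.length(); i++) {
            if (LA(i + 1) != upperKeyword.charAt(i)) {
                return null; // no match, nothing consumed
            }
        }
        int start = p;
        for (int i = 0; i < upperKeyword.length(); i++) {
            consume();
        }
        return substring(start, p - 1);
    }
}
```

[In this sketch the keyword is matched against 'JOIN' regardless of how it was typed, while the text handed back is whatever the user entered. One consequence, under the same assumption, is that lexer rules would have to be written against upper-case characters only, since the matcher never sees a lower-case letter.]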
On 3/22/2011 5:27 PM, Jim Idle wrote:
>> -----Original Message-----
>> From: Ruslan Zasukhin [mailto:ruslan_zasukhin at valentina-db.com]
>> Sent: Tuesday, March 22, 2011 2:21 PM
>
>>> However, using lower case literals in your parser directly is not a
>>> good idea. Use real tokens so that your error messages are better
>> Simple example, please?
> Instead of:
>
> rule : 'join' somerule;
>
> Use:
>
> rule : JOIN somerule;
>
> // Lexer rule to match:
> //
> JOIN : 'join';
>
> And for case insensitivity I specify the token specs all in UPPER case
> rather than lower case and then override the input stream as per:
>
> http://www.antlr.org/wiki/pages/viewpage.action?pageId=1782
>
> Although someone has added instructions for generating the slowest case
> insensitive lexers in the world with individual letter rules. Use the
> input stream override method in general.
>
>
>
>>
>>> and remember
>>> that SQL is generally case insensitive so you will need a [trivial]
>>> custom input stream.
>> Of course we do remember this :)
>>
>> And after the grammar starts to breathe, we will still work on
>> * case insensitivity of SQL text
>> * UTF-16 for input -- clarify ..
>
> UTF-16 input encoding is just a matter of telling the Java input stream to
> open the file in that encoding.
>
> Jim
>
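
[Note: a minimal sketch of the "open the file in that encoding" point for UTF-16, using only the standard Java library. The class name Utf16Input and the method readUtf16() are hypothetical; how the decoded characters are then handed to the lexer depends on the runtime in use and is not shown here.]

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf16Input {
    /**
     * Reads a whole file as UTF-16. With StandardCharsets.UTF_16 the decoder
     * honours a byte-order mark if present and assumes big-endian otherwise.
     */
    public static String readUtf16(Path path) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(path), StandardCharsets.UTF_16))) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }
}
```

[The resulting String (or its char array) can then be wrapped in whatever in-memory character stream the lexer expects.]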