[antlr-interest] set max number of characters in a string literal

Olya Krachina okrachin at purdue.edu
Sat Sep 13 15:24:48 PDT 2008


I thought I could build on this idea, but I have a few issues. Currently I have
this:

LONG_STRINGLITERAL
  : 
    ('"') (~('"'|'\n'|'\r'))* ('"')
    {
        String wholeStr;
        int strLen, maxStrLen = 80;
        wholeStr = getText().toString();
        strLen = wholeStr.length();
        if ( strLen > maxStrLen )
        {
            $setType(STRINGLITERAL);
        }
    }  
    ;

It looks like this might work, only now I have to declare STRINGLITERAL somehow.
I would have to write ANOTHER RULE for STRINGLITERAL, and I don't know what I
could write whose prefix does not also match the LONG_STRINGLITERAL rule.

My second idea is to maybe "unsetType"... is that possible? (I am new to ANTLR.)
 
My third idea was to make LONG_STRINGLITERAL protected and do setType
conditionally, but then again I would have to declare STRINGLITERAL somehow.
In many examples online this seems to be possible without extra declarations or
rules, but I can never compile that code with ANTLR 2.7.

Any ideas?
thanks again.
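
One way to picture the conditional setType idea outside the grammar is a small
Java helper that picks a token type from the matched text. This is only a
sketch: the class name LiteralLengthClassifier and the numeric token type
values are made up for illustration (a generated ANTLR 2 lexer would take them
from its token types interface), and it assumes the limit applies to the
characters between the quotes. As far as I recall, ANTLR 2 also lets you
declare a token type name without a rule in a tokens { ... } section, which
would cover the "declare STRINGLITERAL somehow" part, but please check that
against the 2.7 documentation.

```java
// Hypothetical helper mirroring the decision a lexer action would make.
final class LiteralLengthClassifier {
    // Stand-in token type values, chosen arbitrarily for this sketch.
    static final int STRINGLITERAL = 4;
    static final int LONG_STRINGLITERAL = 5;

    /** Picks a token type from the matched text, surrounding quotes included. */
    static int classify(String matchedText, int maxChars) {
        // Assumption: the two surrounding double quotes do not count.
        int contentLength = matchedText.length() - 2;
        return contentLength > maxChars ? LONG_STRINGLITERAL : STRINGLITERAL;
    }
}
```

In a lexer action the same comparison would decide whether to call setType at
all, so only one lexer rule has to match the quoted text.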


Quoting "Edwards, Waverly" <Waverly.Edwards at genesys.com>:

> Untested but should work, assuming you want something that works after
> the fact.
> 
> 
> 
> W.
> 
> SOME_STRING   : YourStringRule
>     {
>         String wholeStr;
>         int strLen, maxStrLen = 20;
> 
>         wholeStr = getText().toString();
>         strLen = wholeStr.length();
>          if ( strLen > maxStrLen ) {
>             System.out.println( strLen + " > " + maxStrLen + ". Truncating...");
>             // keep only the first maxStrLen characters
>             setText(wholeStr.substring(0, maxStrLen)); // text is now truncated
>          }
>     }; 
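
For the truncation step it may help to recall how the two forms of
String.substring behave: the one-argument form keeps everything from the given
index to the end, while the two-argument form keeps the range from the first
index up to (but not including) the second. A minimal sketch, with TruncateDemo
and truncate being names invented here:

```java
// Hypothetical helper; not part of ANTLR.
final class TruncateDemo {
    /** Returns at most maxLen leading characters of text. */
    static String truncate(String text, int maxLen) {
        if (text.length() <= maxLen) {
            return text; // already short enough, nothing to cut
        }
        // substring(0, maxLen) keeps indices 0 .. maxLen-1 (the head).
        // substring(maxLen) alone would return the tail instead.
        return text.substring(0, maxLen);
    }
}
```

Note that if the text passed in still includes the surrounding quotes, cutting
the head this way drops the closing quote, so the limit may need adjusting.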
> 
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Gavin Lambert
> Sent: Friday, September 12, 2008 8:01 AM
> To: Olya Krachina; antlr-interest at antlr.org
> Subject: Re: [antlr-interest] set max number of characters in a string
> literal
> 
> At 15:48 12/09/2008, Olya Krachina wrote:
>  >I am working on a lexer and I was wondering how I could set a max
>  >limit on the number of characters that make up a string literal,
>  >i.e. it is valid when there are n (let's say n = 20) or fewer chars.
>  >I tried setting lookahead to 20 (options k = 20) but it did not
>  >have any effect. I am using ANTLR 2.7.
> 
> If you need to parse exactly 20 chars and stop dead (eg. for 
> fixed-width data formats), you'll need to spell it out explicitly 
> -- eg. repeat CHAR 20 times.  (ANTLR only supports cardinalities 
> for zero, one, or many.)
> 
> Of course you can be a bit more clever about it, eg. making a rule 
> that contains five CHARs and then use that rule four times, etc.
> 
> 
> If your input language isn't actually ambiguous, though, and you 
> just want to do this for validation purposes, then your best bet 
> is to just successfully match however many characters happen to 
> appear (even if more than 20), and use a semantic action in the 
> parser or tree parser to validate the length after the fact.
> 
> Generally speaking the lexer should be built to be as tolerant as 
> it possibly can -- wait until parsing time to detect and report 
> errors.
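
The "validate after the fact" approach could look roughly like the Java sketch
below. It is an assumption-laden illustration, not ANTLR API: the class
StringLengthValidator, its methods, and the error message format are all
invented here, and it assumes the literal text still carries its two
surrounding quotes. A parser or tree parser action would call check() with the
token text and line number.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical validator called from a parser action; not part of ANTLR.
final class StringLengthValidator {
    private final int maxChars;
    private final List<String> errors = new ArrayList<String>();

    StringLengthValidator(int maxChars) {
        this.maxChars = maxChars;
    }

    /** Checks one literal (quotes included); records an error if too long. */
    boolean check(String literalText, int line) {
        // Assumption: the two surrounding double quotes do not count.
        int contentLength = literalText.length() - 2;
        if (contentLength > maxChars) {
            errors.add("line " + line + ": string literal has " + contentLength
                    + " characters, maximum is " + maxChars);
            return false;
        }
        return true;
    }

    /** Errors collected so far, in the order they were found. */
    List<String> getErrors() {
        return errors;
    }
}
```

Collecting messages instead of throwing lets the lexer stay tolerant and lets
all over-long literals be reported in one pass.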
> 
> 
> 
> 



