[antlr-interest] Capturing a grammar block as a string

Jared Bunting jared.bunting at peachjean.com
Mon Nov 24 11:33:58 PST 2008


I sent this to Ilya earlier by accident; I meant to send it to the whole 
list.  On that note, is it possible to set the Reply-To header in list 
emails?  Or is there some reason not to do this?

Thanks,
Jared

Jared Bunting wrote:
> The ANTLR grammar itself does something similar for its action blocks.  
> Using that as a starting point, I was able to come up with the 
> following:
>
> ACTION
>     :   '{' NESTED_ACTION '}'
>     ;
>
> fragment
> NESTED_ACTION
>     :   (   STRING
>         |   SSTRING
>         |   ~('{'|'}'|'\"'|'\'')
>         |   ACTION
>         )*
>     ;
>
> STRING and SSTRING are simply definitions of a double-quoted and a 
> single-quoted string, respectively.  They are included so that a '}' 
> inside a string is not interpreted as the closing brace.  ACTION is 
> nested inside NESTED_ACTION so that '{}' pairs can nest.  Comments are 
> another place where braces might appear, so you may want to handle 
> those too.  Basically, consider every situation in which a closing 
> brace would be legal in the JSON code, and make sure it doesn't get 
> treated as the closing brace for the whole code block.
>
> The important point here seems to be to treat your code block as a 
> lexer rule rather than a parser rule.  That way the lexer doesn't try 
> to tokenize your JSON code, and the parser doesn't attempt to parse 
> it.
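>
> To illustrate, the parser rule could then refer to the ACTION token 
> directly, something like the sketch below.  It assumes that 'define' 
> and 'as' are written as quoted keyword literals and that you are using 
> the Java target; the label name "block" is just for illustration.
>
> ```antlr
> definesomething
>     :   'define' IDENT 'as' block=ACTION
>         // $block.text is the whole block as one string,
>         // including the outer '{' and '}'.
>         { System.out.println($block.text); }
>     ;
> ```
>
> The parser only ever sees a single ACTION token here, so whitespace 
> and punctuation inside the braces are kept as part of its text.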
>
> -Jared
>
> On Sun, Nov 23, 2008 at 6:22 PM, Ilya Sterin <sterini at gmail.com> wrote:
>
>     So in my grammar, I capture blocks of JSON-like structures.  I don't
>     want antlr to try to parse that structure, but rather to evaluate it
>     as a string.
>
>     Here is some sample code...
>
>     define project as {
>        "name": "some_widget",
>        "version": "0.01-alpha"
>     }
>
>
>     Here is a simple grammar rule to demonstrate the issue I'm having...
>
>
>     definesomething
>      :  define IDENT as json
>      ;
>
>     json
>      :  '{' .* '}'
>      ;
>
>     IDENT
>      :  ('0'..'9'|'a'..'z'|'A'..'Z'|'_')+;
>
>     WHITESPACE
>      :  ( '\t' | ' ' | '\r' | '\n' | '\u000C' )+  { $channel=HIDDEN; } ;
>
>
>     This in fact tries to evaluate the content between the braces.  I
>     would actually like the rule to treat all of that content as one
>     string, which I will parse later within my application.  Is there
>     a way I can accomplish this?
>
>     Thanks.
>
>     Ilya
>
>     List: http://www.antlr.org/mailman/listinfo/antlr-interest
>     Unsubscribe:
>     http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>
>

