[antlr-interest] Capturing a grammar block as a string
Jared Bunting
jared.bunting at peachjean.com
Mon Nov 24 11:33:58 PST 2008
I sent this to Ilya earlier by accident; I meant to send it to the
whole list. On that note, is it possible to set the Reply-To header in
list emails? Or is there some reason not to do this?
Thanks,
Jared
Jared Bunting wrote:
> The ANTLR grammar itself does something similar. Using that as a
> starting point, I came up with the following:
>
> ACTION
>     : '{' NESTED_ACTION '}'
>     ;
>
> fragment
> NESTED_ACTION
>     : ( STRING
>       | SSTRING
>       | ~('{'|'}'|'\"'|'\'')
>       | ACTION
>       )*
>     ;
>
> STRING and SSTRING are simply definitions of a double-quoted and a
> single-quoted string, respectively. They are included so that a '}'
> inside a string is not interpreted as the closing brace. NESTED_ACTION
> refers back to ACTION so that '{}' pairs can nest. You may also want
> to handle comments, which are another place where braces can appear.
> Basically, consider every situation in which a closing brace would be
> legal in the JSON code, and make sure it doesn't get treated as the
> closing brace for the whole code block.
>
> The important point seems to be to treat your code block as a lexer
> rule rather than a parser rule. That way the lexer doesn't try to
> tokenize your JSON code, and the parser doesn't attempt to parse it.
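
To make the matching logic concrete, here is a rough hand-written
equivalent of what the ACTION / NESTED_ACTION rules do, sketched in
Python rather than as an ANTLR lexer. The function name capture_block is
made up for illustration, and the backslash-escape handling inside
strings is an assumption about how STRING and SSTRING are defined (they
are not shown above). It counts nested braces and skips over quoted
strings, so a '}' inside a string does not end the block:

```python
def capture_block(text, start=0):
    """Return (block, end) for the brace-delimited block at text[start].

    Hypothetical helper: nested braces are counted, and braces inside
    single- or double-quoted strings are ignored.
    """
    if start >= len(text) or text[start] != "{":
        raise ValueError("expected '{' at position %d" % start)
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch in ('"', "'"):
            # Skip a quoted string; assumes backslash escapes (an assumption).
            quote = ch
            i += 1
            while i < len(text) and text[i] != quote:
                i += 2 if text[i] == "\\" else 1
            if i >= len(text):
                raise ValueError("unterminated string")
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close for the outermost brace: block is complete.
                return text[start:i + 1], i + 1
        i += 1
    raise ValueError("unbalanced braces")


sample = 'define project as {\n "name": "some_widget",\n "version": "0.01-alpha"\n}'
block, end = capture_block(sample, sample.index("{"))
```

Here block holds everything from the opening brace through its matching
closing brace as a single string, which is what the lexer rule hands to
the parser as one token. Comments are not handled in this sketch.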
>
> -Jared
>
> On Sun, Nov 23, 2008 at 6:22 PM, Ilya Sterin <sterini at gmail.com>
> wrote:
>
> So in my grammar, I capture blocks of JSON-like structures. I don't
> want antlr to try to parse that structure, but rather to evaluate it
> as a string.
>
> Here is some sample code...
>
> define project as {
> "name": "some_widget",
> "version": "0.01-alpha"
> }
>
>
> Here is a simple grammar rule to demonstrate the issue I'm having...
>
>
> definesomething
>     : 'define' IDENT 'as' json
>     ;
>
> json
>     : '{' .* '}'
>     ;
>
> IDENT
>     : ('0'..'9'|'a'..'z'|'A'..'Z'|'_')+
>     ;
>
> WHITESPACE
>     : ( '\t' | ' ' | '\r' | '\n' | '\u000C' )+ { $channel=HIDDEN; }
>     ;
>
>
> This in fact tries to evaluate the content between the braces. I
> would actually like the rule to capture all of the content as one
> string; I will parse it later within my application. Is there a way
> to accomplish this?
>
> Thanks.
>
> Ilya
>