[antlr-interest] Parser help with grabbing unparsed code blocks

Wed Mar 15 12:01:44 PST 2006

Right off the top, can you change from { CAT, DOG } to ( CAT, DOG ) for the
parameters?  As it stands, the lexer has to be stateful, because "{" is an
ordinary token in one place but marks the start of ANYTHING in another.

If you can't change the syntax, it would seem that you must use lexer state; at
the very least, track the last token, and match ANYTHING only when the last
token was "}", since a "{" in that position opens the code block.
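
To make the "track the last token" idea concrete, here is a rough hand-written
sketch in Python rather than an ANTLR grammar. The names TOKEN, read_block and
tokenize are made up for illustration, and it assumes braces inside the code
block are balanced and never appear in string literals or comments:

```python
import re

# Hypothetical token set for the header: identifiers, braces, commas.
TOKEN = re.compile(r"[A-Za-z_]\w*|[{},]")
PUNCT = {"{", "}", ","}

def read_block(src, start):
    """src[start] is '{'. Return (body, index just past the matching '}')."""
    depth = 0
    for i in range(start, len(src)):
        ch = src[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return src[start + 1:i], i + 1
    raise ValueError("unterminated code block")

def tokenize(src):
    """Lex normally, but treat a '{' that follows a '}' as an opaque block."""
    tokens, pos, last = [], 0, None
    while pos < len(src):
        if src[pos].isspace():
            pos += 1
            continue
        if src[pos] == "{" and last == "}":
            # Lexer state: the previous token closed the parameter list,
            # so this '{' opens the unparsed code block.
            body, pos = read_block(src, pos)
            kind, text = "ANYTHING", body
        else:
            m = TOKEN.match(src, pos)
            if m is None:
                raise ValueError(f"unexpected character at {pos}: {src[pos]!r}")
            text = m.group(0)
            kind = text if text in PUNCT else "ID"
            pos = m.end()
        tokens.append((kind, text))
        last = kind
    return tokens

src = ("COMMAND {CAT, DOG}\n{\n    if (id.call() == true)\n    {\n"
       "        id.otherCall();\n    }\n}\n")
tokens = tokenize(src)
```

The last entry in tokens is the ANYTHING token, whose text is the raw contents
of the outer braces, ready to hand to whatever evaluates the code.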

Perhaps another way is to have ANYTHING match from the "}" (after DOG) to the
final "}" of the code block; it's unambiguous in the grammar as you've sketched
it....
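
A rough Python sketch of that second idea follows. The function name
match_anything is made up, and two things here are my assumptions rather than
anything in your grammar: only whitespace sits between the "}" after DOG and
the block's "{", and nested braces are handled by counting depth (stopping at
the first "}" would cut your example short):

```python
def match_anything(src, pos):
    """Match from the '}' at src[pos] through the code block's closing '}'.

    Returns (block_body, end_index), where end_index is just past the
    closing '}' of the block.
    """
    if src[pos] != "}":
        raise ValueError("expected '}' at pos")
    i = pos + 1
    # Assumption: only whitespace separates the two brace groups.
    while i < len(src) and src[i].isspace():
        i += 1
    if i >= len(src) or src[i] != "{":
        raise ValueError("expected '{' to open the code block")
    start, depth = i, 0
    while i < len(src):
        if src[i] == "{":
            depth += 1
        elif src[i] == "}":
            depth -= 1
            if depth == 0:
                return src[start + 1:i], i + 1
        i += 1
    raise ValueError("unterminated code block")

src = "COMMAND {CAT, DOG}\n{\n    if (a)\n    {\n        b();\n    }\n}\n"
# The first '}' in the input is the one that closes the parameter list.
body, end = match_anything(src, src.index("}"))
```

No last-token bookkeeping is needed this way, since at the token level a "}"
only ever shows up at the end of the parameter list.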

On 3/15/06, Llew Mason <llewmason at yahoo.com> wrote:
> Hi all,
>
> I'm trying to write a parser/lexer to deal with a language that
> contains code blocks that will not be interpreted by the parser, but
> I want the parser to extract them as chunks of text.
>
> For example, here's a dummy piece of code to be parsed:
>
> COMMAND {CAT, DOG}
> {
>     if (id.call() == true)
>     {
>         id.otherCall();
>     }
> }
>
> I want the parser to understand the tokens COMMAND { CAT , DOG } and
> parse those, and then expect a code block in curly braces.  However,
> it shouldn't attempt to parse the contents of the code block.  The
> action for the command rule needs to pull the entire contents of the
> curly braces (because I want to pass them onto beanshell as code).