[antlr-interest] Matching an arbitrary string until the next whitespace occurrence

Patrick Nick peedee.nick at gmail.com
Thu Oct 20 04:57:32 PDT 2011


It makes sense, but I don't want to use getargs. To be more precise, I am
already using that, and only one of my arguments needs further parsing and
that's what I need my grammar for. This one argument is a tree of
parenthesized subexpressions, for example "(A AND B) OR ((C OR D) AND E)".
Each of those letters stands for one of a predefined set of keywords,
followed by an arbitrary string.

I could delimit that arbitrary string with a designated character such as
a double quote, but the problem of matching everything between the double
quotes (no escape characters for now) remains. How should I go about that?
Is it possible with normal lexer and parser rules?

Thanks
Patrick
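
For reference, matching everything between two double quotes generally
works with ordinary ANTLR lexer rules, because the opening quote gives the
lexer an unambiguous starting point. What follows is a minimal sketch in
ANTLR 3 syntax, not a tested solution. The grammar name Filter, the rule
names, and the keywords 'name' and 'size' are hypothetical placeholders.
It also assumes the Java target and that AND binds tighter than OR, which
the example expression above does not settle.

```antlr
grammar Filter;   // hypothetical grammar name

// Parser rules: a tree of parenthesized subexpressions.
filter  : expr EOF ;
expr    : andExpr ( OR andExpr )* ;      // assumption: OR has lowest precedence
andExpr : atom ( AND atom )* ;
atom    : '(' expr ')'
        | KEYWORD STRING                 // a keyword followed by an arbitrary string
        ;

// Lexer rules.
AND     : 'AND' ;
OR      : 'OR' ;
KEYWORD : 'name' | 'size' ;              // placeholders for the predefined keywords
STRING  : '"' ( ~'"' )* '"' ;            // everything up to the next double quote, no escapes
WS      : ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN; } ;
```

With rules like these, input such as (name "foo bar" AND size "10") OR
name "x" should tokenize into KEYWORD and STRING pairs. The quotes remain
part of the STRING token text and would have to be stripped in an action
or in the calling code.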


On Wed, Oct 19, 2011 at 6:16 PM, Jim Idle <jimi at temporal-wave.com> wrote:

> I suggest that you use getargs rather than trying to parse things like
> this with ANTLR, as the specification is too vague. The lexer is not
> context driven, so placing a rule like that in the lexer will match
> everything that is not whitespace, to the detriment of every other rule.
>
>
> Jim
>
>
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> > bounces at antlr.org] On Behalf Of Patrick Nick
> > Sent: Wednesday, October 19, 2011 8:52 AM
> > To: antlr-interest at antlr.org
> > Subject: [antlr-interest] Matching an arbitrary string until the next
> > whitespace occurrence
> >
> > Hi all,
> >
> > I just started using antlr and was able to construct a nice grammar
> > that fulfills my application's needs. There is one thing which I
> > haven't been able to get to work though.
> > My grammar is parsing program arguments which the user supplied when
> > starting the program, and some of that input will need to be forwarded
> > to another program. This implies that I have almost no control over
> > those strings and need to be able to parse them only knowing that they
> > are delimited by whitespace. So what I need (I think) is a lexer rule
> > to match an arbitrary string which doesn't contain whitespace.
> >
> > Here is what I tried, with the intention that it should match anything
> > that doesn't contain one of the four characters.
> >
> > STRING :    (~(' '|'\t'|'\r'|'\n'))+ ;
> >
> > However, that does not seem to work: it doesn't recognize numbers, for
> > example, and I don't understand why.
> > Any hints?
> >
> > Regards
> > Patrick
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-
> > email-address
>

