[antlr-interest] multi-line chat messages
ian eyberg
ian at telematter.com
Wed Apr 15 11:29:41 PDT 2009
On Wed, Apr 15, 2009 at 10:26:58AM -0700, William H. Schultz wrote:
some thoughts on this problem:
1) I can't change the grammar -- if it's too ambiguous I'd have
to investigate other means of getting the information.
2) as you pointed out, a newline character isn't a reliable
delimiter in this case (although for every other rule that I can
think of in my grammar it usually is);
the problem with allowing or disallowing tokens in the rule is
that a chat message is written by a human, so it can contain
pretty much anything; when I was playing with Boost::Spirit our
chat message rules essentially disallowed any 'action' from
appearing as the first word after a '\r' or '\n'
the actions themselves are no more than about 20 different keywords,
so it's not a huge performance hit to look for them
also I do have the current list of usernames allowed for these
actions before I start trying to figure out what action is what;
unfortunately since everything gets tokenized BEFORE that logic
happens it doesn't really help me out
as for looking at tokens at the beginning the general action
looks like this:
username verb
our chat rule plays out like:
username: verb
thankfully I haven't come across a username with a ':' in it yet,
but usernames can contain spaces, which also raises the problem of
trailing whitespace after the name
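
To make the heuristics above concrete, here is a rough sketch in Python
(not ANTLR, and not the grammar under discussion) of classifying lines
before any tokenizing happens. It assumes the list of usernames and the
set of action verbs are already known; USERNAMES, ACTION_VERBS and all
function names are made up for illustration. A line that starts with a
known username followed by ':' opens a chat message, a line that starts
with a known username followed by an action verb is an action, and
anything else is treated as a continuation of the previous chat message.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical sample data; the real lists are not given in this thread.
USERNAMES = ["sally sue", "userX"]
ACTION_VERBS = {"slaps", "kicks", "waves"}

Event = Tuple[str, Optional[str], str]  # (kind, username, text)


def match_username(line: str, usernames: List[str]) -> Optional[str]:
    """Return the longest known username that starts the line, or None."""
    for name in sorted(usernames, key=len, reverse=True):
        if line.startswith(name):
            rest = line[len(name):]
            # Require a boundary so "userX" does not match "userXY".
            if rest == "" or rest[0] in " :":
                return name
    return None


def classify(line: str, usernames: List[str], verbs: Set[str]) -> Event:
    """Classify one line as 'chat', 'action' or 'continuation'."""
    name = match_username(line, usernames)
    if name is None:
        return ("continuation", None, line)
    # Tolerate whitespace between the username and what follows it.
    rest = line[len(name):].lstrip()
    if rest.startswith(":"):
        return ("chat", name, rest[1:].strip())
    words = rest.split()
    if words and words[0] in verbs:
        return ("action", name, rest.strip())
    return ("continuation", None, line)


def group_messages(lines: List[str], usernames: List[str],
                   verbs: Set[str]) -> List[Event]:
    """Fold continuation lines into the chat message that precedes them."""
    events: List[Event] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        kind, user, text = classify(line, usernames, verbs)
        if kind == "continuation" and events and events[-1][0] == "chat":
            prev_kind, prev_user, prev_text = events[-1]
            events[-1] = (prev_kind, prev_user, prev_text + "\n" + text)
        else:
            events.append((kind, user, text))
    return events
```

Note that this does not remove the ambiguity in the quoted example
below: a line inside a chat message that happens to look like
"username verb ..." is always taken as an action, which is the same
trade-off as disallowing an action at the start of a line.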
Thanks for the help,
Ian
>
> I'll warn that I haven't looked at the attached files, but I can
> imagine all hell breaking loose with input like this:
>
> sally sue slaps userX with a big fish
> userX: I really like seeing when it says
> sally sue slaps userX with a big fish
> sally sue slaps userX with a big fish
>
>
> It's completely ambiguous what the last two lines really are. We
> might interpret that the first one is a comment and that the second is
> information, but there's nothing in the grammar to clarify the
> uncertainty. Instead of looking for a newline character at the end,
> you could try looking for a token at the beginning. For example:
> ">>" As long as the token is something that isn't _allowed_ in user
> input, you'll be fine.
>
> -------------------------------
> Hank Schultz
> Cedrus Corporation
> http://www.cedrus.com/
--
ian eyberg