[antlr-interest] multi-line chat messages

ian eyberg ian at telematter.com
Wed Apr 15 11:29:41 PDT 2009


On Wed, Apr 15, 2009 at 10:26:58AM -0700, William H. Schultz wrote:

some thoughts on this problem:

  1) I can't change the grammar -- if it's too ambiguous I'd have
to investigate other means of getting the information.

  2) as you pointed out, a newline character isn't a reliable
terminator in this case (although for every other rule that I can
think of in my grammar it is);

  the problem with allowing or disallowing tokens in the rule is
that it's a chat message generated by a human, so they can say
pretty much anything they want; when I was playing with
Boost::Spirit our chat message rules essentially disallowed any
'action' from appearing as the first word after a '\r' or '\n'

  the actions themselves are no more than say 20 different keywords
so it's not a huge performance hit to look for them

  also, I do have the current list of usernames allowed for these
actions before I start trying to figure out what action is what;
unfortunately, since everything gets tokenized BEFORE that logic
happens, it doesn't really help me out
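
  a rough sketch of how the username list and the action keywords
could be used in a line-oriented pre-pass that runs before the lexer
sees the input. This is only an illustration in Python, not ANTLR:
the names USERNAMES, ACTIONS and split_action are hypothetical, and
the sample values are made up. It also does not remove the ambiguity
of a chat message that quotes an action at the start of a line; such
a line is still classified as an action.

```python
# Hypothetical sample data; the real lists would come from the application.
USERNAMES = ["sally sue", "userX"]
ACTIONS = {"slaps", "waves", "kicks"}  # stand-in for the ~20 real keywords


def split_action(line, usernames=USERNAMES, actions=ACTIONS):
    """Return (username, verb, rest) if the line looks like an action, else None."""
    # Try longer usernames first so "sally sue" wins over a user named "sally".
    for name in sorted(usernames, key=len, reverse=True):
        if line.startswith(name + " "):
            words = line[len(name):].split(None, 1)
            if words and words[0] in actions:
                rest = words[1] if len(words) > 1 else ""
                return name, words[0], rest
    return None


def is_action_line(line):
    """True when the line starts with a known username followed by a known verb."""
    return split_action(line) is not None
```

  a line for which is_action_line() returns False would then be kept
as part of the surrounding chat message rather than starting a new
record.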

  as for looking at tokens at the beginning, the general action
looks like this:

  username verb

our chat rule plays out like:

  username: verb

thankfully I haven't come across a username with a ':' in it yet,
but usernames can contain spaces, which also presents the problem of
trailing whitespace
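
a minimal sketch of splitting a chat line on the first ':' and
trimming the whitespace around the username, assuming (as above) that
no username contains a ':'. The function name split_chat is
hypothetical, and this is Python rather than ANTLR. A continuation
line that happens to contain a ':' would also match, so the returned
name would still need to be checked against the known usernames.

```python
def split_chat(line):
    """Return (username, text) for a 'username: text' line, else None."""
    # partition() splits on the FIRST ':' only, so colons in the text survive.
    name, sep, text = line.partition(":")
    if not sep:
        return None  # no ':' at all, so this is not a chat line
    name = name.strip()  # drop leading/trailing whitespace, keep inner spaces
    if not name:
        return None
    return name, text.strip()
```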

Thanks for the help,

Ian

> 
> I'll warn that I haven't looked at the attached files, but I can  
> imagine all hell breaking loose with input like this:
> 
> sally sue slaps userX with a big fish
> userX:  I really like seeing when it says
> sally sue slaps userX with a big fish
> sally sue slaps userX with a big fish
> 
> 
> It's completely ambiguous what the last two lines really are.  We  
> might interpret that the first one is a comment and that the second is  
> information, but there's nothing in the grammar to clarify the  
> uncertainty.  Instead of looking for a newline character at the end,  
> you could try looking for a token at the beginning.  For example:   
> ">>"  As long as the token is something that isn't _allowed_ in user  
> input, you'll be fine.
> 
> -------------------------------
> Hank Schultz
> Cedrus Corporation
> http://www.cedrus.com/

-- 
ian eyberg

