[antlr-interest] multi-line chat messages
    ian eyberg 
    ian at telematter.com
       
    Wed Apr 15 11:29:41 PDT 2009
    
    
  
On Wed, Apr 15, 2009 at 10:26:58AM -0700, William H. Schultz wrote:
some thoughts on this problem:
  1) I can't change the grammar -- if it's too ambiguous I'd have
to investigate other means of getting the information.
  2) as you pointed out, a newline character isn't reliable in this
case (though for every other rule I can think of in my grammar it
usually is);
  the problem with allowing or disallowing tokens in the rule is
that it's a chat message generated by a human, so they can say
pretty much anything they want; when I was playing with
Boost::Spirit our chat message rules essentially disallowed any
'action' from appearing as the first word after a '\r' or '\n'
  the actions themselves are no more than say 20 different keywords
so it's not a huge performance hit to look for them
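
A minimal sketch of that kind of check, written in Python rather than
Spirit or ANTLR. The ACTIONS set and both function names are
hypothetical; only "slaps" comes from this thread, and the real list
would hold the 20 or so actual keywords.

```python
# Hypothetical action keywords; the real list has roughly 20 entries.
ACTIONS = frozenset({"slaps", "joins", "leaves"})


def split_physical_lines(text):
    """Split on '\r\n', '\r' or '\n' and drop blank lines."""
    return [line for line in text.splitlines() if line.strip()]


def first_word_is_action(line):
    """Return True if the first whitespace-delimited word is an action."""
    words = line.split(maxsplit=1)
    return bool(words) and words[0] in ACTIONS
```

With a set this small the lookup per line is a single hash probe, so
checking every line after a '\r' or '\n' costs very little.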
  also I do have the current list of usernames allowed for these
actions before I start trying to figure out what action is what;
unfortunately since everything gets tokenized BEFORE that logic 
happens it doesn't really help me out
  as for looking at tokens at the beginning, the general action
looks like this:
  username verb
our chat rule plays out like:
  username: verb
thankfully I haven't come across a username with a ':' in it yet,
but usernames can contain spaces, which also presents the problem of
trailing whitespace
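
A rough sketch of how the known username list could drive a line-level
pre-pass, in Python rather than ANTLR. USERNAMES, ACTIONS and
classify() are hypothetical names, and the sample values are taken
from the example quoted below. It assumes the username list is
available before the lines are examined.

```python
# Hypothetical sample data, taken from the example in this thread.
USERNAMES = ["sally sue", "userX"]
ACTIONS = frozenset({"slaps"})


def classify(line, usernames=USERNAMES, actions=ACTIONS):
    """Return (kind, username, rest); kind is 'chat', 'action' or 'unknown'."""
    # Try longer usernames first so a name containing spaces is matched
    # whole instead of stopping at a shorter name that is its prefix.
    for name in sorted(usernames, key=len, reverse=True):
        if not line.startswith(name):
            continue
        rest = line[len(name):]
        # Tolerate whitespace between the username and the ':'.
        stripped = rest.lstrip()
        if stripped.startswith(":"):
            return ("chat", name, stripped[1:].strip())
        # An action needs whitespace after the username, then a known verb.
        words = rest.split(maxsplit=1)
        if rest[:1].isspace() and words and words[0] in actions:
            return ("action", name, rest.strip())
    return ("unknown", None, line)
```

This only labels single lines. A continuation line of a chat message
that happens to look like "username verb" is still reported as an
action, so the ambiguity raised in the quoted message remains.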
Thanks for the help,
Ian
> 
> I'll warn that I haven't looked at the attached files, but I can  
> imagine all hell breaking loose with input like this:
> 
> sally sue slaps userX with a big fish
> userX:  I really like seeing when it says
> sally sue slaps userX with a big fish
> sally sue slaps userX with a big fish
> 
> 
> It's completely ambiguous what the last two lines really are.  We  
> might interpret that the first one is a comment and that the second is  
> information, but there's nothing in the grammar to clarify the  
> uncertainty.  Instead of looking for a newline character at the end,  
> you could try looking for a token at the beginning.  For example:   
> ">>"  As long as the token is something that isn't _allowed_ in user  
> input, you'll be fine.
> 
> -------------------------------
> Hank Schultz
> Cedrus Corporation
> http://www.cedrus.com/
-- 
ian eyberg
    
    