Conversational Chatbot Contest
 
 

While passing the Turing test may be the ultimate holy grail of NLP, it seems to me that along the way there are other grails that should receive some recognition. In looking over the Loebner prize and the CBC, it seems to me that too much emphasis ultimately falls on the knowledge-base underlying the system. Specifically, if one knows that their bot is going to be asked “simple” questions on any subject of general knowledge, the range is pretty overwhelming.

But, my purpose is not to diss the existing contests, but rather to suggest that it would be nice to have a contest that instead focused on the general conversational skills of each chatbot. (Perhaps such a contest exists that I am unaware of?)

Basically, the idea would be to have judges that attempt to interact with a given bot conversationally, without trying to trick it in any way. Rather, they try to play along as if they were a person who accepted prima facie that they were interacting with an intelligent and aware “being”.

If the bot was programmed for general conversation, the judge could start by talking about the weather and then go wherever the conversation led. If the bot focused on a specific subject, the judge would play along within that area—for example, in interacting with Eliza as psychoanalyst, the judge’s score would be based on how well the analysis went, how varied and insightful Eliza’s responses were, etc.

I guess the reason for me to suggest such a contest is that I am working on a program that will attempt to have a philosophical conversation with the user. Therefore, I’m unlikely to spend much time programming it to deal in any sophisticated manner with a user that babbles on about some unrelated subject like sports, etc.

How would such a contest be scored? Perhaps others who find this idea interesting can weigh in on this thread. Here are some criteria that come to mind:

(1) Grammatical and semantic accuracy, especially when referring to material previously input by the user.

(2) Accuracy in responding to what the user is saying.

(3) Depth of comments, such that the bot gives at least some impression of intelligence and insight.

(4) Longevity, i.e., how long a bot “lasts” before it becomes painfully obvious that it’s not as smart as one might have hoped.
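To make the scoring question concrete, here is a minimal sketch of how the four criteria might be combined. Everything in it is an assumption for illustration: the 0-10 rating scale, the criterion names, the equal default weights, and the function name `score_bot` are all hypothetical, not a proposal for official rules.

```python
from statistics import mean

# Hypothetical short names for criteria (1)-(4) above.
CRITERIA = ("grammar", "accuracy", "depth", "longevity")


def score_bot(judge_ratings, weights=None):
    """Combine per-judge ratings (assumed 0-10 per criterion) into one score.

    judge_ratings: list of dicts, one per judge, mapping criterion -> rating.
    weights: optional dict mapping criterion -> weight (defaults to equal weights).
    """
    if weights is None:
        weights = {c: 1.0 for c in CRITERIA}
    total_weight = sum(weights[c] for c in CRITERIA)
    # Average each criterion across judges first, then take the weighted mean.
    per_criterion = {c: mean(r[c] for r in judge_ratings) for c in CRITERIA}
    return sum(weights[c] * per_criterion[c] for c in CRITERIA) / total_weight


ratings = [
    {"grammar": 8, "accuracy": 6, "depth": 4, "longevity": 6},
    {"grammar": 6, "accuracy": 6, "depth": 6, "longevity": 6},
]
print(score_bot(ratings))
```

A contest that cared more about one criterion, say depth, could pass a `weights` dict that favors it without changing anything else.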

Eulalio

 

 
  [ # 1 ]

There are a number of us, myself included, who feel that more competitions, regardless of their nature, couldn’t help but be useful in helping us improve and refine our bots. But how would one judge a bot like Morti, whose primary goal right now is simply to entertain the visitor, rather than to fool a judge into thinking it’s human, or to know a lot about any given subject? Morti says a lot of things that don’t necessarily “fit” the criteria listed above, so I would maybe add to your list something about how much the judge smiled. :)

 

 
  [ # 2 ]

I agree entirely, Dave. In fact, that’s exactly what I would like to see, a contest with more emphasis on the enjoyment the bot gives. Indeed, there’s a great art to programming a bot so that it can deal with input that it doesn’t understand without totally falling apart. Clever diversions, subject changes, references to past comments, etc. can all deftly do the job. And, if you think about it, what could be more human than avoiding a subject one doesn’t wish (or isn’t able) to opine on?!

That said, the second criterion would be modified to something like:

(2) Apparent accuracy in responding to what the user is saying, even if that means sidestepping the actual content. e.g.,

USER: r u a Knicks fan
BOT [internally: What the crap is this idiot babbling about?]: [externally] Fair question. Maybe yes, maybe no. Are you?

How is this accomplished?

(1) Convert common online shorthand to get “Are you a Knicks fan”.

(2) Recognize the inversion of the verb to be + pronoun, indicating that this is a yes/no question about the bot.

(3) Respond by showing recognition that you’ve been asked a question even though the user left off the q-mark. “Fair question.”

(4) Indicate that you know it’s a yes/no question as opposed to a fill-in-the-blank question like “What are you having for dinner?” Respond so as to avoid answering: “Maybe yes, maybe no.”

(5) Finish by throwing the question back at the user with something that would fit virtually any “Are you ...” question: “Are you?”

Bottom line: A lot can be accomplished without any reference whatsoever to a knowledge base.
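The five steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the shorthand table, the set of question openers, and the function names `expand_shorthand` and `deflect` are all hypothetical, and a real bot would need far broader coverage.

```python
import re

# Hypothetical shorthand table (step 1); a real bot would need a much larger one.
SHORTHAND = {"r": "are", "u": "you", "ur": "your"}

# Forms of "to be" that, followed by "you", signal a yes/no question about the bot (step 2).
YES_NO_OPENERS = {"are", "were"}


def expand_shorthand(text):
    """Lowercase the input, split it into words, and expand known shorthand."""
    words = re.findall(r"[a-z']+", text.lower())
    return [SHORTHAND.get(w, w) for w in words]


def deflect(text):
    """Return a knowledge-free reply to a 'be + you' question, else None."""
    words = expand_shorthand(text)
    if len(words) >= 2 and words[0] in YES_NO_OPENERS and words[1] == "you":
        # Steps 3-5: acknowledge the question, dodge it, and throw it back.
        return f"Fair question. Maybe yes, maybe no. {words[0].capitalize()} you?"
    return None  # Not a recognized yes/no question; some other rule must handle it.


print(deflect("r u a Knicks fan"))  # Fair question. Maybe yes, maybe no. Are you?
```

Note that the check never looks at "Knicks" or "fan" at all, and it does not require a question mark, which is the whole point: the reply depends only on the shape of the input.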

 

 
  [ # 3 ]
Eulalio Paul Cane - May 10, 2011:

Bottom line: A lot can be accomplished without any reference whatsoever to a knowledge base.

I agree, and I posted some ideas along these lines before (I don’t remember exactly where; I think it was on the Loebner thread).

There could be a goal-based contest, scoring, for example, a bot’s capability for:

1) Robustness on input (handling misspelled words, and being able to tell them apart from unknown real words)
2) Understanding weird (ungrammatical) phrases
3) Handling derived and inflected words
4) Finding co-references (anaphoric reasoning)
5) Handling turn-taking across the whole conversation
6) Making deductions based on user inputs (shallow reasoning)
7) Doing math and logic
8) Giving good and creative answers even if the bot doesn’t know the fact or response (and not through a fancy random-answer mix-up algorithm)
9) Being able to “refine” concepts and ask for details to complete an idea or a line of reasoning

and many, many more.
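As a rough illustration of item 1, here is one way a bot might tell a misspelled word apart from a genuinely unknown one, using fuzzy matching from Python's standard `difflib` module. The tiny vocabulary, the 0.8 similarity cutoff, and the function name `classify_word` are assumptions made up for this sketch.

```python
import difflib

# Hypothetical vocabulary; a real bot would load a full word list.
VOCABULARY = sorted({"are", "you", "a", "fan", "of", "the", "weather", "philosophy"})


def classify_word(word, vocabulary=VOCABULARY, cutoff=0.8):
    """Label a word as 'known', 'typo' (with a suggested fix), or 'unknown'."""
    w = word.lower()
    if w in vocabulary:
        return ("known", w)
    # A close match to a known word suggests a misspelling rather than a new word.
    matches = difflib.get_close_matches(w, vocabulary, n=1, cutoff=cutoff)
    if matches:
        return ("typo", matches[0])
    # Nothing similar: treat it as a real word the bot has simply never seen.
    return ("unknown", w)


for word in ("weather", "wether", "Knicks"):
    print(word, classify_word(word))
```

With the "unknown" label in hand, the bot can ask what the word means instead of silently "correcting" it, which connects to item 9.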

 

 

 

 
  [ # 4 ]

Good input, Andy. I guess the bottom line is that he who makes the contest (and puts up the prize money) gets to make the rules. But maybe in time, with enough feedback, a more subjective contest of this sort might get started. Perhaps some of us here can start one ourselves with no prizes other than gentleperson’s honor!

 

 