Hello
 
 

I am new here, and it looks like a great site.
Do you have any criteria or tests that can be used to judge whether a particular chatbot is any good or is getting better, or some particular system for measuring how well a given bot is working?

 

 
  [ # 1 ]

Hello, Kevin, and welcome to chatbots.org. :)

In all honesty, there’s nothing of that sort that’s generally available. Mostly, everyone here just does their own research (or not, as their preference dictates) into the various systems, and shares their opinions, or not. Many of us here have our own independent projects, basically following our “own paths”, so there is, understandably, a great disparity of opinion. It should also be noted that very few of us have “comprehensive” knowledge and experience with every platform and application available, and fewer still have made public comparisons.

However, all of that notwithstanding, it would be very good if someone were to take the time and make the effort to do so. Sadly, the time constraints that I’m currently under prevent me from performing such a task. Perhaps you may know of someone who has the time and/or desire to do it? :)

 

 
  [ # 2 ]

There are some tests here if you want to run your bot through its paces. I believe they were devised by someone here.
http://vhmats.iwarp.com/rich_text_2.html

 

 
  [ # 3 ]

And this is why they should never have handed me the keys to this place. :)

Thanks for that, Steve. I was unaware of that. :)

 

 
  [ # 4 ]

Do you have regular or yearly competitions? Is there a top 10 or the like?

 

 
  [ # 5 ]

You can still get in under the wire for the Robo Chat Challenge, I believe.

VLG

 

 
  [ # 6 ]

Thanks for the reference, Steve. I wasn’t sure anyone accessed my site, since it has only been shared with forum members.

The VHMATS site is designed to evaluate the level of proficiency of agents performing functions documented in the Mind Map diagram at http://mindmap.iwarp.com/.  The test questions are not complete, but give an indication of what a standard test set could become.  I would like to see something similar developed for chatbots.

I welcome any suggestions for additional test questions.

 

 
  [ # 7 ]

This is a very naive, simplified idea. There are a lot of ifs, and much of it would need further investigation before it could be implemented.
If some form of in-house testing standard were created, we could have every human and every chatbot take it. If the stored data were made available to all, I think it would be very useful.
If we then had, say, a monthly ranking of bots and published the results, it would create some very healthy competition and drive. Not that I don’t believe most members here have plenty of drive.
Putting your bot up for retest would give you feedback on how you are progressing.
The problem here is that you need some intelligent entity to do the assessment. Alternatively, maybe we could get the input of all members in a voting system (BOT OR NOT) on a scale of 1 TO 10, 10 = HUMAN, 1 = BOT. Then we could get some very healthy in-house competition and some very useful data. I believe the feedback would be very helpful in creating clearer targets and goals for creators, and would help drive the field.
The content of the test could be gathered from members, by submission of possible questions and procedures.
I know this would take some time and effort to set up, but it might be worth it. If enough members felt it was worthwhile, we could put together a team made up of members.
Anyway, just voicing some of my thoughts.
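
To make the voting idea a bit more concrete, here is a minimal sketch in Python of how member votes on the 1 to 10 scale might be turned into a ranking. Everything in it is an assumption for illustration: the function name `rank_bots`, the choice of a plain average as the score, and the bot names and votes in the sample data are all invented, and a real system would need to decide how to handle things like bots with very few votes.

```python
from collections import defaultdict
from statistics import mean


def rank_bots(votes):
    """Rank bots by average vote.

    `votes` is an iterable of (bot_name, score) pairs, where score is on
    the 1 to 10 scale (10 = HUMAN, 1 = BOT). Returns a list of
    (bot_name, average_score, vote_count), best average first.
    """
    scores = defaultdict(list)
    for bot, score in votes:
        if not 1 <= score <= 10:
            raise ValueError(f"score out of range for {bot}: {score}")
        scores[bot].append(score)

    # Highest average first; ties are broken alphabetically by bot name.
    return sorted(
        ((bot, mean(vals), len(vals)) for bot, vals in scores.items()),
        key=lambda row: (-row[1], row[0]),
    )


if __name__ == "__main__":
    # Invented sample votes, purely for illustration.
    sample = [("BotA", 7), ("BotA", 5), ("BotB", 9), ("BotB", 8), ("BotC", 3)]
    for position, (bot, avg, count) in enumerate(rank_bots(sample), start=1):
        print(f"{position}. {bot}: {avg:.1f} from {count} votes")
```

Running the same function on each month's votes would give the monthly table, and a bot put up for retest would simply show up with a new average.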

 

 
  [ # 8 ]

I rather like the idea, personally. You could also use questions and/or transcripts from other chatbot competitions (with permission of the organizers of said competitions, of course!). One problem that I can see is that some chatbot hosting services don’t allow automated conversations with some or all of the chatbots they host, as it’s a violation of their Terms of Service. There’s also the challenge of devising a “one size fits all” interface API, which is no small feat. Trust me, I’ve tried. :)

Still, I think your ideas have merit, and deserve further discussion.
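
As a rough illustration of what a common interface could look like, here is a minimal Python sketch. It is only one possible shape, not a description of any existing API: `ChatbotAdapter`, `EchoBot`, and `run_test` are hypothetical names, and the sketch assumes the simplest possible contract, where a bot takes one message and returns one reply. Real bots would each need their own adapter wrapping whatever web form, socket, or library call they actually use.

```python
from typing import Protocol


class ChatbotAdapter(Protocol):
    """Hypothetical minimal contract a test harness could rely on."""

    name: str

    def reply(self, message: str) -> str:
        """Return the bot's answer to a single message."""
        ...


class EchoBot:
    """Toy stand-in for a real bot, used only to exercise the harness."""

    name = "EchoBot"

    def reply(self, message: str) -> str:
        return f"You said: {message}"


def run_test(bot: ChatbotAdapter, questions: list[str]) -> list[tuple[str, str]]:
    """Ask each question in order and collect (question, answer) pairs."""
    return [(question, bot.reply(question)) for question in questions]


if __name__ == "__main__":
    for question, answer in run_test(EchoBot(), ["Hello", "What is your name?"]):
        print(f"Q: {question}\nA: {answer}")
```

The hard part mentioned above is everything this sketch leaves out: sessions, timeouts, and hosts whose Terms of Service forbid automated conversations.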

 

 
  [ # 9 ]
Toborman - Nov 11, 2012:

Thanks for the reference, Steve. I wasn’t sure anyone accessed my site, since it has only been shared with forum members.

I found it a very useful site, and it helped me in designing how Mitsuku works.

Kevin Titcombe - Nov 11, 2012:

... maybe we could get the input of all members in a voting system (BOT OR NOT) on a scale 1 TO 10, 10= HUMAN, 1 = BOT

I don’t see the point in making it “human”, as it means we deliberately have to dumb it down so the bot only guesses at questions like “What is the population of Russia?” or “What is the square root of 15?”. Many people who tried the Turing Test version of my bot have said it was human purely because of the deliberate typos, backspacing and errors it makes. I think we should concentrate on making bots as smart as possible and not try to emulate a human.

 

 
  [ # 10 ]

You’re right, Steve, in that we shouldn’t make our chatbots “mimic” humans, nor should they be judged on the accuracy of their answers to such questions, either. After all, how many “humans” know the square root of 15, or even know off the top of their heads how to punch in the correct sequence of keys on a calculator to get the right answer? Nor do I think that we should have our bots emulate human imperfection through “tricks” like the ones you’ve mentioned.

However, what we can (and should) strive for is to make our chatbots feel human, or at least “real”, and that’s a much more difficult and elusive goal. After having chatted with Mitsuku a great many times, I think that you’ve achieved this, to some degree, and I hope that I’ve done the same with Morti (though I know I’ve still got a VERY long way to go, there).

Judging that sort of “feeling” requires a certain level of subjectivity, I think, and is virtually impossible to quantify and/or qualify, but I also think that it’s worth the effort, don’t you? :)

 

 