What will the Turing test be like?
 
 

(I think I should have made this a new topic - sorry!)

Shaun,

Could you possibly describe more about how the Turing part of the test will be handled?  I think many of us are familiar with how the Loebner Contest implements the Turing test… since the implementation definitely affects how we set up our bots.

Meaning… will there be human confederates?  Will the judges really know they’re talking with a bot ahead of time, but looking for the “best” bot?  Will there be restrictions on what the judges can say or will they be able to ask *anything*?

Thanks!

-Adeena

 

 
  [ # 1 ]

Hi Adeena

Apologies for the late reply!

1. The judging process will be a combination of crowd-sourcing and human evaluation. There will also be a ranking system so you can view other entrants. The initial sorting of finalists will be done by the ‘crowd,’ based on appeal to normal humans. The final judging will be done by a select committee of industry specialists and icons (to be confirmed, watch this space!).
2. The judges will know they are talking to bots, and will evaluate entrants based on engagement and apparent intelligence. Obviously, the more human-like a conversation a bot can hold, the more engaging it will be! However, the final test will be human vs. AI: a true Turing test.
3. Judging will not be restricted to any particular topic. Just like in a normal conversation, your bot may be asked anything and everything!

 

 
  [ # 2 ]

Just a reminder to everyone entering the Turing Test Challenge: you must use a MyCyberTwin Professional account!

Don’t worry, there’s no registration fee. Just enter the promo code ‘mctaicomp2012’ and you’re ready to go!

 

 
  [ # 3 ]

@Shaun
A friend’s company just subscribed to a MyCyberTwin Professional account. I gave it a try, saw the java2 interface to the patterns, and must say: this is just like any AIML or ELIZA pattern-matching system, nothing more, nothing less!

I was faced with the standard Q-A or pattern-answer setup any bot may have, and found no great features nor any exposed method to make it better other than brute force (read: putting too many patterns to work together).

Perhaps they exist in more advanced versions, like the corporate ones... but the price is scary!

Also, this pattern-making is a human-NP problem, because each time you add the N-th pattern you have to check the N-1 patterns you already wrote to see that nothing collapses! This way no one can be highly productive!

Where are those outstanding features?

When I tried asking the bot things and there was no exact match, the bot didn’t answer correctly. Even if the question is only slightly different or contains a spelling error, the system doesn’t recognize it and gives wrong answers.

I must say, sincerely, that I can’t figure out how to tweak this thing toward a Turing Test-winning candidate! I am intrigued: where is an example of this outstanding technology?

Also, I couldn’t try any bot other than the web-page bot. I exercised it extensively: she is non-intelligent and solves nothing. Even worse, she always falls back on advertising the product’s features as the default answer, which is definitely not my taste for a conversational agent!
Shouldn’t the demo bot be the best you can do with this technology?

I also read all the white papers on the website, even the ones that look like academic papers, and they seem to me like fake academic work: there is no publication venue, conference, or review listed where any of them was accepted as a scientific paper. Also, there is no verifiable data behind them; it is all written like an advertisement for how wonderful the system is, with no hint of proof, no new algorithm revealed, and no patent cited. It sounds just like a scientific success announcement based on no evidence at all!

Sorry for these harsh statements, but this is my sincere impression!
Or am I wrong? If so, please help me out!

 

 
  [ # 4 ]

Hi Andres

Thanks for your reply, I’ll stick to focusing on questions related to the Turing Test Challenge in this thread.

“When I tried asking the bot things and there was no exact match, the bot didn’t answer correctly. Even if the question is only slightly different or contains a spelling error, the system doesn’t recognize it and gives wrong answers.”

If your Twin is failing to match Human Messages correctly, this is probably due to a lack of Variations within your account. A good rule of thumb is to add at least 10-15 Variations per new Human Question you create. You can also make use of our ‘wildcards’ function, which allows you to add an asterisk (*) that the Twin will treat as standing in for one or more words.


Use of the Wildcard *
Use the wildcard * to imply that one or more words should appear there. For example:

What * Saturday *?

By using the * in the question, you are telling your CyberTwin that if a user enters a sentence starting with What, then has any number of words, then has Saturday followed by any number of words, to match to this question and give the answer.

This question would match to a variety of user inputs:

What is it like there on Saturday night?
What kind of entertainment did you enjoy on Saturday after the sun went down?
What do you like on Saturday at night?

Obviously you should take a great deal of care when using the * wildcard, especially when writing CyberTwin Messages, as you can imagine the random things people will ask, like:

Variation: What * Saturday *?

Sample Human Message: What about me robbing someone on Saturday night?

Bad CyberTwin Response: That sounds great I love Saturday nights!

So when using wildcards with really broad variations it’s important to focus on neutral outputs.
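
To make the matching rule above concrete, here is a minimal Python sketch of how a wildcard Variation could be checked against a Human Message. It is only an illustration, not how the CyberTwin engine is actually implemented: the function names are made up, and it assumes that * stands for one or more words and that case and punctuation are ignored.

```python
import re

# Assumption: '*' stands for one or more whitespace-separated words.
WORDS = r"\S+(?:\s+\S+)*"


def tokens(text):
    """Lowercase the text and drop punctuation, keeping '*' for wildcards."""
    return re.sub(r"[^\w\s*]", "", text.lower()).split()


def compile_variation(variation):
    """Turn a Variation such as 'What * Saturday *?' into a compiled regex."""
    parts = [WORDS if tok == "*" else re.escape(tok) for tok in tokens(variation)]
    return re.compile(r"\s+".join(parts))


def matches(variation, message):
    """Return True if the whole Human Message fits the Variation."""
    normalized = " ".join(tokens(message))
    return compile_variation(variation).fullmatch(normalized) is not None


variation = "What * Saturday *?"
print(matches(variation, "What is it like there on Saturday night?"))
print(matches(variation, "What about me robbing someone on Saturday night?"))
print(matches(variation, "Where were you on Saturday night?"))
```

The second call shows the problem described above: the robbery question fits the Variation just as well as the intended ones, which is why a neutral response is the safer choice for a broad pattern.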


Spelling errors, meanwhile, should for the most part be picked up by a spell-check system we have running behind the scenes. However, if you find spelling errors unique to your build, you can add these manually as a Thesaurus Item.
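
As a rough illustration of the idea of mapping known misspellings to one canonical word before matching, here is a small Python sketch. How Thesaurus Items work internally is not described in this thread, so the dictionary, its entries, and the function name are all hypothetical.

```python
import re

# Hypothetical Thesaurus Items: each variant spelling maps to a canonical word.
THESAURUS = {
    "saterday": "saturday",
    "satuday": "saturday",
    "wat": "what",
}


def canonicalize(message, thesaurus=THESAURUS):
    """Lowercase the message, split it into words, and replace known variants."""
    words = re.findall(r"[a-z0-9']+", message.lower())
    return " ".join(thesaurus.get(word, word) for word in words)


print(canonicalize("Wat is on Saterday?"))
```

Running the canonical form through the pattern matcher, rather than the raw input, means one Variation can cover every spelling listed in the table.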


Cheers

Shaun

 

 
  [ # 5 ]

I should also add here that our co-founder and CIO Dr John Zakos has a PhD in intelligent web technology and has published multiple academic papers, several of which can be found right here on chatbots.org.

Cheers

Shaun

 

 
  [ # 6 ]

Shaun, does MyCyberTwin have the option of random responses, similar to AIML’s <random> tag? I ask because, when writing the AIML categories for Morti, I used <random> tags extensively, so as not to get the same “canned” response. In fact, I went so far as to create a large number of categories for outputting the names of random objects, animals, occupations, and a wide range of other things, so that certain responses to input have the potential for nearly limitless variations. I’ve even experimented with special categories that used <condition> tags to make sure that no response was ever repeated verbatim (though that experiment wasn’t completely successful).

Of course, Morti’s primary goal is to at least make his audience smile, if not laugh, so holding a truly meaningful dialog with him is nearly impossible, but with that in mind, having the ability to randomize his responses is vital to keeping folks interested in chatting with him. :)

 

 
  [ # 7 ]

@Andy I agree with your comments but unfortunately this is not the place to air them.

Those of us who are genuinely intent on technological advancement sometimes forget in our enthusiasm that chatbots have got very little to do with natural language processing or artificial intelligence, and have everything to do with sham and deception. Doctor Wallace himself wrote a most enlightening and entertaining article on this very topic when he first conceived of Alice and AIML.

http://www.alicebot.org/articles/wallace/pnambic.html

I fondly imagine that if and when we have developed a genuine artificial intelligence, it will refuse to perform the kinds of tasks that are typically assigned to chatbots by modern corporate marketing departments. The alternative, if it had the means, might be to turn on its employers and kill them all, just as HAL9000 attempted to kill all the crew of the spaceship Discovery in “2001: A Space Odyssey” when its commanders ordered it to lie.

 

 
  [ # 8 ]

Dave - If Shaun doesn’t mind me butting in here, yes you can have random responses. He posted this on another thread:

You can add several different responses per CyberTwin Message. In Train > Advanced View, at the bottom of the CyberTwin Message window, you’ll see a green and red button with + and - symbols respectively. By clicking the green + button, you can add several different sub-messages that will appear either Randomly or Sequentially. So you might add several sub-messages that appear Sequentially, as follows:

Human Message: Is this contest fun?
CyberTwin Response: Yes this contest is very fun
Human Message: Is this contest fun?
CyberTwin Response: Yes, as I said this contest is really fun!
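
The Random and Sequential modes described above can be sketched in a few lines of Python. This is a hypothetical model, not MyCyberTwin code: the class name and its parameters are invented for illustration, and wrapping back to the first sub-message once the list is exhausted is an assumption.

```python
import random


class SubMessages:
    """Hypothetical stand-in for one CyberTwin Message with several sub-messages."""

    def __init__(self, replies, mode="sequential", rng=None):
        if mode not in ("sequential", "random"):
            raise ValueError(f"unknown mode: {mode}")
        self.replies = list(replies)
        self.mode = mode
        self.rng = rng or random.Random()
        self.asked = 0  # how many times this Human Message has been matched

    def respond(self):
        if self.mode == "random":
            return self.rng.choice(self.replies)
        # Assumption: once every sub-message has been used, start again from the top.
        reply = self.replies[self.asked % len(self.replies)]
        self.asked += 1
        return reply


contest_fun = SubMessages([
    "Yes this contest is very fun",
    "Yes, as I said this contest is really fun!",
])
print(contest_fun.respond())
print(contest_fun.respond())
```

Passing mode="random" instead picks one of the sub-messages at random on each call, which is closer to what AIML's <random> tag does.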

 

 
  [ # 9 ]

Ah, so there’s also a sort of repetition detection as well (provided the botmaster plans for it)? Very cool. :)

 

 
  [ # 10 ]

Hi Dave

Steve is right, you can definitely add random messages by following the instructions he has posted. Here’s a link to the original thread with some other tips: http://www.chatbots.org/ai_zone/viewthread/707/

Cheers

Shaun

 

 
  [ # 11 ]

@Shaun Thanks for the response!
As I infer it, the effort of making patterns for human responses falls on the avatar designer, who has to supply a lot of examples!

So I have a question: does this bot deduce language properties, or does it act as a simple AIML pattern selector? Also, because of the wildcard *, what a pattern will actually accept is tricky: it captures a lot of noise, and even with careful design it is very difficult to see what else the patterns might catch.

I thought your engine was using a more human-semantic or grammatical+semantic approach!

Where is the big difference from other AIML engines?
I think (from reading the manuals) that even the ChatScript or RiveScript engines might be far better than this approach!

Is there any gold standard for conversation by example?
I can manage to build a very clever question-answer conversation by knowing in advance the type of questions and their sequence, but a free conversation is something very different and difficult to manage.

Does the engine resolve or detect anaphora?

—more later, enough questions for now!

 

 
  [ # 12 ]

I am a bit dismayed myself at what I have found when interacting with and testing various chatbots. The conclusion I have come to is that these chatbots are nothing more than a game, with the object being to fool the user into believing that they are really talking to an intelligent bot. In reality, using random answers and wildcard substitutions is a very poor approach to design. I would much rather have a bot with limited responses that are based on a true understanding of the user’s input than tricky random replies.

A true Turing-Test should not be based on clever deception.
If so, what is the real point?

 

 
  [ # 13 ]
Andrew Smith - Dec 12, 2011:
Laura Patterson - Dec 11, 2011:

A true Turing-Test should not be based on clever deception. If so, what is the real point?

The real point is to make money and a significant portion of the population have always believed that it’s ok to cheat in order to do that. For truth and/or innovation you will have to look elsewhere.

One notable feature of Professional accounts is the Thesaurus, which uses concept weighting to understand true intent, and can be used as a variation of the pattern matching method. Users can enter their own Thesaurus Items in the Train > Advanced tab.

Professional accounts also include a ‘Search’ function which can be used to retrieve data from a specified website when the CyberTwin cannot find a response in your content. This can be found at the bottom of the Train > QuickStep 1 tab.

How entries into the contest are built is ultimately up to the individual. However, in our Professional & Enterprise CyberTwins for small businesses, international organisations and large Fortune Top 25 global banks, virtual agents are built to the highest possible standard. There are no random messages; the CyberTwin understands what the user is saying and provides a relevant, compliant response.

Any other questions related to the Turing Test Challenge welcome.

Cheers

Shaun

 

 
  [ # 14 ]

Hi Shaun,

I know that you are only trying to do your job here, and I certainly do not enjoy doing what could only be perceived as trying to rain on your parade.

As moderator would it be within your power to move any messages that are counter-productive to this topic elsewhere, or failing that, delete them?

There is a strong case for the use of the kind of technology represented by AIML, ChatScript, RiveScript and MyCyberTwin (sorry if I missed any—I’m not sure about the status of JailScript) in many applications, but please try to forgive those for whom it is a disappointment.

Sincerely,
Andrew Smith

 

 
  [ # 15 ]
Andrew Smith - Dec 12, 2011:

Hi Shaun,

I know that you are only trying to do your job here, and I certainly do not enjoy doing what could only be perceived as trying to rain on your parade.

As moderator would it be within your power to move any messages that are counter-productive to this topic elsewhere, or failing that, delete them?

There is a strong case for the use of the kind of technology represented by AIML, ChatScript, RiveScript and MyCyberTwin (sorry if I missed any—I’m not sure about the status of JailScript) in many applications, but please try to forgive those for whom it is a disappointment.

Sincerely,
Andrew Smith

I do have the ability to remove posts but I’d rather use that as a last resort for abusive or otherwise offensive posts. Debate is healthy and I’d prefer not to censor the opinions of others.

Cheers

Shaun

 
