

A chatbot 30% better than Mitsuku

Google working on human-like chatbots that contextually respond to anything:



  [ # 1 ]

Interesting but they are not allowing anyone to talk to the chatbot to validate their claims.

I hope this doesn’t sound like sour grapes, as I’m genuinely interested in this chatbot and hope it is as good as they claim. But let’s assume for a moment that it wasn’t Google who had posted this. To me, the post reads:

“Hi all. I’ve created a new measurement for chatbots and announced myself as the best at this measurement. My work is about 30% better than the next best chatbots. However, I’m not letting anyone talk to it.”

We see these posts regularly on chatbot forums and usually just laugh them off as a joke with a “please repost when you have something concrete to back up your claims”.
A while ago, Google claimed to have developed something amazing called Duplex. This too was heralded (by them) as the next big thing. It later turned out that 25% of calls handled by Duplex were dealt with by real people, and Duplex slowly sank out of the news.

I hope it is genuine, as chatbot development has progressed little since the ELIZA days but until something is released that we can actually try, I’ll reserve judgment on it for now.


  [ # 3 ]

I mainly find their scoring method worth questioning. It’s easy to score high on one’s own metrics and I think sensible responses should be a prerequisite more than part of an average. Having said that, it is a step forward to automated judgement of conversation quality, and that would be useful to chatbot creators.

The ability to converse freely in natural language is one of the hallmarks of human intelligence, and is likely a requirement for true artificial intelligence.

I am disturbed by professionals using the term “true artificial intelligence”, which until now was the go-to term for people who couldn’t be bothered to define what they were talking about. It is equally surprising to see them call conversational ability a requirement for intelligence, when everybody has known full well, ever since chess computers succeeded with brute force, that replicating the results is not the same as replicating the (intelligent) processes. The Bit in the 1982 movie Tron could only say yes and no, but did so intelligently.

The last time Google claimed something like this with a similar approach, their model didn’t process the word “not” correctly, nor the passive voice, both of which completely change the meaning of a sentence. I do think the overall results look impressive as long as you don’t get too specific and don’t mind the occasional contradiction, but the willful ignorance behind their notions severely undermines my respect for their achievement.


  [ # 4 ]

These questions are friendly to the Meena team.  Politely saying… Oh my goodness!

Did they really just compare Mitsuku, running on an ordinary Central Processing Unit (CPU), to Meena, running on a Tensor Processing Unit (TPU) built for artificial intelligence?

Are you kidding me?  This is so entertaining, actually.  I consider this good, fun news to chat about.

Did they forget Mitsuku is using 2 or more cores, while Meena is using 2,048 Tensor Processing Unit (TPU) v3 cores?

I can’t believe Steve Worswick successfully took on a gigantic tech company like this.


  [ # 5 ]

Thanks! :)
Mitsuku takes up 40 MB of disk space and runs in about 4 MB of memory. It would be interesting to compare specs for Meena.


  [ # 6 ]

Google’s Sensibleness and Specificity Average (SSA) takes two scores and averages them. This score seems to correlate with perplexity, but perplexity has never been a great scoring metric for neural nets trying to be chatbots.

Humans rate 86% on SSA. They hit 97% on Sensibleness (which implies 75% on Specificity).
For me, this means that humans almost always say something sensible, but only about 3/4 of the time do they say something specific. It would seem that more work needs to be done on the scoring metric. This makes sense to anyone who has had a conversation with both Mitsuku and Cleverbot; it is hard to believe that they both score the same.
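The averaging above can be sketched in a few lines. This is only an illustration of the arithmetic (the real evaluation has more detail, such as how non-sensible responses are treated); the per-response yes/no labels here are hypothetical:

```python
# Sketch of an SSA-style score: human raters label each bot response as
# sensible (1/0) and specific (1/0); SSA averages the two overall rates.

def ssa(sensible_flags, specific_flags):
    """Return the average of sensibleness rate and specificity rate, in %."""
    sensibleness = 100 * sum(sensible_flags) / len(sensible_flags)
    specificity = 100 * sum(specific_flags) / len(specific_flags)
    return (sensibleness + specificity) / 2

# With the human figures quoted above (97 of 100 responses sensible,
# 75 of 100 specific), the average works out to 86:
human_ssa = ssa([1] * 97 + [0] * 3, [1] * 75 + [0] * 25)
print(human_ssa)  # 86.0
```

This also shows why a bot can hide a weak specificity score behind a strong sensibleness score: the average treats the two as interchangeable.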

Meena should be a better bot than other neural-net-trained bots. The number of parameters (2.6 billion) and the size of the training set (341 GB of text) were the largest to date. Google trained Meena for 30 days on a TPUv3 Pod (2,048 TPU cores, 16 GB of memory each) on the Meena dataset. The cost of this project looks to be in the millions.
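Some back-of-the-envelope arithmetic on the figures quoted above puts the scale in perspective. The 4-bytes-per-parameter assumption (32-bit floats) is mine, not something Google has published:

```python
# Rough footprint of the quoted Meena figures.
# Assumption: weights stored as 32-bit floats (4 bytes each).

params = 2.6e9          # 2.6 billion parameters (quoted)
bytes_per_param = 4     # fp32 assumption
weight_gb = params * bytes_per_param / 1e9
print(f"weights alone: ~{weight_gb:.1f} GB")   # ~10.4 GB

cores = 2048            # TPUv3 Pod cores (quoted)
mem_per_core_gb = 16    # memory per core (quoted)
pod_gb = cores * mem_per_core_gb
print(f"total pod memory: {pod_gb} GB")        # 32768 GB
```

So the weights alone are hundreds of times larger than Mitsuku's entire 40 MB footprint, before counting training-time memory for activations and optimizer state.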

As a reference, the Skynet-AI core is about 1.3 MB and runs in any JavaScript-enabled web browser.


  [ # 7 ]

Ahhh, the fictional bot, also known as Google’s Meena.

I find it ironic that Google says they have the best bot, but nobody can see it in the wild and there is no way to replicate it. I do not think you can say you are the best at anything unless it can be independently verified.

They claim that they cannot make it available for “security reasons” or some other reason.  And the source is not available.  I call BS on that.

I will not believe it until I see it.

Plus, their study included conversations with only fellow chatbot data scientists. That is unbalanced, to say the least.

Even if this is real, the premise of their bot is flawed from the outset. Basically, it is supposed to return the most common response to anything you ask it. What does that give you? A common response to anything you ask of it. Where is the personality, the backstory, the point of view? Sorry to be so negative, but I am not buying it.


  [ # 8 ]

Good day, everyone. Where can I get a compilation of questions thrown at chatbots?
So far I’ve been able to get transcripts of some Loebner Prize contests, but I’d like to have a larger database that I can work on. Thanks.

