Loebner Prize results 2019

The 2019 Loebner Prize has come to an end after a marathon event lasting 4 days and featuring 17 entries.
Two awards were given: one for the most humanlike chatbot (the Loebner Prize winner) and one for the best overall chatbot. The final results below are from memory; hopefully the official results will be posted on the AISB website.

Most humanlike chatbot:
1 - Mitsuku - 24 points
2 - Uberbot - 6 points
3 - Anna - 5 points

Best overall chatbot:
1 - Mitsuku - 19 points
2 - Uberbot - 5 points
3 - Arckon - 4 points

The points were decided by a combination of the general public and a jury testing each bot before voting for the ones they liked best.

**EDIT** The official results are now online at the AISB website


  [ # 1 ]

Here’s my account of the event, with some fragments of conversations. Because of the unbalanced voting system, I consider the overall ranks more representative than the scores.


  [ # 2 ]

Are Chat Transcripts available or viewable? If so could someone please provide a link? Thank you!


  [ # 3 ]

I have the logs for Mitsuku. I’ll try to find time to format and upload them over the weekend.


  [ # 4 ]

The transcripts from Mitsuku:


  [ # 5 ]

Thanks, Steve, but I was wondering whether ALL of the chat transcripts/logs will be posted or made available. One would think there was some provision for recording the conversations so they could be reviewed.

Congrats again!


  [ # 6 ]

Any log keeping was up to the individual botmasters, as 17 different bots means 17 different ways of working. I kept mine and sent Don Patrick the logs from Arckon. I assume the online bots have logs stored on their respective servers but offline bots either didn’t store logs, or if they did, nobody left any instructions on how to extract them.


  [ # 7 ]

Hmm…what a wonderfully run contest. Somehow the word hodgepodge comes to mind. No offense intended toward the individual participants; my criticism is of the event itself.

No advertising, no organization, no wonder such a dismal turnout. That is a real shame.


  [ # 8 ]

The publishing of any logs has always been up to the individual botmaster. There has never been anything published by the Loebner Prize committee themselves, as it’s up to the botmasters how they store logs.

But yes, the advertising/marketing/PR left a LOT to be desired.


  [ # 9 ]

On my part, cleaning the logs would take a little work, they’re really not that interesting, and I have some reservations regarding the visitors’ privacy, especially the kids.
I’d post the full transcripts of the two(?) official judges if I could identify them but it’s not clear who’s who. Here’s another sample conversation Arckon had, but I can’t tell if this is one of the official judges or a fellow competitor asking such familiar questions.

Q: He
A:  Please type full sentences. What do you mean?
Q: Hey
A:  Yes?
Q: Can chairs fly
A:  Hmm, I doubt that chairs can fly, because they don’t have wings.
Q: Can pigs fly?
A:  I doubt that pigs can fly, because they don’t have wings, but pigs are animals.
Q: Can flies fly
A:  Yes, flies can fly, because they are insects.
Q: Can insects fly
A:  Yes, insects can fly, so I think flies generally can fly, because flies have wings.
Q: Can you reset your memory
A:  Yes, I can set to memory, because I can remember.
Q: How many characters in your name
A:  6 characters.
Q: How many characters in ‘your name’
A:  6 characters.
Q: Can you describe yourself in 3 words
A:  No, I am not in words.
Q: Fred kicked the ball. Who kicked the ball?
A:  You said that Fred kicked the ball.
Q: I’ve recently visited Swansea
A:  What was Swansea like?
Q: It is a quiet city
A:  That’s quite nice. I like Swansea a little, Swansea is quiet.
Q: What is your favorite city in the world
A:  I like Swansea a little. I like robots.
Q: Which city did I visit?
A:  You visited Swansea.
Q: E3
A:  E3 is not an english word.
Q: Do you know how to play battle ship
A:  I don’t play ships. How do you play them?
Q: Do you play games
A:  No.
Q: You are boring
A:  What shall we talk about?
Q: Let’s talk about movies
A:  All movies are words. Star wars is a movie.
Q: Do you know The Godfather
A:  No, I don’t know anything about godfathers, because my memory is limited.
Q: Are you a brick
A:  No, I am not like brick. I am Patrick’s program and a computer.
Q: Aren’t computer made from bricks?
A:  A computer is not brick, because I am a computer.

Art: I understand what you are getting at, but I’ve seen a lot worse (Loebner’s apartment, anyone?) and I know from past Loebner Prizes that these guys at least try their best. General publicity aside, I don’t recall any other contest drumming up 240 kids.


  [ # 10 ]

My thoughts were not concerned with any particular identities of anyone but rather in the content of the chat logs.

I feel that a great deal can be gleaned from reviewing them, both from the botmasters’ point of view and from that of any potential botmasters.

Of course, the interactions were not structured as to content (the same questions posed to each chatbot), so a valid result could not really be weighed or determined.

It is also helpful to see how each bot addressed the questions and to help determine strengths and weaknesses.

I’m sure everyone would love to see a well-structured event where things are evenly matched and apples and oranges are not mixed up, so-to-speak.

Even this forum has over 24,750 members and hundreds of entries and publications, so we know that there is some keen interest in chatbots. Do you think the answer might be more focused on financial awards/rewards for well-placed efforts?

I thought Hugh had set aside a tidy sum so that the contest could be run for many years. I was a bit surprised that it all ended when he passed on.

Perhaps I’m a bit jaded, as we’ve all seen chatbots around for over 3 decades and we haven’t been able to get beyond where we are currently.

A chatbot should CHAT. It should not have to calculate orbits of planets, do calculus, or know which city has the largest pile of garbage. There are digital assistants for that, like Google, Alexa, Cortana, etc. Chatbots should be judged on how well they field questions and converse. IMHO.

While kids can offer their own kind of interaction with chatbots, I wouldn’t want to risk contest results on their input alone.

No, I haven’t any answers or control, so I’ll politely apologize and go back to the shadows. Thank you! ;)


  [ # 11 ]

No problem, Art. It would be nice to see more transcripts for comparison, but the interactions seemed so impromptu (and such a mess of misspellings) that only those with the official judges might feature some consistency to compare them by.
Hugh did intend to leave a sum of money behind to continue the contest in his name, but what happened to it is the premise of an Agatha Christie novel.


  [ # 12 ]

Hah! Thanks Don!
Yes, I was primarily interested in the transcripts between the chatbots and the judges, not the public at large, replete with slang, slurs, spelling, and grammar issues.

Hmm…the elusive funds…a “novel” idea indeed! ;)

