AI Zone Admin Forum

Chatbot Battles is the best bot contest yet!

Steve Worswick has done an excellent job creating a lot of excitement with the Chatbot Battles contest.

I realized this morning that Steve has discovered five key features that make it a lot more exciting than previous chatbot contests:

1. Bot vs. Bot. Steve has really figured out that these contests are about bots vs. bots, not bots vs. humans.  The Turing Test is famous but as a contest format it is flawed because there is no real way to measure whether a bot can “beat” a human in conversation.  Conversation is not a game like chess or Jeopardy.  For the botmasters, the Loebner contest has always been about picking the best bot.  The Loebner contest is structured so that the judges are trying to “out” the bot. 

2. Low Entry Barrier.  The barrier to entry is low.  There are far fewer technical hurdles than in the Loebner contest. Steve got 42 working entries, compared to 14 entries to the Loebner, of which only 5 were tested.  More entries mean more attention and more excitement around the contest. 

3. Sports League Scoring.  Steve has also figured out the best competition format devised so far.  He’s based it on the soccer World Cup.  You can read the details of the league/knockout schedule on the Chatbot Battles pages.  The effect on the potential audience is electrifying.  Rather than wait around for the contest to run, wait for the judges to tally the results, and then see a final list of scores, we get to see the partial results as soon as the matches are played.

4. Rapid Reporting. (This is related to 3).  The posting of scores is nearly real time.  This makes it more like a sporting event where you can see the latest scores in the paper (or online) almost as soon as the games are played.  For the types of people who follow Major League Baseball or the World Cup, watching the individual matches and tracking the scores is addictive. 

5. Lots of matches. The 42 bots are divided into 7 leagues of 6 each.  Within a league, each bot is matched with every other bot, so there are 6 choose 2 = 15 “battles” within each league.  All told there are 105 league battles + 16 more battles in the knockout stages.  This ongoing competition will keep people going back to the site to check the scores.
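
As a quick check of the arithmetic in point 5, here is a minimal Python sketch. It assumes a plain single round-robin within each league (every bot meets every other bot once); the bot names are placeholders, not actual entrants.

```python
from itertools import combinations
from math import comb

NUM_LEAGUES = 7   # from the post: 42 bots in 7 leagues
LEAGUE_SIZE = 6   # 6 bots per league


def league_battles(bots):
    """Return every pairing in a single round-robin league."""
    return list(combinations(bots, 2))


# Placeholder names, purely for illustration.
league = [f"bot{i}" for i in range(1, LEAGUE_SIZE + 1)]

battles_per_league = len(league_battles(league))
total_league_battles = NUM_LEAGUES * battles_per_league

print(battles_per_league)    # pairings within one league
print(total_league_battles)  # pairings across all leagues
```

The count of pairings from `combinations` matches `comb(6, 2)`, which is the "6 choose 2" in the text.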

The video about choosing the leagues with lottery balls, and Steve’s knowledgeable introduction of each competitor, also contributed to the excitement.

This contest has an excellent shot at eclipsing all other chatbot contests. I can easily imagine it evolving into a bigger event, moving out of the virtual world and into the real world as a live webcast event.


  [ # 1 ]

I have to agree with Richard. Steve has done a great job and the format makes it more exciting.


  [ # 2 ]

Yep, I second that.

I like the fact that it’s an ongoing thing. It has already pushed me way beyond my comfort zone. As always, I started much too late (for building/preparing my entry). Then, I over-pushed it on pattern complexity, making it pretty obvious where the system still needs more work. When it’s a one-shot event, I usually say ‘[censored] it’, it is what it is, let’s roll the dice. Now of course, I still have time to correct things, so I’m working my ass off, which has already pushed things much further in a shorter period. It’s really like a race, a programming race that is…
I probably won’t get through the first round, but the bot leaving the competition won’t be the same as the one that entered it.
I have been thinking though about the amount of testing required for a competition with this setup. You need some dedicated judges to go through all those games.

PS: posting this message literally took 5 minutes. What are you guys doing on this server?


  [ # 3 ]

Not me, Jan! I’m just playing some Minesweeper, while I unwind from the day. I just noticed an email notification that you’ve posted. smile

I’ve been sort of avoiding the chatbot battles, mostly because I feel bad that circumstances have prevented me from getting back to the things that I’ve wanted to do, such as organizing a new chatbot contest myself. I must say, however, that Steve’s set up a top notch competition, in very much the same style as what I was leaning towards, but has done a far better job than I could have hoped to have done. Well done, Steve, and here’s to your contest being every bit the success that it looks to be shaping up to be! smile


  [ # 4 ]

Yes fantastic job Steve. smile

So who’s running a book then ? wink

BTW, Jan, yes this site is really slooooww for me too.

Good luck to all the battlers !


  [ # 5 ]

Thanks guys. You do realise that all these kind words won’t get you any extra points though tongue wink

It has been fun so far with very few problems, although saying that, it appears Personalityforge is down which may affect the very tight schedule.

Chatbot Battles is something I had in mind for a long while. I coded a few pages a year or so back but had not bothered doing anything serious with it. I only decided to finish it off and put it online after the farce that was this year’s Loebner Prize and the demise of the CBC, as I couldn’t see any competitions which were accessible to anyone who wanted to compete without them having to jump through crazy pre-requisite hoops.

The judges (my family, friends and work colleagues) have been great so far. The free flowing matches are subjective but I am more than happy with the way things are going. Next year, I may arrange a panel to score the free flowing conversations.

As for running a book, my money would have to be on Eugene Goostman so far. He is the only one to score the full 5 points in a battle. However, this was a conversation battle and so he might not fare as well in the Q&A sessions.


  [ # 6 ]
Jan Bogaerts - Jun 13, 2012:

I like the fact that it’s an ongoing thing. It has already pushed me way beyond my comfort zone. It’s really like a race, a programming race that is…

I agree, I let my bot get a bit stale over the past 6 months as I took the Stanford AI courses. It probably cost me a version. The push of the competition has made me rush to include new updates (and has of course broken some working things) so I can survive my ‘bracket of death’. A good competition always makes my bot better as a result.

Good Luck to all.


  [ # 7 ]

One of the beauties of a league format is that it isn’t necessary to win every game. Just like a sports team, you are allowed to have your off-days every now and then.


  [ # 8 ]

Yeah, I like this format better than what I had envisioned. I was going more for the NBA playoffs, rather than the… what you came up with. smile


  [ # 9 ]


I agree with Richard and all that this is a very exciting format for a chatbot contest. I hope you can continue this long enough that I can build a contender smile  Seriously, I was greatly impressed with the league selection video, and though I’m not very knowledgeable about sports, I can see how your narration and commentary really bring into play that element of draft picks for athletes, along with your broad knowledge of the existing chatbots out there and entering the contest. I think that feature is definitely a keeper, and builds the excitement for the upcoming battles.

I have to say that my money is on Eugene Goostman. That is one solid bot with depth, background, knowledge, personality, chat ability, you name it.  Vladimir and his team have put some serious work into him.  He is also my favorite in the upcoming University of Reading (Turing 100) Turing tests on June 23rd. 

Lastly, to Richard, Simon/JFRED/Landru (hard for me to pick a name) is one of my multibots. It has JFRED script, Eliza, AIML AAA 2003, The Professor’s Encyclopedia, and a few evolved remnants from Albert in there. Whether they are coherent or not remains to be seen, but they are all represented as called for by the JFRED kernel.

So AIML made it to Bletchley 100 after all.



  [ # 10 ]

Thanks guys. You do realise that all these kind words won’t get you any extra points though

Darnit, wasted my typing again. LOL


  [ # 11 ]

42 entries?  That’s got to be a fix. grin


  [ # 12 ]

No it was perfectly genuine. The format of the contest works as long as there isn’t a prime number of entrants or a number that doesn’t easily split. The breakdown of the numbers around 42 would have been as follows:

39 = 13 groups of 3
40 = 8 groups of 5
41 = prime number - panic!
42 = 7 groups of 6
43 = prime number - panic!
44 = 11 groups of 4
45 = 9 groups of 5

If it were a prime or a silly one like 13 groups of 3, I would have encouraged a few more to enter just to make the numbers up. I was sweating a bit when it stayed on 41 entrants for ages but luckily Tom Joyce entered another one of his creations. Cheers Tom!
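
For anyone who wants to reproduce the breakdown above, here is a rough Python sketch. The allowed group sizes (3 to 6) and the rule of preferring the largest one are assumptions made only because they happen to reproduce the list in this post; they are not a stated rule of the contest, and `group_splits` is a made-up helper name.

```python
def group_splits(entrants, min_size=3, max_size=6):
    """Return (groups, group_size) pairs that divide the entrants evenly.

    The 3..6 size range is an assumption for illustration only.
    """
    return [
        (entrants // size, size)
        for size in range(min_size, max_size + 1)
        if entrants % size == 0
    ]


for entrants in range(39, 46):
    splits = group_splits(entrants)
    if splits:
        # Assumed preference: the largest allowed group size.
        groups, size = splits[-1]
        print(f"{entrants} = {groups} groups of {size}")
    else:
        print(f"{entrants} = no even split - panic!")
```

Under these assumptions, 41 and 43 come back with no split at all, which is the "panic" case described above.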


  [ # 13 ]

Have you considered the possibility of running this contest more than once per year, Steve? Quarterly contests may be a bit much, but twice per year might be fun. cheese

Either way, I’ve got to start getting Morgaine ready, not to mention “shining up” Morti just a bit. smile


  [ # 14 ]

Probably not Dave, as this has taken up about 2 hours of my day so far since the battles began and I foresee this going for a few weeks yet. I have to manually update the website with the results and check that the judges are all ok by answering their queries and generally making sure everything is running to schedule.

Also there are 105 league battles, of which 42 consist of 5 Q&As each. This is 210 questions to think of. I am making up ALL 210 of these questions, as I am aware of what bots can and cannot answer. I was afraid that if I let the judges set the questions (most, if not all, of whom do not have a chatbot), I would end up with nonsense such as this one which was asked in the CBC one year:

What does feel the poet when he says “as people feel brave/i fall
onto my knees/on the bathroom floor/as people rite imortel/lines of
poetry i take/my fingers out the back/of my throat and wounder/at how
to go about vomiting/into the toilet pan”?

To create this many questions and keep the site in order is something I only fancy doing once a year.


  [ # 15 ]
Steve Worswick - Jun 14, 2012:

No it was perfectly genuine.

Of course. I didn’t mean to cast aspersions.  It was a reference to Life, The Universe and Everything.  Now I know that the numbers either side are prime, I feel Douglas Adams is doubly smiling down on Chatbot Battles.


