

Measuring your chatbot’s performance
 
 

Background: since I put my bot up here, I have been delighted to see some people actually visiting and talking to it. It is still only about one person per day, and they give up after Yoko’s many ‘I don’t understand’ reactions, but it’s great for learning and improving nonetheless.

Now, since my bot (like all bots) is based on trying to ‘understand’ an input phrase and make the most of it, I can (and do) store with each conversation phrase whether Yoko managed to understand it. I even make a chart of it; here’s how that looks: http://www.yokobot.com/index.php?p=data&s=conversations

On to my question. Let’s forget about Turing tests or contests for a moment. Since we have a lot of the internet chatbot ‘big shots’ on here, I wonder if you have any statistics, charts, data… on how well your bot is performing?

Surely, within something like AIML some responses can be considered more ‘successful’ (‘what is your favorite *’ -> [name favorite from list of *]) than others (‘blablabla blah blah’ -> [‘Interesting. Why do you think that blablabla blah blah?’]).

Another measure, and a more straightforward one to implement, would be conversation length.
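As a rough illustration of the conversation-length measure, here is a minimal Python sketch. It assumes each logged conversation is available as a list of (user input, bot reply) pairs; that layout and the function names are made up for this example and do not come from any particular bot.

```python
from statistics import mean, median


def conversation_lengths(conversations):
    """Return the number of volleys (input/reply pairs) in each conversation."""
    return [len(volleys) for volleys in conversations]


def summarize_lengths(conversations):
    """Summarize conversation length over a set of logged conversations."""
    lengths = conversation_lengths(conversations)
    if not lengths:
        # No logs yet: report zeros instead of raising an error.
        return {"count": 0, "mean": 0.0, "median": 0.0, "longest": 0}
    return {
        "count": len(lengths),
        "mean": mean(lengths),
        "median": median(lengths),
        "longest": max(lengths),
    }
```

The median is worth reporting next to the mean, because a handful of very long chats can pull the mean up even when most visitors leave after a few volleys.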

Do you guys regularly build and look at these measures for your chatbot? And have you seen a steady increase as time went on?

(It should go without saying that, well, my chatbot performs awfully by any measure.)

 

 
  [ # 1 ]

I do not keep statistics, but I have seen a rise in conversation length over time. I do also track failed inputs that generate a default random response. I regularly get conversations that span a couple of hours, and I still get some that start off on the wrong track from the beginning and end abruptly. A few versions back I put in an opening that has the bot ask the user to spend 15 minutes with it. It also tries very hard not to let a user leave the conversation if he says goodbye.

One rule of thumb: if your bot repeats the same response within about 10 volleys, there is about an 80% chance the user will leave a couple of volleys later.
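To check a rule of thumb like this against your own logs, something along these lines could work. It is only a sketch: it assumes the bot's replies for one conversation are available as a list of strings in order, and the function names and the default window of 10 are illustrative choices.

```python
def _normalize(reply):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(reply.lower().split())


def first_repeat(bot_replies, window=10):
    """Return the index of the first reply that repeats one of the previous
    `window` replies, or None if no such repeat occurs."""
    normalized = [_normalize(r) for r in bot_replies]
    for i, reply in enumerate(normalized):
        if reply in normalized[max(0, i - window):i]:
            return i
    return None


def volleys_after_repeat(bot_replies, window=10):
    """Return how many more replies the bot gave after its first repeat,
    or None if it never repeated itself within the window."""
    i = first_repeat(bot_replies, window)
    return None if i is None else len(bot_replies) - 1 - i
```

Running `volleys_after_repeat` over all logged conversations and counting how many return 0, 1 or 2 would show how often users actually leave shortly after a repeat.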

 

 
  [ # 2 ]

That’s interesting, Wouter.  There aren’t enough hours in the day for me to compile and display the information the way you have, but I do pay attention to the counters and from spending time with chatlogs, I get an impression of the length of the conversations. 

My experience is a lot like Merlin’s.  Some chats perform better than I ever expected, some never get off the ground, especially when a visitor can’t spell well enough to complete a single sentence without an error.  They get offended at being corrected and leave.

One additional point.  I have several bots, some that accept and participate in adult conversation, and some that don’t.  Those that do have very long conversations and many more of them, I suppose because people tell their friends.

 

 
  [ # 3 ]

longest conversation (over multiple logins) is 43 hours -  9000+ volleys.
No way to measure how well it does in accuracy… can only point you to ChatBotBattles 2012 for best 15 minute conversation won by Angela.

 

 
  [ # 4 ]

Thanks guys, this is very interesting. Funny to see that adult conversation does a lot better; maybe there are some interesting ‘marketing’ opportunities for a bot in there :)

I can’t help but find myself thinking, dear god, if only y’all would put all your chatlogs somewhere online for all to see and learn from! *drool* :)

 

 
  [ # 5 ]

You can always talk to Angela online on Facebook: https://apps.facebook.com/talking-angela

 

 
  [ # 6 ]
Wouter Smet - Aug 23, 2013:

I can’t help but find myself thinking, dear god, if only y’all would put all your chatlogs somewhere online for all to see and learn from! *drool* :)

I’m not sure how helpful it will be, but if you send me an email (using the “Email Dave” link under my avatar), or a message via G+, I can set you up a limited admin account to be able to view Morti’s chat logs.

[edit]
Actually, I just remembered that I created a log file viewer that’s publicly available:

http://www.geekcavecreations.com/Morti/conversationLog.php

Enjoy! :D
[/edit]

 

 
  [ # 7 ]

Wouter - How are you measuring accuracy? Is it a manual job?

I use AIML and there is a training interface in Pandorabots which shows how many times the default category was called, usually by people saying nonsense. I sometimes check this to pick up messages that need the answers amending.

I think the longest conversation someone had with Mitsuku was around 6-7 hours in one session.

 

 
  [ # 8 ]

Nice graph, Wouter! Did you know you can add images to your posts as well? Would be nice for next time…

 

 
  [ # 9 ]

Thanks Erwin :) It’s a small graph, but I linked it rather than embedding it as an image (good to know!) because it’s dynamic: it is built straight from the last 7 days’ worth of chatlogs. If I ever get lots more people chatting to Yoko, and thus more and richer data, I hereby vow to make a post here full of images visualizing different aspects of the data!

(for example, how much do people talk about YOKO versus the rest of the world, do people talk more in present than past tense… Actually, come to think of it, I should just go through some chatlogs of ANY source between 2 people, not between people and bots for this - we’re aiming for Turing after all!)

@Steve, it’s automatic, a phrase can simply be ‘understood’ or ‘not understood’ by Yoko, and I store that information (along with - what - it was understood to mean) with each phrase, so that’s all I’m counting. I wasn’t sure whether AIML would have a similar measure, but apparently with the ‘default’ category it does.
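For anyone wanting to build a similar chart, the counting could look roughly like the sketch below. It assumes each stored phrase comes with a date and an understood/not understood flag; the tuple layout and the function name are invented for the example and are not how Yoko actually stores things. For an AIML bot, "not understood" could stand in for "the default category fired".

```python
from collections import defaultdict
from datetime import date


def understood_rate_by_day(phrases):
    """Compute the fraction of understood phrases per day.

    `phrases` is an iterable of (day, understood) pairs, where `day` is a
    datetime.date and `understood` is True or False.
    """
    totals = defaultdict(lambda: [0, 0])  # day -> [understood count, total count]
    for day, understood in phrases:
        totals[day][1] += 1
        if understood:
            totals[day][0] += 1
    # Sort by day so the result can be plotted directly as a time series.
    return {day: ok / total for day, (ok, total) in sorted(totals.items())}
```

Feeding it only the last 7 days of phrases would give the per-day values for a rolling chart.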

As for logs, the interesting part for me wouldn’t be the bots’ replies, but more what people try to say to it typically, so I could improve my bot to anticipate those things. So…

Dave, thanks a lot for your generosity, that’s every bit as useful as I hoped it would be! If you wonder who was suddenly hitting your server so hard this morning, that would have been me :) I took the liberty of converting your log viewing page to one big text file for easier reading, here: http://www.yokobot.com/lib/mortilogs.txt

(I’ll of course gladly remove this on request if you want)

 

 
  [ # 10 ]
Wouter Smet - Aug 23, 2013:

I can’t help but find myself thinking, dear god, if only y’all would put all your chatlogs somewhere online for all to see and learn from! *drool* :)

user input data (Warning, NSFL!)

 

 
  [ # 11 ]

@Wouter: I’d love to host an area on Chatbots.org where people can upload their transcripts, linking to their chatbots, linking to awards, etcetera… How cool would that be?

I’d like to run a crowd sourced project on this. Interested???

 

 
  [ # 12 ]

Oh dear, Carl, pretty NSFL indeed! If people are already willing to spend time talking like that to (admittedly pretty impressive) current-state chat bots, imagine the big business a true Turing-test-beating computer would mean in this area… Hadn’t really thought about that!

@Erwin, that would be a pretty interesting area indeed! Especially since the first search result when googling ‘chatbot transcript’ leads to this very forum, with this thread expressing the use such an area would have: http://www.chatbots.org/ai_zone/viewthread/228/

As for crowd sourcing, not really sure about the best way to go about it. Ideally you would have a huge repository of transcripts, all in some standardized structured format (with metadata like time and which chatbot) and easy to browse and download a data dump from… But a first pass could be simply a page to collect various links to transcripts?

Not sure how easy it is for the platform chatbots.org to set up a wiki page that some people (say, registered members of the forum, or maybe only moderators for now) can edit and attach files to?

 

 
  [ # 13 ]

Oh, and @Erwin, it goes without saying: sure, if you would give - me - access to such a page, I would gladly put in some initial work to search, group, collect and link from various chatbot transcripts from around the web to give this project a nice starting point!

 

 
  [ # 14 ]

The wiki is already installed, so it is not difficult from a technical point of view. However, the community needs to be large and active enough to make this a really vibrant area.

For the transcripts, I’d prefer to have an upload button with some intelligent interpretation of the lines, and some intelligent formatting for easy reading…

We have a Virtual Private Server with a lot of space, so that shouldn’t be the problem.

Do you happen to have time and are you in for a challenge???? Would be good to establish your name in this community :D

 

 
  [ # 15 ]

As tempting as becoming the owner of such a project may be for establishing my name, I’m afraid I’m not really in a position to make big promises on how I devote my time in the near future. I’m literally between careers, life decisions and even countries at the moment (though I do happen to be currently building a specific-purpose online community for a client, so it is the kind of task that would suit me :) ).

So while the idea of this project definitely excites me, it would be too risky to say ‘I’ll do it!’. Rather, I think a better approach would be a gradual one, where we initially aim for the goal of creating a central starting point for those looking for chatbot transcripts, as simple as a page of links.

As for the community aspect, a ‘transcripts’ thread, or better, perhaps an entire forum section (like ‘rivescript’) on this forum seems like an obvious (initial) place to fuel that. I definitely agree that this, as soon as some nice amounts of transcripts start pouring in, has the potential to stir lots of lively conversation on its own; we only have to think about the fun ‘data’ question that started this thread!

Parallel to that we can also get a discussion going about what should be contained in a ‘standard’ representation of chat transcripts, and how they should look, which is a necessary prerequisite to creating that ‘intelligent formatting’ from different sources into that one format. Note that the best way will probably be an extremely simple one: just plain text with line-by-line transcripts, preceded by some meta-information about the chatbot, and the time and context (‘Loebner prize 2011’) of the conversation.
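To make the idea concrete, here is one possible reading of such a plain-text format together with a small Python parser for it. Nothing here is an agreed standard: the ‘key: value’ header lines, the blank separator line, the ‘Speaker: text’ turns and the function name are all assumptions made up for this sketch.

```python
def parse_transcript(text):
    """Parse a plain-text transcript into (metadata, turns).

    Assumed layout (hypothetical, not an agreed standard):
      - header lines of the form 'key: value'
      - one blank line
      - one 'Speaker: utterance' line per turn
    """
    header, _, body = text.strip().partition("\n\n")
    meta = {}
    for line in header.splitlines():
        # Split on the first colon only, so values may contain colons (times).
        key, _, value = line.partition(":")
        meta[key.strip().lower()] = value.strip()
    turns = []
    for line in body.splitlines():
        if not line.strip():
            continue  # tolerate stray blank lines between turns
        speaker, _, utterance = line.partition(":")
        turns.append((speaker.strip(), utterance.strip()))
    return meta, turns
```

A file in this layout would start with lines such as ‘bot: ExampleBot’, ‘time: 2013-08-23 10:15’ and ‘context: website chat’, then a blank line, then the turns. Keeping the format this simple means a converter for each source only has to emit text, and the parser stays a few lines long.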

Rather than starting to program a ‘universal transcript upload button’ right away, we can see in what formats different contributors wish to submit their transcripts, and I’ll be happy to get cracking on some scripts that convert those (and others that are simply found already online), just like the script I wrote to turn all of Morti’s transcripts into one big text file.

This initial ‘manual’ approach seems especially appropriate because if we can get some of the bigger chatbots on board, the logs they have available probably make up huge files, so having the processing done at the click of a button may not be feasible!

So that would be my proposal: wiki (possibly just one page initially) + dedicated forum area (possibly only one thread initially) + ‘individual’ processing and uploading of various different transcripts. And then see where it leads from there. How does that sound?

 
