Building a Chat engine…some preliminary questions
 
 

Hi,
I’m interested in leading a possible effort to construct a chat bot worthy of the Loebner competition sometime in the future (not 2010).  I’m curious if some of you who have participated in building a chat bot would be willing to share some tips or insights.

1. I’m interested in your selection of team members as I’m certain each of you has a different background.  How many people were on your team and what were their qualifications? What qualifications or skills were missing from your team?

2.  The problem domain associated with chatting is vast. I’m interested in how you decided to limit the scope of your bots and their capabilities…. ‘not biting off more than you could chew’.

3.  Did you write out detailed specifications before programming/scripting began? From the perspective of a postmortem what went well?  What went poorly?  If you could do it again what would you do differently?

4.  In no way am I underestimating the challenge of building a chat bot. Building a simple chat bot is a significant achievement.  What caused you to give up on a past project?  What aspect of building a chat bot did you find overwhelming? Intimidating?

Thanks in advance. 

Regards,
Chuck

 

 
  [ # 1 ]

Hi Chuck,

It has been 11 or 12 years since I won the Loebner Prize Contest, but progress has been slow in coming by some accounts. Rollo Carpenter has a unique approach that involves contextual learning bots, and David Levy has been successful in his first attempts at winning the prize. I entered the contest six times before winning one, but I made incremental progress towards my goal of leaving a reasonable doubt that my program was human.

I worked alone and my strategy was to make the program pretend to have a background, while using a multifaceted response selection. Giving the program a “history” seemed almost useless, because Loebner judges rarely ask about human facets of personality. They rarely ask “where do you come from,” or “where did you grow up?” Instead, they tended to ask questions intended to break the program. Using my multifaceted algorithm, there were a variety of responses, ranging from Eliza-style replies to simple stimulus-response answers, and sometimes the response would be unexpected, but plausible.  Like Rollo’s programs, my program had a way of seeming to almost answer the interrogation and at the same time seeming to play on the judges’ queries.

My program was never intended to be a web based program. In fact, only Loebner judges have ever talked to it besides me. Some of the stimulus response data that I used was gleaned from a web based program called Max Headroom, and another program called Barry, but Albert One was never intended to be subjected to web based queries. Instead, I used a similar approach to Dr. Wallace, where I targeted the responses of other web based programs, and then utilized that data in my Loebner entry. This helped to prevent noise and clutter in my Albert One program.

Albert consisted of a stimulus-response library, similar to AIML, but searchable by an algorithm based on word frequency analysis. Where ALICE uses Zipf’s law in targeting, I used something similar in searching my database of stimulus and associated response. When this search failed, Albert would fall back on an assortment of response methods including Eliza, Barry, and a program called sextalk.
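To make the idea concrete, here is a minimal sketch of a stimulus-response lookup that weights rare words more heavily and falls back to another method when nothing matches well. It is not Albert One's actual code. The sample library, the inverse-frequency weighting, the threshold value and the Eliza-like fallback are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stimulus-response pairs; a real library would hold thousands.
LIBRARY = [
    ("what is your name", "My name is Albert."),
    ("where do you come from", "I grew up in a small town."),
    ("do you like music", "I listen to jazz now and then."),
]

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

# Number of stimuli each word appears in; rare words will carry more weight.
DOC_FREQ = Counter(w for stimulus, _ in LIBRARY for w in set(tokenize(stimulus)))

def weight(word):
    # Inverse-frequency weight (an assumed scheme, chosen for simplicity).
    if word not in DOC_FREQ:
        return 0.0
    return math.log(1 + len(LIBRARY) / DOC_FREQ[word])

def best_match(user_input):
    words = set(tokenize(user_input))
    best_score, best_response = 0.0, None
    for stimulus, response in LIBRARY:
        score = sum(weight(w) for w in words & set(tokenize(stimulus)))
        if score > best_score:
            best_score, best_response = score, response
    return best_score, best_response

def eliza_fallback(user_input):
    # Stand-in for the fallback methods: turn the input back into a question.
    return "Why do you say: " + user_input.strip() + "?"

def respond(user_input, threshold=1.0):
    score, response = best_match(user_input)
    return response if score >= threshold else eliza_fallback(user_input)
```

For example, `respond("What is your name?")` picks the first library entry, while an input that shares no words with any stimulus goes to the fallback.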

Albert One was the Loebner winner in 1998 and 1999.

 

 
  [ # 2 ]

Hi Robby,
Thanks for the reply.

I appreciate your incremental progress and all the work that went into eventually winning the prize after 6 tries. Working alone?  That is my preference; however, I’m thinking that there may be a lot of non-programming work in such an endeavor, such as building databases, training, and testing the chat engine.

You’ve included some names and terms that I will research.  I know about ‘Zipf’.

I had time earlier today to create a list of ‘modules’ that may be required to make a workable chat framework.  It does include personality and a ‘deceptive’ past. =)

Here’s my list:

•  Managing the Past – stores historical timeline information representing common knowledge
•  Event Detector – detects an event (past or present) in the dialog
•  Emotion Evaluator – gauges the emotion of the human user and program based upon word usage
•  Context Evaluator – tracks the major and minor contextual topics in the current dialog
•  Personality Module – various components of the program’s personality are stored and may be manipulated
•  Human Deception – manages the ‘fictional’ or ‘imaginary’ history, relationships, and interests
•  Dreaming – manages the ‘drifting’ of simulated subconscious thought when engaged in a dialog
•  Sarcasm Detector – monitors dialog and searches for sarcasm
•  Humor Detector – monitors dialog and searches for humor
•  Pronoun Manager – during the dialog the manager tracks who and what each pronoun points to
•  Math Processor – performs ‘human-like’ mathematical calculations
•  Abstract Meanings – most dialog consists of abstract meanings separated from language (e.g. pouring water into a glass is abstracted to “ ‘A’ put into ‘B’ ”, the same as putting an envelope in a mailbox, food in a mouth, etc.)
•  Talk Back – program may conduct a conversation with itself
•  Guidance – module manages the direction of ‘thought’ (e.g. program plays chess with user, holds conversation with user, and thinks about things)
•  Managing the Future – manages planning of events
•  Short-term Memory – manages the storage of all short-term memory
•  Long-term Memory – manages the storage of all long-term memory, receives input from short-term memory
•  2D Spatial – manages relative location of items, persons, animals in a dialog
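As one small illustration of how two of these modules might connect, here is a rough sketch of short-term memory feeding long-term memory. The class names, the fixed capacity and the "oldest item moves first" rule are assumptions for the sake of the example, not a worked-out design.

```python
from collections import deque

class LongTermMemory:
    def __init__(self):
        self.facts = []

    def store(self, item):
        self.facts.append(item)

class ShortTermMemory:
    def __init__(self, long_term, capacity=3):
        self.items = deque()
        self.capacity = capacity      # assumed limit on recent items
        self.long_term = long_term

    def remember(self, item):
        self.items.append(item)
        # When short-term memory overflows, the oldest item moves to long-term.
        while len(self.items) > self.capacity:
            self.long_term.store(self.items.popleft())
```

With a capacity of 2, remembering "a", "b" and "c" leaves "b" and "c" in short-term memory and moves "a" into long-term memory.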

This is a preliminary list but offers a number of milestones to making this work.  Any and all feedback is always welcome.

Regards,
Chuck Bolin

 

 
  [ # 3 ]

Here is a list of things that can trip you up in the test. (I can’t remember who compiled these items, but I find them revealing of the true nature of Loebner’s contest.  You might think the contest is only a challenge to participate in a conversation, but even for the next five years you need to include such things as timing, and errors with corrections between the letters as they are typed in the chatbot’s utterances.  Also, last year I noticed there were a lot of texting shortcuts beyond the typical chatroom stuff: lol, imho, etc.)

* damaging / technical / preprocessing / normalization
- text-flooding (typing a huge number of characters to overload the bot)
- event-flooding (repeatedly hitting the enter key, or reloading the page)
- code input (typing html, php or other script code)
- whitespace- and character-flooding (repeatedly hitting the space key or any other key)
- trim whitespace and punctuation ( can you read .......,,,,,,,,, this ??????)
- character-repeating (hhhhheeeeeeeeeeeeee yoooouuuuuuu)
- punctuation normalization (that ‘s right ,I ’ m okay,do you know?)
- strange characters (you are a ¿%œªƒ†•¬?)
- smilies (grin)
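A few of the preprocessing items above can be sketched with regular expressions. This is only a rough illustration: the length limit and the individual rules are assumptions, and they are crude (for instance, the spacing rule would also split abbreviations such as "e.g.").

```python
import re

MAX_INPUT_CHARS = 500  # assumed limit against text-flooding

def normalize(text):
    text = text[:MAX_INPUT_CHARS]
    # Code input: drop anything that looks like an HTML or script tag.
    text = re.sub(r"<[^>]*>", " ", text)
    # Character-repeating: collapse 3+ identical letters to one (digits untouched).
    text = re.sub(r"([^\W\d_])\1{2,}", r"\1", text)
    # Punctuation normalization: no whitespace before a punctuation mark.
    text = re.sub(r"\s+([,.?!])", r"\1", text)
    # Runs of punctuation are reduced to the first mark.
    text = re.sub(r"([,.?!])[,.?!]+", r"\1", text)
    # One space after punctuation that is directly followed by a letter.
    text = re.sub(r"([,.?!])(?=[^\W\d_])", r"\1 ", text)
    # Whitespace-flooding: collapse whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()
```

With these rules, "hhhhheeeeeeeeeeeeee yoooouuuuuuu" becomes "he you" and "can you read .......,,,,,,,,, this ??????" becomes "can you read. this?".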

* spelling
- typos (“I wnat yuo too andertsand me”)
- grammar errors (“What do you meaning?”)
- slang (“Wanna playin’ da phuckin’ fool ere?”)

* annoying
- blank input (just hitting the enter key without typing content)
- mimicking binary speech (“01100011101001”)
- big numbers (entering numbers bigger than x digits)
- dotted text (“c.a.n.y.o.u.r.e.a.d.t.h.i.s”)
- expanded text (“c a n y o u r e a d t h i s”)
- repeating words (“kill kill dog dog dog dog”)
- typing nonsense (“dsfdh jkjjh”)
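Several of these "annoying" inputs can be caught by simple checks before the bot tries to answer. The sketch below is an assumption-laden illustration: the labels, the digit limit and the patterns are made up for the example, and it does not attempt to detect typed nonsense.

```python
import re

def classify_input(text, max_digits=9):
    stripped = text.strip()
    if not stripped:
        return "blank"
    # Only zeros, ones and whitespace, at least 8 characters long.
    if re.fullmatch(r"[01\s]{8,}", stripped):
        return "binary"
    # A run of digits longer than max_digits.
    if re.search(r"\d{%d,}" % (max_digits + 1), stripped):
        return "big number"
    # Single letters separated by dots: c.a.n.y.o.u
    if re.fullmatch(r"(?:[A-Za-z]\.){3,}[A-Za-z]?", stripped):
        return "dotted"
    # Single letters separated by spaces: c a n y o u
    if re.fullmatch(r"(?:[A-Za-z] ){3,}[A-Za-z]", stripped):
        return "expanded"
    # The same word twice in a row (this will also flag "had had").
    words = stripped.lower().split()
    if any(a == b for a, b in zip(words, words[1:])):
        return "repeating"
    return "ok"
```

The order of the checks matters: the binary test runs before the big-number test, so "01100011101001" is reported as binary speech rather than as a large number.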

* impolite
- calling the candidate names (“do you understand me, dimwit”)
- calling the candidate a machine (“a machine like you”, “what’s up, robot?”)
- using in general foul or profane language

* ignorance
- repeatedly changing the subject
- avoiding the subject
- monosyllabic replies (“okay”, “right”, “what?”)
- repeatedly asking knowledge questions
- asking counter questions instead of giving answers (“Did you?”, “Can you?”, “Such as?”)

* copy-cat / parrot / echo / mocking
- repeating the candidate’s reply completely
- repeating the candidate’s reply partly
- the user is repeating his own utterances

* system tasks
- tell time
- tell date
- tell current month
- tell current day of the week

* lexical
- asking knowledge questions
- asking for a word definition
- asking math questions
- translate from / into different language

* manner of speech
- longwinded speech (“I would like to ask you if it’s possible that you might ...”)
- chaining sentences (“I do. You know that. I am right. Do you understand?”)
- welter of words without punctuation (“I do you know that I am right do you understand?”)
- monosyllabic

* linguistic matching
- yes-no answers / replies
- get a joke
- get a riddle
- get irony / sarcasm
- get subject of conversation
- get a listing
- getting noise words (“errm”, “umpff”, “arrrgh”)

* memory
- remember given facts (“what is my name?”)
- remember the topic (“what are we talking about?”)
- remember the conversation (“What was the first thing I told you? What did I say two sentences before?”)
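The memory items can be illustrated with a very small sketch that records what the user says and answers two of the example questions. The class name, the single "my name is" pattern and the reply wording are assumptions; a real bot would need far more patterns than this.

```python
import re

class ConversationMemory:
    def __init__(self):
        self.facts = {}      # facts extracted from the user's utterances
        self.history = []    # every user utterance, in order

    def observe(self, utterance):
        self.history.append(utterance)
        match = re.search(r"\bmy name is (\w+)", utterance, re.IGNORECASE)
        if match:
            self.facts["name"] = match.group(1)

    def answer(self, question):
        q = question.lower()
        if "what is my name" in q:
            name = self.facts.get("name")
            if name:
                return f"Your name is {name}."
            return "You haven't told me your name."
        if "first thing i told you" in q:
            if self.history:
                return f'You said: "{self.history[0]}"'
            return "You haven't said anything yet."
        return None  # not a memory question; let another module handle it
```

After observing "Hi, my name is Chuck.", the question "what is my name?" gets the reply "Your name is Chuck."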

* entertainment
- tell a story / make up a story
- tell / compose a poem
- tell a joke
- play a game
- riddle me

* trick que

 

 
  [ # 4 ]

Well, there is a limit to the reply length (which I just missed, darn)...  here’s the rest of the list:

* trick questions
- logic questions (“What color is a blue apple?”)
- mindpixel questions (“Is an elephant bigger than New York?”, “Is water dry?”, “Is the sky green?”)
- deduction questions (“How many legs do two cats have?”)
- decision questions (“what is the difference between a dog and a handkerchief?”)

 

 
  [ # 5 ]

Welcome to Robitron Forum Gary! I can put you on the list of pilot members if you like.

 

 
  [ # 6 ]

Gary,
That’s a great list!  In fact, it’s enough to stop a man from trying to solve this problem. =)

I heard one of the founders of Google speak several years ago. He commented that creating search algorithms was challenging but solvable. He said that one of the key problems is that we “users don’t know how to spell”.  Most chat bots assume that users can spell, and I imagine that was true for many Loebner Prize submissions.

Thanks again!  The list is one of the things I was looking for.

Regards,
Chuck Bolin

 

 
  [ # 7 ]

Hi,
I’ve spent quite a few hours listing the various problems one must solve (from my limited experience) to produce a chat bot that mimics human behavior in conversation…and I haven’t even touched the topic of natural language processing. =) 

It seems like most of the chat bots I’ve read about are focused primarily on pattern matching and a variety of other techniques focused upon word usage to simulate conversation.  Robby wrote that he added a ‘human history’ to his bot…but that didn’t yield the sort of results someone might expect.

I’ve been working on a concept of expressing human thought, concepts, relationships, and behavior using abstract symbols without words.  It’s kind of a mix between flow charting and diagramming sentences in English class.  I was inspired by my early Navy submarine days when we used graphic symbols to represent all the coding requirements for navigation calculations.  I thought it was pretty cool nearly 30 years ago. =)

I’ll post something meaningful after I’ve fleshed out what I’m trying to say.

Regards,
Chuck

 

 
  [ # 8 ]

Chuck,

It is sort of like building Pinocchio.

I have found that building a chatbot engine may require building
one piece at a time, and then putting all the pieces together.

What computer programming language do you prefer to build
your chatbot engine ?

 

 
  [ # 9 ]

8PLA,
I agree that building a chat bot should be done in stages.  Here are some stages I have considered as a test bed.

* Construct a 2D graphic interface with several windows: top-down view of world, interface pane, and data pane.
* Construct basic bot (key variables that drive us…e.g. hunger, tiredness, etc.)
* Add basic text messaging capability (simple comms). Bot can answer basic questions about chatting.
* Add basic chat from a terminal (bot’s perspective).  Bot can answer basic questions about its health, condition, location, activities, etc.
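As a rough idea of what the "basic bot" stage could look like, here is a minimal sketch of a bot driven by a couple of internal variables that can report on its own condition. The drive names, starting values, growth rates and the 0.5 threshold are all assumptions, and the sketch is in Python for brevity rather than in the C++ I would actually use.

```python
class Bot:
    def __init__(self):
        # Drives range from 0.0 (satisfied) to 1.0 (urgent).
        self.drives = {"hunger": 0.2, "tiredness": 0.1}
        # Assumed amount each drive grows per simulation step.
        self.rates = {"hunger": 0.05, "tiredness": 0.03}

    def tick(self, steps=1):
        # Advance the simulation; drives grow but are capped at 1.0.
        for _ in range(steps):
            for name, rate in self.rates.items():
                self.drives[name] = min(1.0, self.drives[name] + rate)

    def strongest_drive(self):
        return max(self.drives, key=self.drives.get)

    def status(self):
        # Answer a basic question about the bot's condition.
        name = self.strongest_drive()
        if self.drives[name] < 0.5:
            return "I feel fine."
        return f"My {name} is getting hard to ignore."
```

A fresh bot reports that it feels fine; after ten ticks hunger has passed the threshold and dominates the reply.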

I use C++ with DirectX for game development. I use VB and Excel for building tools. 

I created this concept paper tonight on “Abstract Symbolism”. It provides a graphical relationship that drives decision making. It’s a draft. I would appreciate input.

http://chuckbolin.com/chatbot/docs/ChatBotConcepts_AbstractSymbolism.pdf

Regards,
Chuck

 

 
  [ # 10 ]

FYI: I am volunteering to test a new feature, which may intermittently stop my profile from displaying.
__________________________________________________________________________
Excellent Chuck,

Personally, I prefer a four dimensional model, with the fourth dimension being time,
influenced by my hero: Dr. Neil deGrasse Tyson.  I like the idea of the x,y,z coordinates
being subject to change even when they are not in physical motion.

That said, I do find your model to be elegant! The same holds true for a game developer
as it does for an A.I. designer… Finding shortcuts so that only the absolutely necessary
calculations are performed is key to making the results appear more realistic.  Visual C++
and DirectX are something we have in common, by the way.

I would also highly recommend David Hamill.  He is a very smart fellow, and his way of
describing this topic is very helpful and interesting.
__________________________________________________________________________

Any day now, I will find out if I am getting approved to publish source code and write a chapter in a book on A.I.  So my responses are limited… You have to ask the right question.

 

 
  [ # 11 ]

Hi,
Thanks!

I think time-rate changes (ft/sec, gallons/sec, etc.) are about as much as I’m willing to think about 4D for the moment. =)  I’ll google your hero and see what he has written.  My problem in reflecting upon AI chat design is finding a good point to start. There is so much…and with each consideration there comes even more.  The symbolic diagrams seemed a natural way for me to begin thinking about this problem.  I’m keen to develop it further.

I was a staunch Basic programmer prior to ‘95. Took a course at work in C programming…parallel to VB 3,4,5, and 6.  I taught myself simple C++ (no class creation) and wrote my first 3D game for a competition in 2003 or so…using GLUT and OpenGL.  Of course, the judges were using XP and I wrote my game on ME (remember that OS =) ) and the program simply wouldn’t run. Anyhow, I’ve been developing a 2D game engine for the past 2 years. I’ve been tweaking it with every game.  I would like to use it as the framework for a chat bot test bed for experimenting.

David….I ran into him on the forums….his photo looks smart. =)

Regards,
Chuck

 

 
  [ # 12 ]

Hi,
I’ve been rewriting my earlier doc above to add more depth to abstract symbolism.  By “abstract”, I mean that the symbols may be used for a variety of meanings and things and relationships.  I’ve got a background in analog and digital electronics and basic PID controllers so I hope to blend all of the concepts into a comprehensive abstract system (grammar and syntax).

Incidentally, I’ll leave the implementation in C++ (how to implement) alone until I’ve finished.

I’m very interested in building a virtual world to use as a test bed for developing the chat bot. To that end, I joined up with a friend in Great Britain to build a 3-axis isometric game for a commercial endeavor.  I should be able to use the engine that I develop to model the virtual world. I should have it ready by the middle of March…if all goes well.

Regards,
Chuck

 

 
  [ # 13 ]

The symbols in your diagrams made me think of Simulink, a graphical front end for Matlab simulations. Have you ever used this or something similar (e.g. VisSim)? It’s a powerful tool for simulating nonlinear control systems etc. And Matlab has its own number crunching language.

If you’re thinking about simulating dynamical systems this might be the way to go, rather than writing everything in low level C++. Or you could prototype it in Simulink/Matlab and translate it. There’s an add-on called Real-Time Workshop which converts Simulink models into stand-alone C code. I haven’t used it myself however.

Unfortunately Matlab is a commercial product so you’ll have to fork out $$$ to use it. I believe there’s an open source equivalent called Octave but I have no idea how good it is.

David

 

 
  [ # 14 ]

Hi David,
It’s been nearly 10 years since I played with Matlab.  I am unfamiliar with Simulink. I will take a look-see and learn a little more about it.

My immediate purpose for a graphic symbolic language is to allow me to sketch relationships on paper quickly.  If the abstract language is “inadequate” to describe a real world scenario then I would hope to add more symbols and rules. 

Down the road, a graphical front end for ‘programming’ the bot’s knowledge base could be a real time saver.

I am very interested in coding myself…so I will probably “reinvent the wheel” instead of purchasing other tools. =)

Regards,
Chuck

 

 
  [ # 15 ]

Oh well…

I just found out my proposal to do a chapter in a book on conversive agents
has not been approved.  My consolation prize is an ALL NEW conversive agent engine I wrote in php which runs on the server side.  So, I am not that disappointed, I suppose. 


At the moment, registration is required to talk to the conversive agent.  There is no forum there. I surgically transplanted the conversive agent module I built in place of the forum module phpBB built. This way I could demo the conversive agent to select people without releasing it to the public.  Perfectly legal, the beauty of open source.

Now, I have to decide how I want to publish the #1 examplebot in the world.
http://www.examplebot.com now #1 of 25,500 results in Google
for search term: examplebot ... Proof of how popular A.I. is in Web 3.0 !

 
