
Hi AI experts and fellow bot enthusiasts! Introducing my bot Yoko
 
 

Ok guys, this is a big step for me, but after weeks of fascinated and admiring reading around here, I’ve decided that being able to share and hear the opinions of you AI experts and fellow chatbot enthusiasts is worth a lot more to me than keeping my project ‘top secret’.

Too many cool projects never see the light of day because a deluded founder tries to keep his ‘world changing project’ from being stolen, and it just goes nowhere instead. Would be silly to see that happen to Yoko.

So, well, errr, tataah: here’s another chat bot project :) I guess it’s a bit different from many others here in that it is not (yet?) AIML based. Instead I built it 100% from the ground up, reasoning ‘on my own’ for the most part (though I’ve started to read up on current and past research/projects in chatbottery, and boy is it fascinating!).

Its distinguishing ‘design feature’ is mainly that it tries to keep the ‘understanding how the world works’ and ‘communicating about it’ aspects as separate as possible by design. Nothing original here, though for some reason this approach is pretty hard to find. It often makes my life harder, and I find myself wishing I could just skip the ‘understanding’ step and go straight from input to pattern to output, but I am trying to be disciplined about it.

As it’s all PHP, I do have some ‘gimmicks’ in there, like Yoko being able to solve algebra pretty well (something I’ve seen discussed here as being difficult to do with AIML), or trying to parse out the Wikipedia definition if she doesn’t understand a word. But I view these more as spielerei than as what actually matters (as most of you will agree, I think).

My chatbot can be found at yokobot.com (http://www.yokobot.com). I have opened up the chat for all to play around with (prepare to be disappointed :)). The other sections I am a bit more ‘touchy’ about, but if you’ve made it this far, surely you’re the right kind of person: you can access those sections using username ‘chatbots’ and password ‘org’.

SO MY QUESTION TO Y’ALL: if you have a couple of minutes to spare, have a look at Yoko’s ‘about’ section, which is chaotic but extremely detailed. I would love to hear your thoughts!

 

 
  [ # 1 ]

Maybe some more stuff that may be interesting to know:

- performance is terrible, because it’s an aspect I am deliberately ignoring for now. Each phrase typically requires a boatload of MySQL SELECT statements, does some writes for logging, loads and parses some JSON files, etc. Algorithms first, performance next, that’s my motto for now :)

- feel free to ask me any questions, I’d love to start a conversation! The topics currently most on my mind are:

- getting Yoko a ‘life’ so she can talk about events happening and have a sense of causality and ‘feelings’ about events.
- maintaining ‘state’ in the conversation, including meaning of ‘he’/‘it’/‘they’ etc etc
- choosing cleverly what to reply with between the many ‘reactions’ the system typically generates from a single input phrase. Currently she just blurts it all out.

 

 
  [ # 2 ]

Welcome Wouter.

I did not find performance of Yoko to be a limitation.

The transcripts presented at the web site are not the same as what went on during the conversation. I don’t know if the difference in logs is intentional or not.

But I would suggest a bit more work on her “self knowledge”, so that she understands things like her own name.

Good luck.

 

 
  [ # 3 ]

Aha, good point, and thanks for taking the time to have a look at her!

It’s been one of the rather shocking things for me to realize indeed, after only a handful of chats: people want to talk about HER all the time. Will do!

 

 
  [ # 4 ]

When you put an avatar with a name up, unlike a plain input form, the user will attempt to engage the bot as they would with another human.

 

 
  [ # 5 ]

Welcome to chatbots.org Wouter. You’ve made a very good start with Yoko and I hope you’ll continue to develop “it” further. (Sorry, I still can’t bring myself to anthropomorphise software, not my own or anybody else’s.)

While your modular design is not so common among enthusiasts because of its complexity and the amount of effort that’s required to get it off the ground, I believe it is the right way to go, and you’ll find there are a few of us who are indeed working that way and who might be able to help you.

I tried talking to Yoko but couldn’t get it to recognise anything that I said. It happens to everyone the first few times, the main thing is to get data that you can use to improve it and debug it.

One comment about your notes, there was a lot of good work done in the seventies and it was never really abandoned. Mostly it just had to be put aside until computers became more powerful. For example a really big new computer at my university when I was there had a whopping 2 megabytes of memory!

When parsing tables for even the simplest of grammars could take up hundreds of kilobytes, people didn’t get very far with symbolic natural language processing methods back then. Nowadays it is not unusual for me to be working with multi-gigabyte parsing tables and grammars with millions of rules. Some projects like Watson and OpenCog need 80 gigabytes of RAM just to load the software and never mind how much disk space for the data.

 

 
  [ # 6 ]

Welcome. Your chatbot seems commendable, I liked seeing that its “interpretation” shows it to possess some depth of understanding.
I must say you and I seem to have a very similar approach :D , albeit with distinctly different aims. I may have to share some of my secrets with you as well, though I don’t wish to influence you too much. You’ll eventually discover that English grammar isn’t as strictly structured as we were taught in school, but basic grammar gets you a long way.
As to some of your difficulties, it helped me to look up rules of spelling and word morphology until I noticed common patterns that I could program. And for “he/she/it” etc, I found that it most often refers to the latest-mentioned subject or object of that type (person or thing), so since you’re using grammar, perhaps Yoko could remember a small list of recently mentioned subjects that one might later refer to, and then translate “he” to the latest appropriate word listed.
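To make the “small list of recently mentioned subjects” idea concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the `RecentMentions` class, the `PRONOUN_KIND` table and the two kinds (“person”, “thing”) are made-up names, not anything taken from Yoko or from my own program.

```python
from collections import deque

# Hypothetical pronoun-to-kind table; a real bot would need a richer one.
PRONOUN_KIND = {
    "he": "person", "him": "person", "she": "person", "her": "person",
    "it": "thing",
}


class RecentMentions:
    """Remembers the last few mentioned entities, newest first."""

    def __init__(self, size=5):
        self._items = deque(maxlen=size)

    def mention(self, word, kind):
        # The newest mention goes to the front; the oldest falls off the end.
        self._items.appendleft((word, kind))

    def resolve(self, token):
        kind = PRONOUN_KIND.get(token.lower())
        if kind is None:
            return token  # not a pronoun we know about
        for word, word_kind in self._items:
            if word_kind == kind:
                return word  # latest mention of the right kind
        return token  # nothing suitable has been mentioned yet


memory = RecentMentions()
memory.mention("Wouter", "person")
memory.mention("cat", "thing")
resolved = [memory.resolve(t) for t in "he fed it".split()]
print(resolved)
```

The parser would call `mention` for every subject or object it recognises, and `resolve` before looking a word up in the knowledge base.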
I encourage you to continue working on the “understanding” phase. It may feel like slow progress, but ultimately I think that is what people really want out of chatbots, and what they have most been lacking.

 

 
  [ # 7 ]

Yoko said “multiple instances of Yoko exist.” or something, so apparently she couldn’t determine which Yoko I was talking about?

I hope you don’t mind, I was inspired by your “my cat died -> that sucks” scenario. It occurred to me that I could formulate an “opinion” answer instead of a factual answer if I connected a few existing functions of my program. I hadn’t thought of it yet because implementing opinion is years away on my to-do list, and even when I do so it won’t be with as rich a personality as your project.
In return, a few suggestions: If you are parsing Wikipedia, you may find it helpful to know that there is a Simple Wikipedia with easier sentence structure.
Another point: Words like “hoped” aren’t all that irregular, they still follow a common spelling rule. To find the matching verb in Yoko’s vocabulary, one simple way is to take off the “-d” and check if the remaining “hope” is a class, if not, then take off the “-ed” entirely and check if “hop” is a class. Similar things could be done for plural -s/-es/-ies and -ie/-y. Plenty of inconsistencies remain though, it just depends on how far you’re willing to go.
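A rough Python sketch of that “-d, then -ed” lookup, assuming the vocabulary is simply a set of known words (the `find_base` name and the sample vocabulary are invented for the example):

```python
def find_base(word, vocabulary):
    """Return the vocabulary entry a past-tense form maps to, or None."""
    word = word.lower()
    if word in vocabulary:
        return word
    # Try the shorter cut first ("hoped" -> "hope"), then the longer one
    # ("walked" -> "walk"), in the order described above.
    for suffix in ("d", "ed"):
        if word.endswith(suffix):
            candidate = word[: -len(suffix)]
            if candidate in vocabulary:
                return candidate
    return None


vocabulary = {"hope", "hop", "walk"}
print(find_base("hoped", vocabulary))
print(find_base("walked", vocabulary))
print(find_base("hopped", vocabulary))
```

The last call finds nothing, because a doubled consonant (“hopped”) needs its own rule. That is one of the inconsistencies that remain.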

 

 
  [ # 8 ]

I don’t wish to seem disrespectful, but you might want to “skip the ‘understanding’ step and go straight from input to pattern to output”.  At least until you get off the ground.  As it stands, I don’t have the sense that you’ll get many return visitors, or word of mouth traffic… people telling their friends about Yoko.  And, you need traffic to get better.

Above all, you need to chat with other bots and get an idea of what people expect when they visit.  They’ll tell you when you’re doing well and when you’re failing.  I’m sorry to say that I didn’t feel as though there was anything of value in my conversation with Yoko.  From your introduction, I was expecting more.

Yoko: hi You can call me Thunder., nice to meet you

You: What’s my name?

Yoko: errr - come again?

You: What is my name?

Yoko: I don’t know.

You: My name is Thunder.

Yoko: sorry, that didn’t make sense to me.

You: How fat are you?

Yoko: dunno.

You: What’s your name?

Yoko: I don’t understand.

You: Please introduce yourself.

Yoko: say what now?

You: Are you new?

Yoko: ok you lost me.

 

 
  [ # 9 ]

Hi Wouter, welcome to the forum! :)

Your project sounds pretty cool! I’m glad you decided to share it with us. I love hearing about new approaches to chatbot design, and it looks like you’re putting a lot of thought into it.

I also had a similar encounter with your bot as Thunder, but I admire your fortitude in putting Yoko out there and involving others in your creative process. I look forward to seeing the project develop.

Some of the problems you’ve mentioned with word morphology (verb tenses, plurals, etc.) have already been handled pretty well with open software. I’m working in python, so I’m not sure what’s out there for php. But for example, you might want to check out the Nodebox Linguistics package. It may at least provide some ideas.

 

 
  [ # 10 ]

(I’m splitting my post in two because it grew into a monster. This is the rambly bit…)

I like your approach of generating classes for different objects and using class inheritance for hyponyms. Are you generating these classes yourself? Can your bot (or would you like it to) generate these classes and their attributes from natural language?

I’m working on a bot “from the ground up” as well. I’m trying to keep its knowledge base as free-form as possible (as close to grammatically correct English as possible). The general problem with structured ontologies is that they tend to only capture a sub-set of relationships between objects/actions/etc. Using natural language as the database structure, we necessarily get around that restriction.

The catch is that you must be able to accurately parse your natural language input in the first place. And then you must have a more extensive (or, even better, flexible) logic system to utilize that database for information retrieval. (In our case, for generating a chat response.)

The reason I’m interested in your class/instance system—and in your notes on the duration of states as well—is that I’m in the process of designing another level above the ontology system for my bot. I call this the “story” level (I think I’ve mentioned it in the past in other threads…), where facts from the knowledge base are integrated with contextual information. For example, grouping together different knowledge base objects that represent the same object or action in a conversation. Or grouping objects together in a time sequence. One could even treat a “story” as an object within another story. (The story of how to spread peanut butter on bread is part of the timeline of the story of making a pb&j, for instance.)
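To show what I mean by a story containing other stories, here is a toy Python sketch. It is not my actual design: the `Story` class, its fields and the example steps are assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Story:
    """An ordered sequence of steps; a step is a plain fact or another Story."""
    title: str
    steps: List[Union[str, "Story"]] = field(default_factory=list)

    def timeline(self):
        """Flatten nested stories into one ordered list of facts."""
        facts = []
        for step in self.steps:
            if isinstance(step, Story):
                facts.extend(step.timeline())
            else:
                facts.append(step)
        return facts


spread = Story("spread peanut butter on bread",
               ["open the jar", "scoop with a knife", "spread on bread"])
pbj = Story("make a pb&j",
            ["get two slices of bread", spread, "add jelly", "close the sandwich"])
print(pbj.timeline())
```

Keeping steps as free text is deliberate here. It avoids committing to a rigid structure too early, at the cost of having to parse the text later.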

After spending quite a bit of time (I don’t have much to spare for this project, alas) re-working my ontology data structures (done done done! :) ), I want to be sure I have a good design in mind before implementing the “story” level. (So far I’ve just played around with toy versions.) Again, the danger here is that by being overly-restrictive on the structure of the information, you risk losing out on relevant data.

Anyway, this post rambled on longer than I intended. :) I’d be curious to hear your thoughts on data structures.

 

 
  [ # 11 ]

All right guys, this has been awesome - thanks for having a look at my project, and sharing your thoughts! I am already convinced that jumping in with you was a great step. The least I can do is provide y’all with extremely detailed answers with my reactions to your comments, so I am going to do just that. Sorry if this bores the sh*t out of you :) CR, if you thought your post was long, hold your horses for this one :)

Look for @yourname in here to just read my reactions to your specific comment.

@Merlin

The transcripts presented at the web site are not the same as what went on during the conversation. I don’t know if the difference in logs is intentional or not.

This is a bug - a recent change caused only the user’s phrases, but not Yoko’s, to be stored. Thanks for catching it, fixed! (Not for older convos, unfortunately.)

When you put an avatar with a name up, unlike a plain input form, the user will attempt to engage the bot as they would with another human.

True, and in fact deliberate. Like all of you I dream of eventually reaching full ‘Turing-mode’, and if leveraging the ELIZA effect (https://en.wikipedia.org/wiki/ELIZA_effect) helps with that, I’m using it :)

@Merlin

One comment about your notes, there was a lot of good work done in the seventies and it was never really abandoned. Mostly it just had to be put aside until computers became more powerful. For example a really big new computer at my university when I was there had a whopping 2 megabytes of memory!

This has occurred to me as well - and it’s a bit of a shame that aiming for full conversational agents has become such a niche topic (even though I’m probably not aware of much of the research) in favor of statistical approaches, semantic web, translation, Siri-like stuff and things like that. These days, computers have become so much more powerful and programming languages so much more accessible to the masses (not more expressive mind you, LISP still rules there, but sadly that’s far from mainstream) that surely there should be more interest than the same people participating in the Loebner Prize year after year.

Oh well, more chance for lone madmen like us to be the ones to write history ;)

@Thunder Walk:

I don’t wish to seem disrespectful, but you might want to “skip the ‘understanding’ step and go straight from input to pattern to output”.  At least until you get off the ground.  As it stands, I don’t have the sense that you’ll get many return visitors, or word of mouth traffic… people telling their friends about Yoko.

No offense taken at all, I’m glad for your thoughts and that you took the time to have a look.

However, neither of these things interests me half as much as what fellow bot-tinkerers have to say at this point. For data/inspiration, I can still turn to all the chat transcripts from the Loebner and other prizes (and many of those have indeed guided me a lot), and, well, I am going at this more from the ‘structuring the world and understanding language’ angle than the ‘hold a natural conversation’ angle anyway.

I obviously hope that at some point Yoko will reach a magic ‘turning point’ where she will become indistinguishable from a real human, and have perfectly original and natural ‘thoughts’, a sense of humor, and even a ‘life’, but I’m far from there yet.

You: What’s my name?
Yoko: errr - come again?

This is a wonderful illustration of why my approach of separating language and knowledge makes creating Yoko so difficult, but - in my eyes - so much fun to think about :) So allow me to give a very, very detailed breakdown of what your little “what’s my name” input brought about. Sorry again for the length.

FIRSTLY

- as I noted before, one huge central distinction I model Yoko’s world by is classes versus instances. Unfortunately this is one of many things that are often hard to distinguish purely by phrase structure (‘Coldplay are musicians’ versus ‘trumpeters are musicians’). So in my ‘baby grammar’ for now I use - among other things - CASE as a distinguisher: words that start with uppercase are seen as instances. Hence none of your neatly capitalized phrases matched any of Yoko’s patterns :)

I just realized that QUESTIONS never start with instances though (think about it :) in a first approximation questions are wonderfully easy to detect in English: they start either with a question word like what/how/who… or with ‘do’, ‘does’, ‘would’, ‘should’, ‘are’ or ‘is’, and a couple more like that), so I refined my parser a bit to take this into account. Determining whether or not a phrase is a question happens early on in the parsing anyway, so this didn’t take too much time. Questions may from now on start with a capital without confusing Yoko!
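For those who like to see it spelled out, the check boils down to something like this. It is a Python sketch rather than Yoko’s actual PHP, and the function names and the extra question words (where/when/why/which) are filled in by assumption:

```python
# First words that mark a question. The post only lists what/how/who and the
# auxiliaries; "where", "when", "why" and "which" are assumed additions.
QUESTION_STARTERS = {
    "what", "how", "who", "where", "when", "why", "which",
    "do", "does", "would", "should", "are", "is",
}


def is_question(phrase):
    words = phrase.strip().split()
    if not words:
        return False
    first = words[0].lower()
    # Handle contractions such as "What's" by cutting at the apostrophe.
    for apostrophe in ("'", "’"):
        first = first.split(apostrophe)[0]
    return first in QUESTION_STARTERS


def starts_with_instance(phrase):
    """Case heuristic: a capitalized first word marks an instance,
    unless the phrase is a question."""
    words = phrase.strip().split()
    if not words or is_question(phrase):
        return False
    return words[0][0].isupper()


print(is_question("What's my name?"))
print(starts_with_instance("What's my name?"))
print(starts_with_instance("Coldplay are musicians"))
print(starts_with_instance("My name is Thunder."))
```

Note that the last call still reports an instance: a capitalized statement like ‘My name is Thunder.’ keeps tripping the case heuristic, which is why relying on case has to go eventually.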

Eventually I want to get rid of relying on case entirely of course (because we humans don’t), but not quite yet.

[continued in next post]

 

 
  [ # 12 ]

[CONTINUED]


SECONDLY
After fixing the above, internally ‘What’s my name’ gets simplified to ‘what is [user]’s name’ as it should, which we can parse. However, only in a few cases does the “what is [Something]’s [something]” phrase ask which specific instance from a class (name) has a ‘possession’ relation with the instance ‘the user Yoko’s talking to’. Basically these are ‘what is my name’ and ‘what is my job’ :)

HOWEVER, ‘what is Instance’s [xxx]’ in all other cases means ‘which value does Instance have for property [xxx]’. Like these: what is Instance’s mood, what is Instance’s color, what is Instance’s temperature, what is ... This is a different ‘meaning’: the first asks for a ‘possession’ relation, the second asks for a ‘value of property’ relation. Until now I defaulted to the second, more common, meaning.

(note that a much more common ‘which instance’ question structure starts with ‘who is’)

But, as is my reply always: no problem, language is messy, we’ll just teach Yoko that ‘what is my name’ and ‘what is my job’ are special cases of that phrase structure! Thanks for pointing this out!
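In sketch form, the special-casing looks roughly like this. Again this is Python for illustration, not the real PHP, and `POSSESSION_NOUNS`, `interpret` and the relation labels are names I am making up here:

```python
import re

# Nouns for which "what is X's <noun>" asks for a possessed instance rather
# than a property value. Hypothetical list; the post only names these two.
POSSESSION_NOUNS = {"name", "job"}

# Matches "what is <owner>'s <noun>", with a straight or curly apostrophe.
PATTERN = re.compile(
    r"^what is (?P<owner>.+?)['’]s (?P<noun>\w+)\??$", re.IGNORECASE
)


def interpret(phrase):
    """Return (relation, owner, noun), or None if the phrase does not match."""
    match = PATTERN.match(phrase.strip())
    if match is None:
        return None
    owner = match.group("owner")
    noun = match.group("noun").lower()
    # Default to the more common 'value of property' meaning.
    relation = "possession" if noun in POSSESSION_NOUNS else "property_value"
    return (relation, owner, noun)


print(interpret("what is [user]'s name"))
print(interpret("What is Yoko's mood?"))
```

The point is only that the same phrase structure maps to two different relations, chosen by the noun.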

So, fixing the 2 cases above: FIXED! (What would take 3 lines in AIML took me an hour of coding in 3 different places, but worth it!) Note that at the start of each convo Yoko already stored that the currently chatting user has a ‘name’ and that name is [whatever you entered] - http://grab.by/oWJa - your question just didn’t reach that piece of info yet. Generally, my ‘structure the db to store knowledge’ code is a little ahead of my ‘talk about that knowledge’ code :)

@Don Patrick

Yoko said “multiple instances of Yoko exist.” or something, so apparently she couldn’t determine which Yoko I was talking about?

Aha, you stumbled onto Yoko falling back to Wikipedia, and then hitting a disambiguation page :)

@C.R.Hunt.
Seems like we think along similar lines indeed! I’ve put a lot of thought into ‘story’ stuff as well for Yoko, and it seems to match very well what I do with ‘events’ and (I consider this a major breakthrough in her development) above all ‘hypothetical events’! Yoko’s hypothetical events are/will be awesome, because they are used to store Yoko having feelings not just towards classes (‘spiders’) or instances (‘Nicolas Cage’) but also towards ‘certain stuff happening’ (the hypothetical event of ‘cats dying’)! What’s more, events and hypothetical events are the key to storing causality, and ‘sequences’ of events, i.e. your stories :) I wrote on this in Yoko’s ‘about’ section, have a look! And you can actually view Yoko’s entire data structure over here: http://www.yokobot.com/index.php?p=data&s=listtables.

As for

Can your bot (or would you like it to) generate these classes and their attributes from natural language?

Absolutely, yes. Far more than ‘responding to questions like a real chatbot’, so far the vast majority of my work on Yoko has gone into working out how to structure the world in a relational DB, and getting her to properly understand and store phrases like ‘cats are animals’ and the like. So you can tell her (one phrase at a time) ‘cats are mammals.’, then ‘mammals are animals’, then ask ‘are cats animals?’, and she should know they are (and that you are the one that taught her). As I said, I reset Yoko’s knowledge constantly while developing.
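Stripped of the database, the cats/mammals/animals example comes down to following ‘is a’ links upwards. A minimal in-memory Python sketch, assuming a plain dict instead of the relational tables Yoko really uses (the `Knowledge` class and its method names are invented for this example):

```python
class Knowledge:
    """Tiny in-memory stand-in for the class hierarchy tables."""

    def __init__(self):
        self.parents = {}    # class -> set of direct parent classes
        self.taught_by = {}  # (child, parent) -> who said so

    def tell(self, phrase, speaker):
        """Store a '<xs> are <ys>' phrase; return False for anything else."""
        words = phrase.strip(" .").lower().split()
        if len(words) == 3 and words[1] == "are":
            child, parent = words[0], words[2]
            self.parents.setdefault(child, set()).add(parent)
            self.taught_by[(child, parent)] = speaker
            return True
        return False

    def is_a(self, child, ancestor):
        """Follow parent links upwards until the ancestor is found."""
        seen, todo = set(), [child]
        while todo:
            current = todo.pop()
            if current == ancestor:
                return True
            if current in seen:
                continue
            seen.add(current)
            todo.extend(self.parents.get(current, ()))
        return False


kb = Knowledge()
kb.tell("cats are mammals.", "Wouter")
kb.tell("mammals are animals", "Wouter")
print(kb.is_a("cats", "animals"))
print(kb.taught_by[("cats", "mammals")])
```

In a relational DB the same walk becomes a repeated SELECT on a child/parent table, which is part of why each phrase costs so many queries.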

I did recently start building a ‘basic knowledge’ file for Yoko, which is basically a series of phrases like that. In some future version I’m gonna stop teaching her myself, and instead go crazy on all those cumbersome WordNet/OpenCyc/AIML/FrameNet/DBpedia… (see Yoko’s ‘about’ section > resources :)) structured knowledge databases in the world, write a bunch of scripts to convert them to simple phrases like the above, and ‘feed’ all of those to Yoko. Here is how you can ‘upload’ data in Yoko: http://www.yokobot.com/lib/yokoknowledge_wine.json How’s that for readability :) Trying to get Yoko to ‘perform’ in real conversations is rather recent - the challenge is in ‘representing the world’ and ‘understanding language’.

So, C.R., if you want to think/talk data structures: have a look at Yoko’s about > data section to see exactly how I represent Yoko’s data (it’s a relational database, in short). I made some small tools that try to visualize what Yoko knows. For example, entering ‘things’ (my broadest class) or ‘drinks’ in the ‘data’ section input field gives an idea of her basic class/instance hierarchy knowledge. Each of those classes has possessions (‘cars have wheels’), possible actions (‘objects can be moved’), events (‘Yoko talked to Wouter’) etc etc stored with them as well, but I didn’t get around to showing those on the chart yet.

I do store a lot of meta-information regarding actions/events, e.g. if it is an action that requires an ‘object’ (you give SOMETHING), etc…. Again, full disclosure if you look at the data tables listed.

I have to again fight my paranoia here about people stealing my ideas, and secretly wish somebody would give me a research grant or something for this, but fuck that, chat bots advancing would be awesome so it’s worth it (and mostly I’m not so arrogant as to think that nobody had my thoughts before, quite the opposite).

Anyway, thanks again for the feedback guys - fixes have been made based on it, but they will only apply to NEW convos. (I reset Yoko’s knowledge almost daily anyway while developing; it’s about the mechanisms more than the actual knowledge at this point.)

 

 
  [ # 13 ]

For data/inspiration, at this point I can still turn to all those Loebner and other prizes chat transcripts.

Nothing wrong with that, I think most botmasters do the same.  However, I prefer to rely more on my chatlogs than contest transcripts, and for several reasons.

First, contest questions are created to test the limits of a chatbot’s abilities, and rarely have anything to do with natural (conversational) language, both in form and content.  Anyone who really wants to know the square root of 9 won’t be visiting a chatbot for the answer.

Secondly, contest questions are usually limited to a certain number, which generally doesn’t leave room for follow-up questions—and that’s where (I believe) you separate the good from the average.  With regard to the Loebner, Bruce Wilcox said it best. (http://www.chatbots.org/ai_zone/viewthread/1370/#14063)

Actually, the Loebner’s is a schizophrenic contest. You have to qualify by answering questions, which means there is no need to anticipate a follow up or deliver a conversational tone.  THEN, if you qualify in the top 4, you have to have a conversation with the judges, which has nothing whatsoever to do with question answering per se.

Lastly, chatlogs are created by real people talking in real time, and reflect how your visitors “speak”.  They probably don’t spend hours over the wording of a particular question to ensure that the meaning is clear and that it’s a “fair” question according to the rules.  For the most part, they reply the way they would to another person, not a judge who is weighing the answer for accuracy, or whether or not the response was delivered in a creative, clever, or humorous way.

I agree with most of what you said regarding questions.  But, one of my complaints about AIML is that it uses punctuation as sentence splitters, but doesn’t recognize their meanings.  Doctor Wallace explained it ( http://www.chatbots.org/ai_zone/viewthread/1226/#12964 ) but, the fact remains it’s frequently an issue for me.  When people offer input such as, “You like the Beatles?” or “That’s all there is?” wording that can be interpreted as statements as well as questions, my bots will sometimes ask for clarification.  “Are you asking me, or telling me?” or “Is that a statement, or a question?”  That usually prompts an insulting remark that includes, “The question mark was a clue.”
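For what it’s worth, here is the kind of thing I wish happened before pattern matching, as a small Python sketch. It is not how any AIML interpreter works; `split_sentences` and `kind` are hypothetical helpers that split the input while keeping the terminal mark, so a trailing “?” can settle the question/statement ambiguity.

```python
import re


def split_sentences(text):
    """Split on '.', '!' and '?' but keep each sentence's terminal mark."""
    parts = re.findall(r"[^.!?]+[.!?]*", text)
    return [p.strip() for p in parts if p.strip()]


def kind(sentence):
    """Use the kept punctuation as the clue it was meant to be."""
    return "question" if sentence.endswith("?") else "statement"


sentences = split_sentences("You like the Beatles? I like them.")
print([(s, kind(s)) for s in sentences])
```

With the mark preserved, “You like the Beatles?” can be treated as a question even though its word order is that of a statement.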

 

 
  [ # 14 ]

Just a quick note to say that, yes indeed, I must say transcripts are helping me a lot!

(and making me realize the hard way how little Yoko still understands :))

 

 
  [ # 15 ]

I think your Yoko project is great!  In other words, I can not say I agree with criticisms on this thread.  But I am not putting anyone down. This is an advanced chatbot design.  (Instead of “wine”, try “drinks” below.)

Stimulus:

Yoko, what do you know about wine?


Response:  (NOTE: flowchart graph converted to plain text for discussion purposes.)

Info about: wine Bordeaux red Brunello Chardonnay Chianti Merlot Pinot Blanc Rysling champagne Nicolas Feuillatte Veuve Clicquot.

Very cool domain name too!

Reference: http://www.yokobot.com/index.php?p=data

 
