
Project River—Need Starting Advice
 
 

Hello everyone! I’m obviously brand new to the forum, but I’m very glad to have found a site that is solely focused on my current area of interest.

A bit about me:
I’m a Web Programmer with a background in Computer Science; Web just happens to be where the jobs were.

I’m fluent in C and Java, as well as the PHP/MySQL/HTML/JavaScript suite. I’ve dabbled in Python and Lisp, and have a pretty good handle on C++.


Okay now for the exciting bit…


Project River:
———————-

Project River is my own pet project of creating a Smart House. I’m starting small though, and focusing on writing a Natural Language companion in C++.

While the end goal is to emulate something along the lines of “S.A.R.A.H.” from Eureka or Jane from Ender’s Game, I’m going to take it in baby steps.

River uses the Microsoft Speech API 5.3 along with voices from NeoSpeech to speak to the user.
She currently listens to the user through SAPI 5 as well; however, I’m in the process of converting her to use Dragon NaturallySpeaking for her speech recognition.


My primary focus for River is to make sure that she is capable of at least minimal learning. I want her to be able to slowly learn from every conversation she has with the user, as well as use WordNet, and other information sources to expand her knowledge.


Unlike the majority of chatbots that I read about, River will work solely in a single location and will have a vast amount of computing power available to her. She will only cater to one or two users at a time, which will allow the computing power to be more focused.


What I need is advice on where to start. I’ve read extensively today about ...

1. Pattern and Topic Matching - such as AIML and the core A.L.I.C.E. bot
2. Statistical Matching - Namely: http://courses.ischool.berkeley.edu/i256/f06/projects/bonniejc.pdf
3. First Word approach—first word is often an indicator for a good response.
4. Lowest Frequency Approach—words with lowest frequency are most significant and should help dictate response (related to statistical)
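
A rough C++ sketch of approach 4, assuming a word-frequency table has already been built from whatever training text is available. The names `tokenize` and `rarestWord` are made up for illustration, and treating unknown words as frequency 0 (so they count as the most significant) is an assumption, not something taken from the sources above.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Split on whitespace, lowercase each word, and drop non-alphanumeric characters.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> words;
    std::istringstream in(text);
    std::string raw;
    while (in >> raw) {
        std::string word;
        for (unsigned char c : raw) {
            if (std::isalnum(c)) word += static_cast<char>(std::tolower(c));
        }
        if (!word.empty()) words.push_back(word);
    }
    return words;
}

// Return the input word with the lowest corpus frequency.
// Words missing from the table count as frequency 0 (assumption).
// On a tie, the earliest word wins. Returns "" for empty input.
std::string rarestWord(const std::string& text,
                       const std::unordered_map<std::string, std::size_t>& freq) {
    std::string best;
    std::size_t bestCount = 0;
    bool found = false;
    for (const std::string& w : tokenize(text)) {
        auto it = freq.find(w);
        std::size_t count = (it == freq.end()) ? 0 : it->second;
        if (!found || count < bestCount) {
            best = w;
            bestCount = count;
            found = true;
        }
    }
    return best;
}
```

With a table where "weather" is far rarer than "what", "is", and "the", the input "What is the weather?" would be keyed on "weather".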

I’m very intrigued as to the science and math behind Victor’s CLUES V3 implementation, and would love more information about the methods he’s using.


Based on the background of what I’ve learned so far, and the fact I’m using C++—does anyone have any tips for me as to where I should start?

As a parting thought, I’ll leave you with one of the core ideas I hope to include in my engine. I intend to program in key phrases such as “that didn’t make any sense”, which would in turn trigger the bot to adjust its algorithm regarding the reply it just issued. In the simplest form it would mark that reply to never again be given for a similar question, but ideally it would do some work to figure out why it was a bad response.
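
In its simplest form, that feedback idea might look like the C++ sketch below. `FeedbackFilter` and its methods are hypothetical names, and two simplifications are assumed: the feedback phrase is matched exactly (with a plain ASCII apostrophe), and "similar question" is reduced to "identical question".

```cpp
#include <set>
#include <string>
#include <utility>

class FeedbackFilter {
public:
    // Remember the most recent exchange so later feedback can refer to it.
    void recordExchange(const std::string& question, const std::string& reply) {
        lastQuestion_ = question;
        lastReply_ = reply;
    }

    // If the input is the feedback phrase and there is a previous reply,
    // ban that reply for that question and return true.
    bool handleFeedback(const std::string& userInput) {
        if (userInput != "that didn't make any sense" || lastReply_.empty()) {
            return false;
        }
        banned_.emplace(lastQuestion_, lastReply_);
        return true;
    }

    // A candidate reply is allowed unless it was banned for this exact question.
    bool isAllowed(const std::string& question, const std::string& reply) const {
        return banned_.count(std::make_pair(question, reply)) == 0;
    }

private:
    std::string lastQuestion_;
    std::string lastReply_;
    std::set<std::pair<std::string, std::string>> banned_;
};
```

The response chooser would call `isAllowed` on each candidate before speaking it. Working out why a reply was bad is a much harder problem and is not attempted here.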

 

 
  [ # 1 ]

Your programming skills sound better than mine.  However, it seems hard to get enthusiastic about doing something from scratch.  For instance, Guile 3D Denise is a pretty all-inclusive product, and you can still customize the intelligence to your heart’s content.  In any case, I reckon modularity is the key to any conversational agent.  My feeling is that for practical purposes the AIML Superbot is all most people will ever need for the vast majority of interaction, and it is still highly customizable.  All the botmasters with learning chatbots seem overly squeamish about letting the general public use them, for fear of the public dragging them down to the lowest common denominator (a dumbing-down effect).  Basically, if a chatbot can’t handle real-world situations, then what good is it?  Latent semantic indexing and probabilistic inference are likely more suited to big data, IMHO.  I think it is important to keep in mind that humanoid robotics is predicted to become the biggest industry the world has ever seen; the question is just when.  What I think this means for chatbot developers is that there will eventually be a great demand for a diversity of intelligence and character, maybe even more than for specific platforms.  I can foresee intelligence and character being translatable between platforms; so in that sense, “mindfiles” could prove to be a better investment of time and energy than platform specifics.  Like children, learning chatbots might take a long time to mature, and even then their direction can’t be guaranteed.  ;^)

- Marcus Endicott
  http://twitter.com/mendicott

 

 
  [ # 2 ]

Thanks for the input. :) Project Denise is certainly interesting, and I’ll be keeping an eye on it. I’ll be more interested if they license out just the AI portion. I would love to find a good C++ library that would handle things like orthographic analysis, etc.


By no means do I want to create my project “from scratch”. I want to use any libraries or help I can get, provided it suits my purposes. The problem I run into with the Superbot or something like Denise (aside from prohibitive cost) is the inability to easily work it into the rest of my project.


Because the end product will be focused on controlling a household environment (appliances, lights, managing inventory of things like paper towels, etc.), I have to make the base code primarily from scratch. I originally wanted to use Project Leaf as my base, but I determined that Lisp would be too impractical to use for a final implementation.


Anyone know of any C++ libraries that can do some of the common NLP tasks?


My thought thus far is…

1) Use statistical comparison to choose a best response, with extra weight given to words that are less frequent.
2) If the statistical method does not find a proper response, use an AIML pattern matching system.
3) If no suitable pattern is found, apologize and ask for an appropriate response.
    - This is not something you would do in a normal chatbot, but would work well for an application like this.
    - Trainer could give a new response, helping to shape the AI personality, etc.
4) User can provide feedback at any time saying things such as “that didn’t make sense” or “never say that again”.
    - AI would then flag the response it gave as invalid or mark it as a banned phrase depending on feedback.
5) AI will record every response ever spoken to it as a valid response to what it had said.
    - Will need a decision engine to determine if it was a valid response or just an unrelated command.
    - Response will need to be gender-tweaked (if possible), so that the bot will keep its intended female tone of speech.
6) AI will constantly search conversation-focused internet sites for new training material (Facebook, G-Talk).
7) Thesaurus functionality, and possibly word stemming, and ideally orthographic analysis.

Point 7 is my sticky point so far. I have a general outline for how I’ll get the rest of it started, but I have no idea how to accomplish #7, and would welcome any ideas or software utilities…
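
For reference, steps 1 to 3 of the plan could be wired together roughly as below in C++17. This is only a sketch under assumptions: `weightedOverlap`, `Responder`, `respond`, and `kAskForHelp` are hypothetical names, the `1 / (1 + count)` weight is just one simple way to favor rare words, and the AIML stage is represented by an arbitrary callable rather than any real engine's API.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Step 1 scoring: add a weight for every input word that also appears in a
// candidate's trigger words. Rarer words (lower count) contribute more.
// Words missing from the frequency table count as 0 (assumption).
double weightedOverlap(const std::vector<std::string>& inputWords,
                       const std::unordered_set<std::string>& candidateWords,
                       const std::unordered_map<std::string, std::size_t>& freq) {
    double score = 0.0;
    for (const std::string& w : inputWords) {
        if (candidateWords.count(w) == 0) continue;
        auto it = freq.find(w);
        std::size_t count = (it == freq.end()) ? 0 : it->second;
        score += 1.0 / (1.0 + static_cast<double>(count));
    }
    return score;
}

// A stage either produces a reply or declines by returning std::nullopt.
using Responder = std::function<std::optional<std::string>(const std::string&)>;

// Step 3 fallback: apologize and ask the trainer for a better answer.
const std::string kAskForHelp =
    "Sorry, I don't know how to answer that. What should I have said?";

// Try each stage in order (statistical first, then pattern matching).
std::string respond(const std::string& input, const std::vector<Responder>& stages) {
    for (const Responder& stage : stages) {
        if (auto reply = stage(input)) return *reply;
    }
    return kAskForHelp;
}
```

The statistical stage would call `weightedOverlap` against each stored response and decline when the best score falls below some threshold; choosing that threshold is left open here.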

 

 

 
  [ # 3 ]

http://rebecca-aiml.sourceforge.net
- cross platform open source AIML development platform for C++

http://shark-project.sourceforge.net
- a modular C++ library for the design and optimization of adaptive systems

 

 
  [ # 4 ]

Fantastic! Many thanks Marcus. I look forward to diving into those over the weekend.

 

 
  [ # 5 ]

Hi,
You can find a lot of info over at AAI, which has an extensive online library of AI-related things like NLP.

You could also take a look at what I am doing on my blog.

 

 
  [ # 6 ]

Well, you’ve certainly taken on a wide load with this project. This may be the “boring” part, but I would recommend starting at the other end of the spectrum—integrating a bot program with some device and having the bot use set commands to do certain tasks related to that device. Maybe it just runs a toaster or perhaps it can tell you how old the milk in the fridge is. (Have all products that enter the fridge have their bar codes scanned first, and recorded along with the date?) Because frankly the NLP part is going to be far more challenging.
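
As a rough illustration of the fridge idea, here is a C++17 sketch that records a scan date per bar code and reports an item's age in days. `FridgeInventory`, `scanIn`, and `ageInDays` are hypothetical names, and keeping only one entry per bar code is a simplifying assumption.

```cpp
#include <chrono>
#include <map>
#include <optional>
#include <string>

using Clock = std::chrono::system_clock;

class FridgeInventory {
public:
    // Record (or overwrite) the time a product with this bar code entered the fridge.
    void scanIn(const std::string& barcode, Clock::time_point when) {
        added_[barcode] = when;
    }

    // Age in whole days at time `now`, or std::nullopt if the item was never scanned.
    std::optional<long> ageInDays(const std::string& barcode, Clock::time_point now) const {
        auto it = added_.find(barcode);
        if (it == added_.end()) return std::nullopt;
        auto hours = std::chrono::duration_cast<std::chrono::hours>(now - it->second);
        return static_cast<long>(hours.count() / 24);
    }

private:
    std::map<std::string, Clock::time_point> added_;
};
```

The bot's "how old is the milk?" command would then just look up the milk's bar code and speak the result.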

Then again, chatbot development is certainly the most interesting part (at least for me). As far as point 7 on your list goes, I recommend WordNet. I’m not familiar with C++ implementations (I use NLTK for Python), but a quick Google search led me here: http://www.koders.com/cpp/fid616AED2292EB48FC9512B10039A262B5A2400480.aspx

 

 
  [ # 7 ]

@Jan Thanks! I’ll read through both. :)

@CR Another master giving me advice, I’m humbled. :) (Your physics conversations in a recent thread made me dizzy.)
I’m actually starting with kind of what you mentioned, only even simpler: I have the Microsoft Speech API working, so I’ve started doing simple macros for things on the computer, such as controlling Pandora radio. (I say “I like this song”, the AI replies “I like this song too” and gives it a thumbs-up rating, etc. This is already working.)

I’m going to hold off on making my appliances smart—for now. Since I live in an apartment right now, I’ll have to wait a bit before I do too much in that area. I may automate a George Foreman soon though. If I can build a simple wifi switch that the grill plugs into, I can easily control the power to the device, getting it preheated before I’m ready to make a chicken sandwich or something. Seems to be one of the easiest places to start for now.

Thanks for the WordNet advice, I’ve seen that mentioned in a lot of places. I hadn’t found a link as good as that one though! I’m going to read through it in just a bit here, and put a plan in place to put it in action. Thanks!

 

 
  [ # 8 ]
Garrett Griffin - Jan 8, 2011:

I’m actually starting with kind of what you mentioned, only even simpler: I have the Microsoft Speech API working, so I’ve started doing simple macros for things on the computer, such as controlling Pandora radio. (I say “I like this song”, the AI replies “I like this song too” and gives it a thumbs-up rating, etc. This is already working.)

That’s great! It doesn’t really matter if the appliances are physical or not (heck, you could even have it control some simple “virtual home”). The key thing is the transition from specific commands (“I like this song”) to implied commands that the bot can interpret from casual conversation. (“I’m in the mood to relax tonight.” Bot: “Shall I put on some Bob Marley?”) But you can’t work on the latter without having some form of the former in place.
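
To make the two levels concrete, here is a small C++17 sketch that checks exact commands first and only then falls back to a keyword hint. Everything named here is hypothetical (`Action`, `interpret`, and the command identifiers `pandora.thumbs_up` and `music.suggest`), and a single keyword lookup is of course a long way from real interpretation of casual conversation.

```cpp
#include <algorithm>
#include <cctype>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

struct Action {
    std::string reply;    // what the bot says
    std::string command;  // hypothetical identifier for what the bot does
};

std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Exact commands take priority; keyword hints are only a fallback.
std::optional<Action> interpret(const std::string& utterance) {
    static const std::unordered_map<std::string, Action> exact = {
        {"i like this song", {"I like this song too", "pandora.thumbs_up"}},
    };
    static const std::vector<std::pair<std::string, Action>> hints = {
        {"relax", {"Shall I put on some Bob Marley?", "music.suggest"}},
    };

    const std::string text = toLower(utterance);
    auto it = exact.find(text);
    if (it != exact.end()) return it->second;

    for (const auto& hint : hints) {
        if (text.find(hint.first) != std::string::npos) return hint.second;
    }
    return std::nullopt;  // nothing recognized
}
```

Because the hint stage only suggests an action and asks for confirmation, a wrong guess costs one extra turn of conversation rather than an unwanted action.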

Garrett Griffin - Jan 8, 2011:

I’m going to hold off on making my appliances smart—for now. Since I live in an apartment right now, I’ll have to wait a bit before I do too much in that area. I may automate a George Foreman soon though. If I can build a simple wifi switch that the grill plugs into, I can easily control the power to the device, getting it preheated before I’m ready to make a chicken sandwich or something. Seems to be one of the easiest places to start for now.

Hacking appliances would be a fun hobby all in itself. Fortunately, bluetooth and wifi devices have become cheap enough that this is really achievable. Definitely share on the forum if you set something like this up!

Garrett Griffin - Jan 8, 2011:

Thanks for the WordNet advice, I’ve seen that mentioned in a lot of places. I hadn’t found a link as good as that one though! I’m going to read through it in just a bit here, and put a plan in place to put it in action. Thanks!

I cannot sing WordNet’s praises enough. I just got done re-writing parts of my parser to make them compatible with NLTK 2.0 and the WordNet functionality has improved greatly. Definitely worth the time to take a look. I don’t want to think about how much time I wasted, when I first began, reinventing the wheel with word identification, stemming, and conjugation.

 

 
  [ # 9 ]

Garrett,

If you want a simple place to start you might want to try C# and the AIML AAA set.

Based on your skills you may be interested in “A C# AIML Chatterbot – Artificial Intelligence In 500 Lines Of Code”.

It does not include all the functionality of some AIML interpreters but the code seems pretty straightforward.

The ALICE AAA set can be found here.

 

 
  [ # 10 ]

@Merlin Thanks Merlin, I’ll check it out. I don’t have much experience with C#, but it’s still worth a look.


@CR You’re getting me way too excited about WordNet. :) I’m convinced I’ll be using it, and look forward to putting it in place. I have to take deep breaths now and remember that implementing that into my AI is still far in the future. (darn!)


Chatbot Updates:
I’m going to start experimenting with the Rebecca AIML engine, as I believe it’ll be a great launching point for me.


Chatbot Idea for a Pro:
This “Spokeo.com” message thing going around on Facebook just gave me an idea. While the thing is far from being 100% accurate, it would be a great tool for my Project River AI to use to learn about people. When someone new comes over, the AI could do facial recognition, ask their name, and ask where they live. Then she’d be able to find them on Spokeo, and quickly be able to ask them about their family members and hobbies.

 

 
  [ # 11 ]

Huge thanks to Marcus for the tip on Rebecca AIML. I’m successfully using the engine in my code now, and it’s providing a great start for me.

I’m moving further updates to a thread in “Designing & coding”—http://www.chatbots.org/ai_zone/viewthread/329/.

 

 