Natural language decomposition
 
 

Hi,

I’m writing an IRC chatbot that currently replies only when spoken to. Regardless of the message it receives, it responds with random Markov chain nonsense (for now these chains are based on letters, not words). So the easy work is done.

What I would like to add is some relevance in the response to the message that the bot received. There are a lot of papers on the web about parse trees, but I understand none of them in a way that lets me derive a concrete algorithm.

My design goal is to make this unsupervised. So I don’t want to add a hard-coded response to “What is your name?”, for example. All learning should be derived from whatever books Project Gutenberg has to offer.

Possibly I could extract one or more words from the message sent to the bot and base the response on those, but I don’t know how to determine the relevance (the subject, for example) of the input message.

Are there any cut-and-dried resources that you know of on how to go about this? Or other strategies to achieve the same goal?

Thanks

 

 
  [ # 1 ]

The closest thing to what you are describing is the writing of “Mark V Shaney”, a fake Usenet user whose postings were generated using Markov chain techniques.

Background and source links: http://en.wikipedia.org/wiki/Mark_V_Shaney
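
For reference, a word-level chain in that spirit can be sketched in a few lines of Python. This is only a minimal illustration, not the original Mark V Shaney program: the names `build_model` and `generate`, the `seed_word` parameter and the toy corpus are all assumptions made for the example. The seed is one possible way to nudge the output toward a word taken from the incoming message.

```python
import random
from collections import defaultdict

def build_model(words, order=2):
    """Map each tuple of `order` consecutive words to the words seen after it."""
    model = defaultdict(list)
    for i in range(len(words) - order):
        model[tuple(words[i:i + order])].append(words[i + order])
    return model

def generate(model, seed_word=None, max_words=30, rng=None):
    """Walk the chain, starting from a state that contains seed_word if one exists."""
    rng = rng or random.Random()
    states = sorted(model)
    if not states:
        return ""
    # Prefer start states that mention the seed; otherwise fall back to any state.
    seeded = [s for s in states if seed_word in s]
    state = rng.choice(seeded or states)
    out = list(state)
    while len(out) < max_words:
        followers = model.get(state)
        if not followers:  # dead end: this state only occurs at the end of the corpus
            break
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out)

# Toy corpus (hypothetical); in practice this would be a Project Gutenberg text.
corpus = "the cat sat on the mat and the dog sat on the rug".split()
model = build_model(corpus)
print(generate(model, seed_word="dog", rng=random.Random(0)))
```

With a real book as the corpus, an `order` of 2 tends to give output that is locally grammatical but still nonsense overall; the seed only controls where the walk begins, not what the reply is about after the first few words.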


Other resources that may be helpful:
The Natural Language Toolkit, http://nltk.org/
Natural Language Processing with Python provides a practical introduction to programming for language processing, https://sites.google.com/site/naturallanguagetoolkit/Home
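
On the relevance question, one simple unsupervised heuristic (before reaching for a full parser) is to treat the rarest word of the message that still occurs in the training text as its likely subject. The sketch below uses plain Python rather than NLTK; `tokenize`, `pick_keyword` and the sample corpus are hypothetical names and data made up for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())

def pick_keyword(message, corpus_counts):
    """Return the message word that is rarest in the corpus but still known to it."""
    known = [w for w in tokenize(message) if corpus_counts[w] > 0]
    if not known:
        return None  # nothing in the message was ever seen in the corpus
    return min(known, key=lambda w: corpus_counts[w])

# Toy corpus (hypothetical); a real bot would count words over whole books.
corpus_text = ("the whale is a big fish . the sea is the home of the whale . "
               "the ship is on the sea")
counts = Counter(tokenize(corpus_text))
print(pick_keyword("What is the ship?", counts))  # prints: ship
```

Frequent function words such as “the” and “is” lose out automatically because of their high counts, so no hand-written stop list is needed. The chosen word can then be used to steer whatever generator produces the reply.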

 

 
  [ # 2 ]

> http://en.wikipedia.org/wiki/Decomposition_(computer_science)

“a complex problem or system is broken down into parts that are easier”

In the “other strategies” category, I’ve been experimenting with a system that dispenses with both “database” and “interpreter” altogether.  There are now many web APIs that can do named-entity recognition (NER), some better and some worse. 

I’ve got a working prototype that extracts questions directly from the Twitter API, extracts named entities from them, and then searches the Twitter API for likely candidate replies.  The replies appear to come from a “virtual agent”, but they link back to the original tweet, in effect connecting the “questioner” with the “answerer”.
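
The shape of that pipeline can be sketched roughly as follows. This is not the prototype's code: a crude capitalised-word filter stands in for a real NER web API, an in-memory list stands in for Twitter search results, and `extract_entities` and `best_reply` are hypothetical names.

```python
import re

def extract_entities(text):
    """Crude stand-in for an NER service: capitalised words after the first word."""
    words = re.findall(r"[A-Za-z]+", text)
    # Skip the first word (capitalised anyway) and one-letter words such as "I".
    return {w.lower() for w in words[1:] if w[0].isupper() and len(w) > 1}

def best_reply(question, candidates):
    """Return the candidate sharing the most entities with the question, or None."""
    wanted = extract_entities(question)
    best, best_score = None, 0
    for cand in candidates:
        cand_words = set(re.findall(r"[a-z]+", cand.lower()))
        score = len(wanted & cand_words)
        if score > best_score:
            best, best_score = cand, score
    return best

# Stand-in for search results; a real system would query an API here.
candidates = [
    "I had pasta for lunch",
    "The python library for twitter is easy to use",
    "I love python",
]
print(best_reply("Does anyone know if Python works with Twitter?", candidates))
```

A real NER service would also catch multi-word names and lowercase entities, which this filter misses entirely, so treat it only as a placeholder for the matching step.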

This technique is vaguely reminiscent of Omegle or the infamous AIM bot TheGreatHatsby, as well as the so-called Fish Bots, all of which pair up users in various ways.

> http://en.wikipedia.org/wiki/Omegle

> http://web.archive.org/web/20081220145156/http://en.wikipedia.org/wiki/TheGreatHatsby

 

 
  [ # 3 ]

Thank you both for these links. The nltk documentation in particular looks like a valuable resource for what I want to achieve.

 

 
  [ # 4 ]

If nltk looks promising, you may also want to take a look at the Nodebox|Linguistics tools available here: http://nodebox.net/code/index.php/Linguistics

 

 