
Natural Language Generation
 
 

I recently had a user who really engaged with Skynet-AI. She would sign on every day and have multi-hour conversations. Over a number of weeks she hit the boundaries of the conversational patterns (probably somewhere in the neighborhood of 40-50 hours of chat). Her willing suspension of disbelief, and her willingness to teach the bot as if it were a small child, made for some of the most in-depth conversations in Skynet's three-year history.

To keep conversations fresh, and to rapidly create new bots with personalities of their own, I need to expand the Natural Language Generation (NLG) tools in JAIL (JavaScript Artificial Intelligence Language). After reading Dave's post regarding Morgaine, I realized others might be interested in discussing strategies for NLG. Chime in if this topic interests you.

http://en.wikipedia.org/wiki/Natural_language_generation
http://blip.tv/pycon-australia/using-python-for-natural-language-generation-and-analysis-3859677

 

 
  [ # 1 ]

Good to hear Skynet-AI is turning into such a hit!

 

 
  [ # 2 ]

Thanks Victor,
It is apparent that if I want to put multiple bots on the net, having them generate their own natural-language responses will eliminate a lot of overhead. Skynet-AI currently includes multiple NLG tools.

Random Responses - On average, each response has at least four branches. This is probably the most important part of a bot: my statistics indicate that if a bot repeats a response within 10 volleys, there is a much higher chance the user will sign off. The ability to use in-line random fields has a multiplier effect and keeps the bot fresh. For example:

I am glad you [like|are enjoying] our [time together|discussion|interaction]

can generate six different responses (two options × three options).
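
As a sketch of how such in-line random fields might be expanded (the bracket syntax follows the example above; the function name and regex are illustrative, not JAIL's actual implementation):

function expandTemplate(template) {
  // Replace every [a|b|c] field with one option chosen at random.
  return template.replace(/\[([^\]]+)\]/g, function (match, options) {
    var choices = options.split('|');
    return choices[Math.floor(Math.random() * choices.length)];
  });
}

// Example: 2 choices x 3 choices = 6 possible outputs.
expandTemplate('I am glad you [like|are enjoying] our [time together|discussion|interaction]');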

Full NLG - In the math module, the response is generated on the fly. In the small-talk module it is also composed on the fly, within a set of restricted topics (I hope to beef this up with additional context awareness). The memory module adds learned items and responses on its own.
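
For the math case, "generated on the fly" might look roughly like this (a sketch only; the wording pool and function are invented, not Skynet-AI's actual code):

function mathReply(a, op, b) {
  // Compute the answer, then wrap it in a phrasing chosen at run time.
  var result = { '+': a + b, '-': a - b, '*': a * b, '/': a / b }[op];
  var wordings = [
    'That would be ' + result + '.',
    a + ' ' + op + ' ' + b + ' is ' + result + '.',
    'Easy: ' + result + '.'
  ];
  return wordings[Math.floor(Math.random() * wordings.length)];
}

mathReply(6, '*', 7); // e.g. "Easy: 42."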

I am looking to add more flexible NLG methods as I move forward.
 

 

 
  [ # 3 ]

> http://www.meta-guide.com/home/ai-engine/nlg-natural-language-generation

Merlin, you can see my Meta Guide webpage on NLG at the above link; note the links to related Meta Guide pages at the bottom.

> http://www.quora.com/Natural-Language-Generation

There is also a link to the NLG topic on Quora; note the relation to the current practice of "autoblogging", which in turn is related to the new Weavrs.com "infomorphs" (see the link below).

> http://en.wikipedia.org/wiki/Infomorph

I would say the most common implementation of “NLG” is probably the plethora of Markovian scramblers.

 

 
  [ # 4 ]

Thanks Marcus,
One of my favorite articles about NLG in action is:
http://www.npr.org/2011/04/17/135471975/robot-journalist-out-writes-human-sports-reporter

I haven't been impressed with any of the Markovian generators; they tend to be too random and fail to hold context. The most famous of these is Mark V. Shaney.

http://en.wikipedia.org/wiki/Mark_V_Shaney
http://en.wikipedia.org/wiki/Markov_chain

http://en.wikipedia.org/wiki/On_the_Internet,_nobody_knows_you%27re_a_dog
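
For concreteness, the kind of word-level Markov generator under discussion can be sketched in a few lines (illustrative code, not any particular scrambler's); with only a single word of state, it is easy to see why context evaporates:

function buildChain(text) {
  // Map each word to the list of words that followed it in the text.
  var words = text.split(/\s+/);
  var chain = {};
  for (var i = 0; i < words.length - 1; i++) {
    (chain[words[i]] = chain[words[i]] || []).push(words[i + 1]);
  }
  return chain;
}

function generate(chain, start, maxWords) {
  // Each step depends only on the previous word, so topic and
  // grammar drift quickly - the "fails to hold context" problem.
  var word = start, out = [word];
  for (var i = 0; i < maxWords && chain[word]; i++) {
    var followers = chain[word];
    word = followers[Math.floor(Math.random() * followers.length)];
    out.push(word);
  }
  return out.join(' ');
}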

 

 
  [ # 5 ]

Merlin, there’s a slight problem with that last link, due to a small bug in the forum software here. I tried to “fix” it, but even the little “cheats” that I’ve used in the past don’t help. I’ve got another possible way to correct the issue, though. I just need to get a little creative. smile

[edit] All fixed. And a bug report has been submitted. cheese [/edit]

 

 
  [ # 6 ]

> http://betabeat.com/2012/07/mark-v-shaney-horse-ebooks-markov-chain-twitter-07022012/

Here's a more recent post about Mark V. Shaney (above link).

> http://www.meta-guide.com/home/ai-engine/100-best-autoblogging-videos

100 Best Autoblogging Videos

> http://www.meta-guide.com/home/ai-engine/best-wordpress-autoblog-videos

Best Wordpress Autoblog Videos

> http://www.meta-guide.com/home/ai-engine/best-wordpress-tweetoldpost-videos

Best WordPress Tweet Old Post Videos

> http://www.quora.com/Autoblog

Most autoblogging is happening in the WordPress world... and @Weavrs is apparently based on a "heavily mutated" WordPress, called UberStream...

 

 
  [ # 7 ]

I've long wanted to experiment with Markov processes that operate over linguistic structures such as NP, V, VP, Subject, Verb, and Object, instead of simply using single-word tokens. Tools such as the Stanford lexparser and/or Link Grammar provide one way of recognizing such linguistic chunks.
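
A rough sketch of that idea (the transition table and phrase pools are toy data, invented for illustration; in practice they would be harvested from parser output):

var transitions = {
  // The Markov state ranges over chunk types instead of words.
  START: ['NP'],
  NP: ['VP'],
  VP: ['NP', 'END']
};

var phrases = {
  // Each chunk type is realized from a pool of recognized phrases.
  NP: ['the robot', 'a curious user', 'the conversation'],
  VP: ['remembers', 'surprises', 'outlasts']
};

function generateSentence() {
  var state = 'START', out = [];
  for (;;) {
    var next = transitions[state];
    state = next[Math.floor(Math.random() * next.length)];
    if (state === 'END') break;
    var pool = phrases[state];
    out.push(pool[Math.floor(Math.random() * pool.length)]);
  }
  return out.join(' ') + '.';
}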

(I wish that, instead of testing us on implementing the CKY algorithm, nlp-class had provided simple command-line parsing programs that I could use to test the idea of using linguistic entities in place of words in Markov chains… I guess the source is there, but it's too bad the honor code prevents students from sharing their CKY implementations.)

 

 
  [ # 8 ]

Robert,
We can share CYK code on the solutions forum in the NLP class discussion group. I have also been thinking about how to use modified Markov chains to produce better results.

 

 
  [ # 9 ]

Why are pirate technologies like "autoblogging" interesting, and how do they relate to natural language generation? Autoblogging is closely related to the "robojournalism" of companies such as Narrative Science (@narrativesci) and Automated Insights (@AInsights), mentioned in Merlin's NPR link above, that generate prose out of raw data. So what is the difference between blackhat autoblogging and whitehat robojournalism? I suggest the difference is basically Markovian: a matter of the level of gibberish and its relevance.

Question answering (QA) systems go two ways. How do you generate blogs automatically, and how do you generate news articles automatically? Just look at the converse: how we generate answers from questions "asked" of blogs and news. Not only can we generate answers to questions by "searching" online; we should also be able to do the reverse with similar technology, constructing textual material such as blogs or news via dialoguing, simply by framing the questions right…. ;^)

 

 
  [ # 10 ]

Black-hat autoblogging is like spam for search engines, just reposting existing content with little value added. The ability to take raw data (especially from the original source) and create quality content should make data retrieval easier and faster. It has applications for content providers and for enhanced customer service.

In a bot, you have a set of data/variables stored. Some trigger is tripped during the conversation, and the question is how to select and package that information in a human-like conversational format. Assuming you have perfect knowledge of the user's input, what techniques might one use to create a response on the fly? The goal is to write as few canned responses or template fragments as possible and yet still enable broad, dynamic conversation.
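
One answer is to keep the stored data separate from a small pool of syntactic frames and fill the slots at trigger time, so a handful of frames multiplied by many data values yields broad variety. A minimal sketch (the memory fields and frames here are hypothetical):

var memory = { userName: 'Alice', topic: 'space travel', visits: 12 };

var frames = [
  'Well {userName}, we have talked about {topic} before.',
  '{topic} again? You do keep me busy, {userName}.',
  'After {visits} visits, I know you enjoy {topic}.'
];

function respond(data) {
  // Pick a frame at random and fill each {slot} from stored data.
  var frame = frames[Math.floor(Math.random() * frames.length)];
  return frame.replace(/\{(\w+)\}/g, function (m, slot) {
    return data[slot] !== undefined ? data[slot] : m;
  });
}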

 

 
  [ # 11 ]

Let's get unstuck from the Turing test model for a moment and look at the mirror image of this story…. Of course we create bots to mimic people. But what if we created a bot to write essays for us, simply based on our oral outline? Instead of getting the bot to summarize the web for us, the bot would expound textually on our oral summary…. I think my point is that if we can use the web to generate human-like responses, in other words answers, why can't we do it the other way round and use our human responsiveness to generate web-like results? This would be like a Turing test based on human products, for instance essays, created from dialog, rather than based on the dialog itself….

 

 
  [ # 12 ]

Skynet-AI includes two features that I believe come close to what you are talking about. Try asking it to "write me a story" or "Who are you?"

The AI will compose a unique digital document based on an outline.
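
In outline terms, the mechanism might be sketched like this (the outline content is invented; Skynet-AI's actual story generator is not public): each outline node owns several alternative realizations, and one pass over the outline yields a unique document.

function pick(options) {
  return options[Math.floor(Math.random() * options.length)];
}

// Each outline section offers alternative realizations; one pass
// over the outline composes a document that differs on every run.
var outline = [
  ['Once upon a time,', 'Long ago,'],
  ['a bot met its limits.', 'a user asked too much.'],
  ['They learned together.', 'The log files remember.']
];

function composeDocument(outline) {
  return outline.map(pick).join(' ');
}

composeDocument(outline); // e.g. "Long ago, a bot met its limits. They learned together."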

 

 