Senior member
Total posts: 667
Joined: Dec 16, 2010
I recently had a user who really engaged with Skynet-AI. She would sign on every day and have a multi-hour conversation. Over a number of weeks (probably 40 to 50 hours of conversation in total) she hit the boundaries of the conversational patterns. Her willing suspension of disbelief and her willingness to try to teach the bot as if it were a small child made for some of the most in-depth conversations in Skynet’s 3-year history.
To be able to keep conversations fresh, or to rapidly create new bots that have their own personality, I need to expand JAIL’s (JavaScript Artificial Intelligence Language) Natural Language Generation (NLG) tools. After reading Dave’s post regarding Morgaine, I realized others might be interested in discussing strategies for NLG. Chime in if this topic is of interest to you.
http://en.wikipedia.org/wiki/Natural_language_generation
http://blip.tv/pycon-australia/using-python-for-natural-language-generation-and-analysis-3859677

Posted: Jul 2, 2012
[ # 1 ]
Guru
Total posts: 1004
Joined: Oct 21, 2009
Good to hear Skynet-AI is turning into such a hit!

Posted: Jul 3, 2012
[ # 2 ]
Senior member
Total posts: 667
Joined: Dec 16, 2010
Thanks Victor,
It is apparent that if I want to put multiple bots on the net, having them able to generate their own natural language responses eliminates a lot of overhead. Skynet-AI currently includes multiple NLG tools.
Random Responses - On average each response has a minimum of 4 branches. This is probably the most important part of a bot. My statistics indicate that if a bot repeats a response within 10 volleys, there is a much higher chance the user will sign off. In-line random fields add a multiplier effect, because every field multiplies the number of possible responses, and that keeps the bot fresh. For example:
I am glad you [like|are enjoying] our [time together|discussion|interaction].
This can generate 6 different responses (2 choices in the first field times 3 in the second).
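For anyone who wants to play with the idea, here is a minimal sketch of in-line random fields in Python. It is not JAIL's actual implementation; the names `expand_all`, `pick_fresh` and `recent` are made up for illustration, and the 10-response window simply mirrors the 10-volley figure above.

```python
import itertools
import random
import re
from collections import deque

# A field is anything between square brackets, with options separated by "|".
FIELD = re.compile(r"\[([^\[\]]*)\]")

TEMPLATE = "I am glad you [like|are enjoying] our [time together|discussion|interaction]."


def expand_all(template):
    """Return every response the template can produce."""
    # With one capture group, re.split alternates literal text and field contents.
    parts = FIELD.split(template)
    choices = [part.split("|") if i % 2 else [part] for i, part in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]


def pick_fresh(template, recent, rng=random):
    """Pick a random expansion, avoiding anything still in the recent window."""
    options = expand_all(template)
    # Fall back to the full list if every option was used recently.
    fresh = [o for o in options if o not in recent] or options
    choice = rng.choice(fresh)
    recent.append(choice)
    return choice


# Remember the last 10 responses so none of them is repeated too soon.
recent = deque(maxlen=10)
```

Calling `len(expand_all(TEMPLATE))` gives 6 for the example above, and repeated calls to `pick_fresh(TEMPLATE, recent)` will cycle through all six variants before any of them comes back.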
Full NLG - In the math module the response is generated on the fly. In the smalltalk module it is also composed on the fly within a set of restricted topics (I hope to beef this up with additional context awareness). The memory module adds learned items and responses on its own.
I am looking to add more flexible NLG methods as I move forward.

Posted: Jul 3, 2012
[ # 4 ]
Senior member
Total posts: 667
Joined: Dec 16, 2010

Posted: Jul 3, 2012
[ # 5 ]
Administrator
Total posts: 1890
Joined: Jun 13, 2010
Merlin, there’s a slight problem with that last link, due to a small bug in the forum software here. I tried to “fix” it, but even the little “cheats” that I’ve used in the past don’t help. I’ve got another possible way to correct the issue, though. I just need to get a little creative. 
[edit] All fixed. And a bug report has been submitted. [/edit]

Posted: Jul 4, 2012
[ # 6 ]
Senior member
Total posts: 294
Joined: Oct 3, 2008

Posted: Jul 4, 2012
[ # 7 ]
Senior member
Total posts: 141
Joined: Oct 30, 2010
I’ve long wanted to experiment with Markov processes that include linguistic structures such as NP, V, VP, Subject, Verb, Object, instead of simply using single word tokens. Tools such as the lexparser and/or link grammar provide one way of recognizing such linguistic chunks.
(I wish instead of testing us on implementing the CKY algorithm, nlp-class had provided us with simple command-line programs for parsing that I could use to test the idea of using linguistic entities in place of words in Markov chains…I guess the source is there, but too bad the honor code prevents students from sharing their CKY implementations.)
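To make the idea concrete, here is a rough Python sketch of a first-order Markov chain whose states are chunk labels (NP, V) rather than words. Everything in it is an assumption for illustration: the toy corpus is chunked by hand instead of coming from the lexparser or link grammar, and `train` and `generate` are hypothetical names.

```python
import random
from collections import defaultdict

# Hand-chunked toy corpus: (text, label) pairs. A real system would get
# these chunks from a parser; here they are typed in by hand.
CHUNKED = [
    [("the bot", "NP"), ("answers", "V"), ("the user", "NP")],
    [("a user", "NP"), ("teaches", "V"), ("the bot", "NP")],
    [("the user", "NP"), ("laughs", "V")],
]
START, END = "<s>", "</s>"


def train(sentences):
    """Collect label-to-label transitions and the phrases seen for each label."""
    transitions = defaultdict(list)  # label -> observed next labels
    lexicon = defaultdict(list)      # label -> phrases seen with that label
    for sentence in sentences:
        labels = [START] + [label for _, label in sentence] + [END]
        for current, following in zip(labels, labels[1:]):
            transitions[current].append(following)
        for text, label in sentence:
            lexicon[label].append(text)
    return transitions, lexicon


def generate(transitions, lexicon, rng, max_len=8):
    """Walk the chain over labels, then fill each label with a seen phrase."""
    labels = []
    label = rng.choice(transitions[START])
    while label != END and len(labels) < max_len:
        labels.append(label)
        label = rng.choice(transitions[label])
    phrases = [rng.choice(lexicon[lab]) for lab in labels]
    return labels, phrases


transitions, lexicon = train(CHUNKED)
labels, phrases = generate(transitions, lexicon, random.Random(0))
sentence = " ".join(phrases)
```

Because the chain only looks one label back, it will happily give an intransitive verb an object ("the user laughs the bot"), so a richer label set or a longer history would be the obvious next experiment.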

Posted: Jul 4, 2012
[ # 8 ]
Senior member
Total posts: 667
Joined: Dec 16, 2010
Robert,
We can share CYK code on the solutions forum in the NLP class discussion group. I have also been thinking about how to use modified Markov chains to produce better results.

Posted: Jul 4, 2012
[ # 9 ]
Senior member
Total posts: 294
Joined: Oct 3, 2008
Why are pirate technologies like “autoblogging” interesting, and how do they relate to natural language generation?? Autoblogging is closely related to the “robojournalism” of companies such as Narrative Science (@narrativesci) and Automated Insights (@AInsights), mentioned in Merlin’s NPR link above, that are generating prose out of raw data. So what is the difference between blackhat autoblogging and whitehat robojournalism? I suggest that the difference is basically Markovian, related to the level of gibberish and its relevancy.
Questions and answers, QA systems, go two ways. How do you generate blogs automatically, and how do you generate news articles automatically?? Just look at the converse, how we generate answers from questions “asked” of blogs and news…. Not only can we generate answers to questions by “searching” online; but, we should also be able to do the reverse with similar technology, construct textual material, such as blog or news, via dialoguing simply by framing the questions right…. ;^)

Posted: Jul 4, 2012
[ # 10 ]
Senior member
Total posts: 667
Joined: Dec 16, 2010
Black hat autoblogging is like SPAM for search engines, just reposting original content with little value added. The ability to take raw data (especially from the original source) and create quality content should make data retrieval easier and faster. It has applications for content providers and enhanced customer service.
In a bot, you have a set of data/variables stored. Some trigger is tripped during the conversation, and the question is: how do you select and package the information in a human-like conversational format? Assuming you have perfect knowledge of the user’s input, what techniques might one use to create a response on the fly? The goal is to write as few canned responses or template fragments as possible and yet still enable broad and dynamic conversation.
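As one possible starting point for discussion, here is a small Python sketch of the select-then-package step: pick the stored facts that match a trigger, turn each into a clause, and let code handle the joining. It is only an illustration, not how Skynet-AI does it. The slot names, the `CLAUSES` table and the `respond` function are all hypothetical, and it still needs one short fragment per slot.

```python
# One short clause per slot; how the clauses are combined is left to code.
CLAUSES = {
    "name": lambda value: f"your name is {value}",
    "pet": lambda value: f"you have a {value}",
    "city": lambda value: f"you live in {value}",
}


def join_clauses(clauses):
    """Aggregate clauses into one list-style phrase with commas and 'and'."""
    if len(clauses) <= 1:
        return "".join(clauses)
    if len(clauses) == 2:
        return " and ".join(clauses)
    return ", ".join(clauses[:-1]) + ", and " + clauses[-1]


def respond(memory, wanted):
    """Select the wanted slots that are actually stored and package them."""
    known = [CLAUSES[slot](memory[slot])
             for slot in wanted if slot in memory and slot in CLAUSES]
    if not known:
        return "I do not know much about you yet."
    return "I remember that " + join_clauses(known) + "."


# Example: the trigger asks for three slots, but only two are stored.
memory = {"name": "Ann", "pet": "cat"}
reply = respond(memory, ["name", "pet", "city"])
```

With the memory above, `reply` is "I remember that your name is Ann and you have a cat." The same three clauses cover every combination of known slots, which is where the savings over fully canned responses would come from.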

Posted: Jul 4, 2012
[ # 11 ]
Senior member
Total posts: 294
Joined: Oct 3, 2008
Let’s get unstuck from the Turing test model for a moment, and look at the mirror image of this story…. Of course we create bots to mimic people. But what if we created a bot to write essays for us, simply based on our oral outline? So instead of getting the bot to summarize the web for us, the bot would expound textually on our oral summary…. I think my point is that if we can use the web to generate human-like responses, in other words answers, why can’t we do it the other way round and just use our human responsiveness to generate web-like results? This would be like a Turing test based on human products, for instance essays, created from dialog, rather than based on the dialog itself….

Posted: Jul 4, 2012
[ # 12 ]
Senior member
Total posts: 667
Joined: Dec 16, 2010
Skynet-AI includes 2 features that I believe come close to what you are talking about. Try asking it to “write me a story” or “Who are you?”
The AI will compose a unique digital document based on an outline.