

A Neural Conversational Model - Google research

They used the sequence-to-sequence framework to produce conversations.


  [ # 1 ]

> Abstract

Conversational modeling is an important task in natural language understanding and machine intelligence. Although previous approaches exist, they are often restricted to specific domains (e.g., booking an airline ticket) and require handcrafted rules. In this paper, we present a simple approach for this task which uses the recently proposed sequence to sequence framework. Our model converses by predicting the next sentence given the previous sentence or sentences in a conversation. The strength of our model is that it can be trained end-to-end and thus requires much fewer hand-crafted rules. We find that this straightforward model can generate simple conversations given a large conversational training dataset. Our preliminary results suggest that, despite optimizing the wrong objective function, the model is able to extract knowledge from both a domain specific dataset, and from a large, noisy, and general domain dataset of movie subtitles. On a domain-specific IT helpdesk dataset, the model can find a solution to a technical problem via conversations. On a noisy open-domain movie transcript dataset, the model can perform simple forms of common sense reasoning. As expected, we also find that the lack of consistency is a common failure mode of our model.

> seq2seq framework



  [ # 2 ]

Some of the conversations seem pretty good…

Here is a paper about the sequence to sequence framework.


  [ # 3 ]

> Abstract

Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector. Our main result is that on an English to French translation task from the WMT’14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.8 on the entire test set, where the LSTM’s BLEU score was penalized on out-of-vocabulary words. Additionally, the LSTM did not have difficulty on long sentences. For comparison, a phrase-based SMT system achieves a BLEU score of 33.3 on the same dataset. When we used the LSTM to rerank the 1000 hypotheses produced by the aforementioned SMT system, its BLEU score increases to 36.5, which is close to the previous best result on this task. The LSTM also learned sensible phrase and sentence representations that are sensitive to word order and are relatively invariant to the active and the passive voice. Finally, we found that reversing the order of the words in all source sentences (but not target sentences) improved the LSTM’s performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.
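
The source-reversal trick mentioned at the end of the abstract is easy to picture as a preprocessing step. Below is a minimal sketch, assuming whitespace-tokenized sentences; the function name `reverse_source` and the sample pair are made up for illustration and are not taken from the paper's code.

```python
def reverse_source(pairs):
    """Reverse the word order of each source sentence; targets stay as they are.

    `pairs` is a list of (source, target) strings. Splitting on whitespace is
    an assumption made for this sketch, not the paper's tokenizer.
    """
    return [(" ".join(reversed(src.split())), tgt) for src, tgt in pairs]


# Hypothetical English-to-French training pair.
training_pairs = [("the cat sat", "le chat s'est assis")]
print(reverse_source(training_pairs))
```

After reversal, the first source words end up next to the first target words, which is the kind of short-term dependency the abstract credits for the easier optimization.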



  [ # 4 ]

They claim that “the model can remember facts, understand contexts, perform common sense reasoning”. However, conversation 7 shows that the engine still has no consistent knowledge about itself.

Human: what is your job ?
Machine: i ’m a lawyer .
Human: what do you do ?
Machine: i ’m a doctor


  [ # 5 ]

Looks pretty cool, but a huge drawback is how to score the model’s performance. “Manual inspection” is obviously the least efficient approach, and impractical at best:

Google - Jun 23, 2015:

An outstanding research problem is on how to objectively
measure the quality of models. We did so by manual inspection
and by computing the perplexity
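
For anyone unfamiliar with the perplexity mentioned in that quote: it is the exponential of the average negative log-probability the model assigns to each token of a held-out reply. A minimal sketch follows, assuming you already have the per-token probabilities; the function name and the numbers are made up for illustration.

```python
import math


def perplexity(token_probs):
    """Perplexity = exp(mean negative log-probability per token).

    `token_probs` holds the probability the model gave to each correct
    next token; every value must be in (0, 1].
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


# Hypothetical probabilities for a four-token reply.
print(perplexity([0.25, 0.25, 0.25, 0.25]))  # roughly 4: as uncertain as a 4-way guess
print(perplexity([0.9, 0.8, 0.95, 0.7]))     # closer to 1: a more confident model
```

Lower is better, but a low perplexity only says the model predicts the reference text well. It does not say the conversation is any good, which is the problem the quote points at.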



  [ # 6 ]

I’m sorry to say I don’t have the time to read papers. Would I be very incorrect if I thought this sounded similar to N-grams but with sentences instead of words? Like Cleverbot’s remembering question-answer pairs without knowing what the words in the text represent?


  [ # 7 ]

Hi, interesting articles!

I read all of the papers thoroughly and think I have more or less understood them, and I found interesting clues inside. The one big problem is that practical use of this is out of reach for us simple mortals (with lean pockets).

The underlying algorithms are too complex and not fully disclosed, the optimizations that make everything run flawlessly are never disclosed, and the computing power needed for even a simple proof of concept is out of reach for a simple bot programmer. So we should continue playing our clever games until they really get a breakthrough and wipe us out in a breeze!

I guess the training necessary for a deep neural network is beyond our reach. Some time ago I tested an SVD decomposition of a 10000x5000 matrix, used for simple Q&A with ANNs, SVMs and more, and the computation time was overwhelming, far more than several minutes. With my 8 GB of RAM and a dual Pentium G2020 (I did not use CUDA cores) it ran for many hours every time, sometimes throwing out-of-memory exceptions! It’s not practical for production (at least for me). And those methods are deterministic; ANN training is hundreds of times slower than this, because the epochs do not converge quickly at all!

Most of the open source libraries out there are built in Python and are slow by nature, many times slower than compiled C++ or even C# or Java code. So they are impractical for production, in my modest opinion.

Actually, I am successfully using many concepts from vector space modelling, based upon deep lemmatization, spell correction and inferring parasynthetic words, so that my bot platform can find relations in unstructured text, such as whether a sentence asked by a human belongs to one context or another. Using an SVM with >5000 dimensions is not enough: the precision is actually over 93% (depending upon the dataset), but the quality is not enough to get a decent recall. I ended up using a combined LSA + SVD to find the similarity and the SVM for multi-pattern classification. The combined approach is really cool and gives more than 93% F-score for a 30-category Q&A, quite enough for me!
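
To make the precision versus recall versus F-score point concrete, here is a minimal sketch of how the three relate for a single category. The counts are invented for illustration and are not the figures from my system.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical classifier: 93 correct hits, 7 false alarms, 40 missed questions.
p, r, f1 = precision_recall_f1(tp=93, fp=7, fn=40)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

With these made-up counts the precision is 93% but the F-score lands well below that, because the missed questions drag the recall down. That is why a high precision alone was not enough.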





  [ # 8 ]

Nonetheless,  one drawback of this basic model is that it only gives simple, short, sometimes unsatisfying answers to our questions as can be seen above.  Perhaps a more problematic drawback is that the model does not capture a consistent personality.
Indeed, if we ask not identical but semantically similar questions, the answers can sometimes be inconsistent.  This is expected due to the simplicity of our model and the dataset in our experiments

This technique suffers from some of the same limitations as Cleverbot and other batch unsupervised learning methods. You can’t expect a personality to be developed without some kind of botmaster intervention. You might be able to answer factoid questions, but there may be simpler methods to do the same thing.


  [ # 9 ]

Anything that (still) uses a basic question-answer model (which simply is NOT a real conversation) is largely uninteresting to me, mainly because it implies the use of rather basic models of cognition that, in my opinion, will never scale to something like AGI. Just my thoughts on the matter.


  [ # 10 ]

If anyone is interested, check out my seq2seq implementation in Python (using the Keras deep learning library).


  [ # 11 ]

@Nathan Hu… the paper states that a consistent voice is a problem.  If you had training data with a consistent voice, it would likely solve that issue.

@Fariz, that is awesome… I will check out your implementation. When do you plan to test it with a real dataset?

Also… Google has released a new deep learning library, TensorFlow, that can be used to create a seq2seq implementation and a chatbot.

Also, I recently compiled a bunch of papers related to deep learning, word vectors, thought vectors and dialog systems.


  [ # 12 ]

Andrew, any chance you could blog a practical, step by step tutorial on how to make a chatbot with sequence-to-sequence in TensorFlow?  Or, even just a rave about it on YouTube?  I could see two tracks, 1) a low level hands on, and 2) a high level conceptual overview.  At the moment, I’m not seeing either clearly.


  [ # 13 ]

I was wondering if someone was going to bring up Google’s new open-source AI, TensorFlow. I personally think any new algorithm that lets AI think more like a human is better than tossing facts at a chatbot and expecting it to make any kind of sense. Many chatbots tend to end up sounding too much like a computer when that is done. It also does not necessarily embed any kind of common sense in it either.

The art of conversation is by its very nature hard to pin down, as there are numerous variables: language, dialect, even a kind of personality. Anyone who has tried to create more than one kind of chatbot soon discovers that one size does not fit all. So I personally welcome the addition of TensorFlow.


  [ # 14 ]

@sheryl TensorFlow is not an AI, nor an algorithm! It’s a framework for developers, which we use for tensor calculus. TensorFlow ships with many examples, such as machine translation, to show off its capabilities, but what it basically does is tensor calculus. This is not great news, because there are many tensor frameworks, and some of them have been around for a long time (Theano is a good example). My point is that Google open-sourcing its tensor library does not mean anything to the chatbot community. In fact, most developers are reluctant to switch from Theano to TensorFlow (me, for instance).

Seq2seq and chatbots: Using seq2seq alone would be the most stupid way to make a chatbot. The paper does have academic importance, as it demonstrates that a direct sequence-to-sequence mapping can be learned across two domains in an end-to-end manner, but again, this is not great news for the more practical chatbot master.

Seq2seq is good at translating something from one language to another, and when you train it on conversations, you are essentially treating “questions” as one language and “answers” as another, and asking your system to “translate” between these two languages. Such a system would never make a good chatbot.
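
To show what I mean by treating it as translation, here is a minimal sketch of how a conversation gets cut into (source, target) training pairs. It assumes a flat list of alternating turns; the helper name `make_pairs` is made up, and real preprocessing (the paper also conditions on more than one previous sentence) would be more involved.

```python
def make_pairs(turns):
    """Pair every utterance with the one that follows it.

    Each pair is (what was said, what was replied): the "source language"
    sentence and its "translation" into a response.
    """
    return [(turns[i], turns[i + 1]) for i in range(len(turns) - 1)]


# The inconsistent exchange quoted earlier in this thread.
conversation = [
    "what is your job ?",
    "i 'm a lawyer .",
    "what do you do ?",
    "i 'm a doctor",
]
for source, target in make_pairs(conversation):
    print(f"{source!r} -> {target!r}")
```

Each pair is learned in isolation, so nothing ties the answer to "what do you do ?" back to the earlier answer about the job.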

What I find most interesting in seq2seq is that the hidden state from the encoder is transferred to the decoder. They must have done so because they found it more efficient than the alternatives (such as in Cho et al.). This finding, on the other hand, is practically useful knowledge for a chatbot programmer who relies on neural networks.
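
A rough sketch of that state handoff, using a plain tanh RNN in pure Python instead of the LSTMs from the paper (a deliberate simplification). The weights are random and untrained, the input vectors are invented, and all names are hypothetical; the only point is that the decoder starts from the encoder's final hidden state.

```python
import math
import random


def rnn_step(W_h, W_x, h, x):
    """One vanilla RNN step: h' = tanh(W_h @ h + W_x @ x)."""
    return [
        math.tanh(
            sum(W_h[i][j] * h[j] for j in range(len(h)))
            + sum(W_x[i][k] * x[k] for k in range(len(x)))
        )
        for i in range(len(h))
    ]


def random_params(hidden_size, input_size, rng):
    """Untrained weights, for illustration only."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {"W_h": mat(hidden_size, hidden_size), "W_x": mat(hidden_size, input_size)}


def encode(params, inputs, hidden_size):
    """Run the encoder over the whole input; return only its final hidden state."""
    h = [0.0] * hidden_size
    for x in inputs:
        h = rnn_step(params["W_h"], params["W_x"], h, x)
    return h


def decode(params, initial_state, steps, input_size):
    """Start the decoder from the encoder's final state (the handoff)."""
    h = list(initial_state)
    x = [0.0] * input_size  # placeholder input; a real decoder feeds back its last output token
    states = []
    for _ in range(steps):
        h = rnn_step(params["W_h"], params["W_x"], h, x)
        states.append(h)
    return states


rng = random.Random(0)
encoder = random_params(hidden_size=3, input_size=2, rng=rng)
decoder = random_params(hidden_size=3, input_size=2, rng=rng)

question = [[1.0, 0.0], [0.0, 1.0]]  # two made-up word vectors
context = encode(encoder, question, hidden_size=3)
print(decode(decoder, context, steps=2, input_size=2))
```

Everything the decoder knows about the question has to pass through `context`, a vector of fixed size, which is the fixed-dimensionality bottleneck described in the seq2seq abstract.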


  [ # 15 ]

@Marcus… I may post something soon… The key to training a neural net is to have good training data. I am at a loss for good training data for a dialog system. Vinyals and Le at Google used movie dialog from OpenSubtitles. They also used records from a Google internal tech support chat log… which is not publicly available. I am curious if anyone here knows a good source… What you need is conversations, with millions of words and a consistent voice.

@Fariz, I agree that a seq2seq chatbot would be limited and only good for single sentence replies.  What are your thoughts on the most promising path to building a deep learning chatbot and the limitations on seq2seq?

