
Using the Tensorflow framework to build a chatbot with deep recurrent neural networks (RNNs), specifically the Long Short-Term Memory (LSTM) type.
The Word2Vec model has become a standard method for representing words as dense vectors. This is typically done as a preprocessing step, after which the learned vectors are fed into a discriminative model (typically an RNN) to generate predictions such as movie review sentiment, perform machine translation, or even generate text character by character.
Word vectors
There are two options: simply use pre-trained vectors, since they are trained on large corpora for a large number of iterations, or train your own. However, if you have many words and acronyms that aren’t in typical pre-trained word vector lists, generating our own word vectors is critical to making sure that the words get represented properly.
To generate word vectors, we use the classic approach of a Word2Vec model. The basic idea is that the model creates word vectors by looking at the context in which words appear in sentences. Words with similar contexts will be placed close together in the vector space.
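As a concrete illustration, here is a minimal sketch of training such a model with the gensim library (assuming the gensim 4.x API; the toy sentences and parameters below are placeholders, not the actual conversation data):

```python
# Minimal Word2Vec sketch with gensim (4.x API assumed); the corpus here is a
# toy placeholder, not the real dataset.
from gensim.models import Word2Vec

sentences = [
    ['hello', 'how', 'are', 'you'],
    ['i', 'have', 'a', 'headache'],
    ['take', 'some', 'rest', 'and', 'drink', 'water'],
]

# Skip-gram training: words that appear in similar contexts end up
# close together in the learned vector space.
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1)

vector = model.wv['headache']                         # 100-dimensional vector
similar = model.wv.most_similar('headache', topn=3)   # nearest neighbours
```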
**Update: I later learned that the Tensorflow Seq2Seq function trains word embeddings from scratch, so I don’t end up using these word vectors, but it was still good practice.**
Creating a Seq2Seq Model with Tensorflow
Now that we’ve created the dataset and generated our word vectors, we can move on to coding the Seq2Seq model.
tf-seq2seq is a general-purpose encoder-decoder framework for Tensorflow that can be used for Machine Translation, Text Summarization, Conversational Modeling, Image Captioning, and more.
We'll use it for conversational modeling.
By using neural networks with many hidden layers (known as deep learning), generative chatbot models can build sentences that are completely original rather than retrieved from a list of possible responses.
Unsupervised learning
I created and trained the model in a Python script using data from medical conversations on a nurse advice hotline. The crux of the model lies in Tensorflow’s embedding_rnn_seq2seq() function.
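For orientation, here is a minimal sketch of how that function is typically wired up, assuming TensorFlow 1.x where it lives under tf.contrib.legacy_seq2seq; the vocabulary size, sequence lengths, and layer sizes are illustrative placeholders, not the values used for the actual model:

```python
# Minimal sketch (TensorFlow 1.x, tf.contrib.legacy_seq2seq); sizes are
# illustrative placeholders, not the real model's hyperparameters.
import tensorflow as tf

VOCAB_SIZE = 10000        # assumed vocabulary size
MAX_ENCODER_LENGTH = 15   # assumed maximum question length (tokens)
MAX_DECODER_LENGTH = 15   # assumed maximum response length (tokens)
LSTM_UNITS = 512          # assumed LSTM hidden state size
EMBEDDING_SIZE = 128      # assumed embedding dimension

# The legacy seq2seq API expects one placeholder per time step.
encoder_inputs = [tf.placeholder(tf.int32, shape=[None], name='enc_%d' % i)
                  for i in range(MAX_ENCODER_LENGTH)]
decoder_inputs = [tf.placeholder(tf.int32, shape=[None], name='dec_%d' % i)
                  for i in range(MAX_DECODER_LENGTH)]

# LSTM cell used for the encoder and decoder.
cell = tf.contrib.rnn.BasicLSTMCell(LSTM_UNITS)

# embedding_rnn_seq2seq embeds the integer token IDs, runs the encoder, then
# decodes; feed_previous=True feeds each predicted token back in as the next
# decoder input (the inference-time setting).
outputs, states = tf.contrib.legacy_seq2seq.embedding_rnn_seq2seq(
    encoder_inputs, decoder_inputs, cell,
    num_encoder_symbols=VOCAB_SIZE,
    num_decoder_symbols=VOCAB_SIZE,
    embedding_size=EMBEDDING_SIZE,
    feed_previous=True)
# outputs is a list with one [batch_size, VOCAB_SIZE] tensor per decoder step.
```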
We can say that when we move from a plain RNN to an LSTM (Long Short-Term Memory), we introduce more and more controlling knobs, which control the flow and mixing of inputs according to the trained weights. So an LSTM gives us the most controllability and thus better results, but it also comes with more complexity and operating cost.
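To make those "knobs" concrete, here is an illustrative single LSTM step written out in NumPy; the gate computations follow the standard LSTM formulation, and the weights are random placeholders rather than trained values:

```python
# Illustrative single LSTM step in NumPy: the input, forget, and output gates
# are the "knobs" that control how much of the input and previous memory flow
# through. Weights are random placeholders, not trained values.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    # One matrix multiply produces the pre-activations for all four gates.
    z = np.concatenate([h_prev, x_t]) @ W + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output gates
    g = np.tanh(g)                                # candidate cell state
    c_t = f * c_prev + i * g                      # mix old memory with new input
    h_t = o * np.tanh(c_t)                        # gated view of the memory
    return h_t, c_t

hidden, inputs = 4, 3
W = np.random.randn(hidden + inputs, 4 * hidden)
b = np.zeros(4 * hidden)
h, c = lstm_step(np.random.randn(inputs), np.zeros(hidden), np.zeros(hidden), W, b)
```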
RNN LSTM Autoencoders
Are LSTM autoencoders unsupervised?
Autoencoders are an unsupervised learning method, although technically they are trained using supervised learning methods, which is why they are often referred to as self-supervised. They are typically trained as part of a broader model that attempts to recreate the input.
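As a minimal sketch of that "recreate the input" setup, here is an LSTM autoencoder in tf.keras; the sequence length, layer sizes, and random data are illustrative placeholders:

```python
# Minimal LSTM autoencoder sketch (tf.keras): the model is trained to
# reproduce its own input, so the targets are the inputs themselves.
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense
from tensorflow.keras.models import Sequential

TIMESTEPS, FEATURES = 15, 128  # assumed sequence length and feature size

model = Sequential([
    # Encoder: compress the whole sequence into a single hidden vector.
    LSTM(64, input_shape=(TIMESTEPS, FEATURES)),
    # Repeat the encoding once per output time step for the decoder.
    RepeatVector(TIMESTEPS),
    # Decoder: unroll the encoding back into a sequence.
    LSTM(64, return_sequences=True),
    # Predict the original feature vector at each time step.
    TimeDistributed(Dense(FEATURES)),
])

# Self-supervised objective: input and target are the same sequences.
model.compile(optimizer='adam', loss='mse')
x = np.random.rand(32, TIMESTEPS, FEATURES).astype('float32')  # dummy data
model.fit(x, x, epochs=1, verbose=0)
```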