Deep Learning: A practical application

Updated: Feb 15

Using the TensorFlow framework to build a chatbot with a deep recurrent neural network (RNN) of the Long Short-Term Memory (LSTM) type.

The Word2Vec model has become a standard method for representing words as dense vectors. This is typically done as a preprocessing step, after which the learned vectors are fed into a downstream model (typically an RNN) to predict movie review sentiment, perform machine translation, or even generate text character by character.

Word vectors

There are two options. The first is simply to use pre-trained vectors, which have been trained on large corpora for a large number of iterations. However, if your data contains many words and acronyms that aren’t in typical pre-trained word vector lists, the second option, generating your own word vectors, is critical to making sure that those words get represented properly.
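One rough way to decide between the two options is to measure how much of the corpus is missing from a pre-trained vocabulary. The sketch below is only an illustration: the function name `oov_report`, the tiny vocabulary, and the sample sentence are all made up for this example and do not come from the original dataset.

```python
from collections import Counter

def oov_report(tokens, pretrained_vocab, top_n=5):
    """Return the share of tokens missing from a pre-trained vocabulary
    and the most frequent missing tokens."""
    counts = Counter(t.lower() for t in tokens)
    missing = {t: c for t, c in counts.items() if t not in pretrained_vocab}
    total = sum(counts.values())
    oov_rate = sum(missing.values()) / total if total else 0.0
    return oov_rate, Counter(missing).most_common(top_n)

# Hypothetical stand-ins for a pre-trained vocabulary and a corpus.
pretrained_vocab = {"the", "patient", "has", "a", "fever", "and"}
tokens = "The patient has a fever and SOB and SOB".split()

rate, missing = oov_report(tokens, pretrained_vocab)
print(f"out-of-vocabulary rate: {rate:.2f}")
print("most frequent missing tokens:", missing)
```

If the rate is high and the missing tokens are domain terms or acronyms, that is a hint that training your own vectors may be worth the effort.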

To generate word vectors, we use the classic approach of a Word2Vec model. The basic idea is that the model creates word vectors by looking at the contexts in which words appear in sentences. Words with similar contexts will be placed close together in the vector space.
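To make the idea of "context" concrete, here is a minimal sketch of how skip-gram style training pairs can be built from a sentence. It assumes the skip-gram variant of Word2Vec and a hypothetical helper named `skipgram_pairs`; it only produces the (center word, context word) pairs and does not train any vectors.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every word within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

# Hypothetical example sentence.
sentence = "the nurse answered the call".split()
pairs = skipgram_pairs(sentence, window=1)
for center, context in pairs:
    print(center, "->", context)
```

A Word2Vec model is then trained to predict the context word from the center word, which is what pushes words that share contexts toward nearby vectors.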

**Update: I later learned that the TensorFlow Seq2Seq function trains word embeddings from scratch, so I didn’t end up using these word vectors, but it was still good practice.**

Creating a Seq2Seq Model with TensorFlow

Now that we’ve created the dataset and generated our word vectors, we can move on to coding the Seq2Seq model.

tf-seq2seq is a general-purpose encoder-decoder framework for TensorFlow that can be used for machine translation, text summarization, conversational modeling, image captioning, and more.

We'll use it for conversational modeling.

By using neural networks with many hidden layers (known as deep learning), generative chatbot models can build sentences that are completely original rather than retrieved from a list of possible responses.

Unsupervised learning

I created and trained the model in a Python script using a dataset of medical conversations from a nurse advice hotline. The crux of the model lies in TensorFlow’s embedding_rnn_seq2seq() function.
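Before a conversation pair can be fed to a seq2seq model, it has to be turned into fixed-length sequences of integer token ids. The sketch below shows one common way to do that in plain Python. It is not the original training script: the reserved ids (`PAD`, `GO`, `EOS`, `UNK`), the helper names, the sequence lengths, and the sample question and answer are all assumptions made for illustration.

```python
PAD, GO, EOS, UNK = 0, 1, 2, 3  # hypothetical reserved token ids

def build_vocab(sentences):
    """Map each lowercased word to an integer id, after the reserved ids."""
    vocab = {"<pad>": PAD, "<go>": GO, "<eos>": EOS, "<unk>": UNK}
    for s in sentences:
        for w in s.lower().split():
            vocab.setdefault(w, len(vocab))
    return vocab

def encode(sentence, vocab):
    """Convert a sentence to token ids, using UNK for unseen words."""
    return [vocab.get(w, UNK) for w in sentence.lower().split()]

def make_example(question, answer, vocab, enc_len, dec_len):
    """Build padded encoder inputs, decoder inputs, and decoder targets."""
    enc = encode(question, vocab)[:enc_len]
    enc = enc + [PAD] * (enc_len - len(enc))
    ans = encode(answer, vocab)[:dec_len - 1]
    dec_in = [GO] + ans    # decoder sees GO, then the answer
    target = ans + [EOS]   # decoder must predict the answer, then EOS
    dec_in += [PAD] * (dec_len - len(dec_in))
    target += [PAD] * (dec_len - len(target))
    return enc, dec_in, target

# Hypothetical question/answer pair.
question, answer = "I have a headache", "Drink some water"
vocab = build_vocab([question, answer])
enc, dec_in, target = make_example(question, answer, vocab, enc_len=6, dec_len=5)
print(enc, dec_in, target)
```

The decoder input is the answer shifted right by one position, so at each step the model is asked to predict the next token of the answer.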

We can say that when we move from a plain RNN to an LSTM (Long Short-Term Memory), we introduce more and more control knobs (the gates), which control the flow and mixing of inputs according to the trained weights. ... So the LSTM gives us the most controllability and thus often better results, but it also comes with more complexity and a higher operating cost.
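The "knobs" can be seen by comparing one time step of a plain RNN with one time step of an LSTM. The sketch below uses a single unit with scalar input and state to keep the arithmetic readable; real layers use vectors and weight matrices. The function names, the weight dictionary layout, and the weight values are illustrative assumptions, not TensorFlow's implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def rnn_step(x, h_prev, w_x, w_h, b):
    """Plain RNN: the new state is one squashed mix of input and old state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def lstm_step(x, h_prev, c_prev, w):
    """One step of a single LSTM unit.

    `w` maps each gate name to a (w_x, w_h, bias) triple.
    """
    def pre(name):
        w_x, w_h, b = w[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much of the candidate to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # proposed new content
    c = f * c_prev + i * g            # updated cell state
    h = o * math.tanh(c)              # updated hidden state
    return h, c

# Hypothetical weights, chosen only to run the example.
weights = {
    "forget": (0.5, 0.1, 1.0),
    "input": (0.4, -0.2, 0.0),
    "output": (0.3, 0.2, 0.0),
    "candidate": (0.8, 0.5, 0.0),
}
h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.25]:
    h, c = lstm_step(x, h, c, weights)
print(h, c)
```

The plain RNN step has one set of weights, while the LSTM step has four, which is where both the extra control and the extra cost come from.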


Is LSTM unsupervised?

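An LSTM is a network architecture, so it is not supervised or unsupervised in itself. LSTMs used in autoencoder-style models are often described as an unsupervised learning method, although technically they are trained using supervised learning methods, an approach referred to as self-supervised. They are typically trained as part of a broader model that attempts to recreate the input.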

©2020 by Arturo Devesa.