
More Machine Learning - My Personal Notes

Updated: Apr 19, 2020

Types of machine learning algorithms:

  • supervised learning

  • unsupervised learning

  • reinforcement learning

  • evaluation methods

Deep Learning with Neural Networks:

  • Neurons = Perceptrons

  • Activation Functions

  • Cost Functions

  • Gradient Descent and Backpropagation

  • Deep neural networks: TensorFlow and PyTorch for word embeddings (Word2Vec) and sequence-to-sequence (Seq2Seq) models

Unlike typical computer programs, machine learning techniques learn from data rather than from hand-written instructions. As new data is added, the learning algorithms keep learning from it.

Machine learning algorithms can find insights in data even if they aren't specifically instructed what to look for in that data. That's what separates a machine learning algorithm from a typical computer program.

You're just giving the machine learning algorithm a set of rules to follow instead of actually telling it what to look for; it will find the insights on its own.

Supervised Learning

Supervised learning uses labelled data to predict the label given some features.

The key fact is that the data is labelled, so whenever you think of supervised learning, think "label".

If the labels are continuous it's called a regression problem, and if they are categorical it's called a classification problem.


In a classification problem, you'll have some features, such as height and weight, and a categorical label, such as gender.

So your task could be: given a person's height and weight, predict their gender.

For instance, we could plot a couple of points, with height on one axis and weight on the other.

Remember, since this is supervised learning and classification, we already know the labels. In this case our labels are male and female, and our features are height and weight.

For a classification task, the model is trained on some training data. In the future we get a new point whose features we know (the weight and the height) but whose class (male or female) we don't. The machine learning algorithm then predicts, according to what it has been trained on, which class the point should belong to.
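As a rough illustration of the classification task described above, here is a minimal sketch using a nearest-centroid rule in NumPy. The data values, the helper names `fit_centroids` and `predict`, and the choice of nearest-centroid as the classifier are all assumptions made for this example; they are not taken from any particular library or from the original notes.

```python
import numpy as np

# Hypothetical training data: each row is [height_cm, weight_kg].
X_train = np.array([
    [180.0, 82.0],
    [175.0, 78.0],
    [185.0, 90.0],
    [160.0, 55.0],
    [165.0, 60.0],
    [158.0, 52.0],
])
# Known labels for each row (this is what makes it supervised).
y_train = np.array(["male", "male", "male", "female", "female", "female"])


def fit_centroids(X, y):
    """Training step: compute the mean feature vector of each class."""
    return {str(label): X[y == label].mean(axis=0) for label in np.unique(y)}


def predict(centroids, point):
    """Predict the class whose centroid is closest to the new point."""
    return min(centroids, key=lambda label: np.linalg.norm(point - centroids[label]))


centroids = fit_centroids(X_train, y_train)

# A new point: features are known, the class is not.
new_point = np.array([178.0, 80.0])
prediction = predict(centroids, new_point)
print(prediction)
```

Real projects would normally use a tested classifier from a library; the point here is only the train-then-predict flow.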


Regression is again a supervised learning technique because the model is given labels based on historical values.

The only difference here is that the label, instead of being categorical such as male and female, is continuous, such as a house price.

In this case we'll have a dataset with features such as the square footage of a house, how many rooms it has, etc., and we need to predict a continuous value such as the house price. The task is: given a house's size and number of rooms, predict the selling price of the house.

Say we plot price against square feet.

Here we are using only one feature.

On the x axis we have our feature, the square footage of the house, indicating how big the house is, and on the y axis we have the label that we're trying to predict.

In this case the label is continuous: it can't be split up into categorical units and instead takes a continuous value.

So your model will end up creating some sort of fit to the data.

In this case the trend is that the larger the house, the higher the price.

When you get a new house whose price you don't know but whose features you do know, such as the square footage, you pass it to your model and it returns its predicted price.

Once the model is trained on that historical data, it can be used on new data where only the features are known to attempt a prediction.
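The one-feature house price example can be sketched with a straight-line fit. The numbers below are made up for illustration, and a least-squares line via `np.polyfit` is just one simple choice of model; `predict_price` is a hypothetical helper name.

```python
import numpy as np

# Hypothetical historical data: one feature (square feet) and a continuous label (price).
square_feet = np.array([800.0, 1000.0, 1200.0, 1500.0, 2000.0])
price = np.array([165000.0, 195000.0, 245000.0, 298000.0, 402000.0])

# Fit a straight line price = slope * square_feet + intercept (least squares).
slope, intercept = np.polyfit(square_feet, price, deg=1)


def predict_price(sqft):
    """Return the fitted model's predicted price for a given square footage."""
    return slope * sqft + intercept


# A new house: we know its size but not its price.
predicted = predict_price(1800.0)
print(round(float(predicted)))
```

The positive slope is the "larger house, higher price" trend; the prediction for a new house is just the value of the fitted line at its square footage.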

Unsupervised Learning

What if you don't have historical labels for your data and you only have features? Then you technically have no correct answer to fit on, because you have no label.

You need to look for patterns in the data and find its structure.

This is known as an unsupervised learning problem because you don't have the labels.

A really common unsupervised learning task is clustering, where you're given data with just the features, no labels, and your task is to cluster it into similar groups.
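To make clustering concrete, here is a small k-means sketch in plain NumPy. K-means is only one of many clustering methods and is an assumption of this example, as are the toy data, the helper name `k_means`, and the simplistic initialisation (the first k points are used as starting centroids).

```python
import numpy as np


def k_means(X, k, n_iter=10):
    """Very small k-means: returns (labels, centroids).

    Initialisation uses the first k points, which keeps the example
    deterministic but is not what a production implementation would do.
    """
    centroids = X[:k].copy()
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n_points, k).
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        # Assign each point to its nearest centroid.
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of the points assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids


# Features only, no labels: two visibly separate groups of points.
X = np.array([
    [1.0, 1.0],
    [9.0, 9.0],
    [1.2, 0.8],
    [0.9, 1.1],
    [8.8, 9.2],
    [9.1, 8.9],
])
labels, centroids = k_means(X, k=2)
print(labels)
```

Note that the cluster numbers (0 and 1) carry no meaning by themselves; the algorithm only says which points belong together.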

Reinforcement Learning

When a computer is learning to play a video game, drive a car, etc., that is where reinforcement learning comes into play.

Reinforcement learning works through trial and error to find which actions yield the greatest rewards.
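The trial-and-error idea can be sketched with a tiny "multi-armed bandit" problem and an epsilon-greedy strategy. This setup is an assumption chosen for brevity (it is far simpler than a video game or a car), and the reward values, `epsilon`, and the number of steps are made-up illustration values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical environment: three possible actions with different average rewards.
# The agent does not know these values; it has to discover them by trying.
true_mean_reward = np.array([1.0, 2.0, 5.0])
n_actions = len(true_mean_reward)

estimates = np.zeros(n_actions)        # agent's running estimate of each action's reward
counts = np.zeros(n_actions, dtype=int)  # how often each action has been tried
epsilon = 0.1                          # fraction of steps spent exploring at random

for step in range(2000):
    if rng.random() < epsilon:
        action = int(rng.integers(n_actions))   # explore: try a random action
    else:
        action = int(np.argmax(estimates))      # exploit: use the best action so far
    # Noisy reward from the environment.
    reward = true_mean_reward[action] + rng.normal(0.0, 0.1)
    counts[action] += 1
    # Incremental average: nudge the estimate toward the observed reward.
    estimates[action] += (reward - estimates[action]) / counts[action]

best_action = int(np.argmax(estimates))
print(best_action, counts)
```

After enough trials the agent's estimates should point it at the action with the highest average reward, which it then picks most of the time.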

Machine Learning Process

Data we use is split into training data and test data. The training set contains known outputs, and the model learns on this data so that it can generalize to other data later on. The test dataset (or subset) is held back so that we can test our model’s predictions on data it has not seen.

Train the model with a train/test split of the data, for example 70% training and 30% test.
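A 70/30 split like the one mentioned above might look like this in NumPy. The helper name `split_train_test`, its parameters, and the toy arrays are hypothetical and written for this example; libraries such as scikit-learn ship their own splitting utilities.

```python
import numpy as np


def split_train_test(X, y, test_fraction=0.3, seed=0):
    """Shuffle the rows, then hold back `test_fraction` of them as the test set."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))          # shuffled row indices
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]


# Toy data: 10 samples with 2 features each, plus one label per sample.
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10)

X_train, X_test, y_train, y_test = split_train_test(X, y)
print(len(X_train), len(X_test))
```

Shuffling before splitting matters when the rows are stored in some order (for example by date or by class), otherwise the test set may not be representative.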

Model Evaluation

Supervised learning evaluation metrics

  1. Classification: accuracy, recall and precision

  2. Accuracy is the number of correctly classified samples divided by the total number of samples given to the model.

  3. Regression: MAE, MSE and RMSE measure, on average, how far the model is from the correct continuous value.
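The metrics in the list above can be computed by hand in a few lines. This is a sketch with made-up predictions, treating class 1 as the "positive" class (an assumption of the example); the variable names are illustrative only.

```python
import numpy as np

# --- Classification metrics (binary labels, 1 = positive class) ---
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

tp = int(np.sum((y_pred == 1) & (y_true == 1)))  # true positives
fp = int(np.sum((y_pred == 1) & (y_true == 0)))  # false positives
fn = int(np.sum((y_pred == 0) & (y_true == 1)))  # false negatives

accuracy = float(np.mean(y_pred == y_true))  # correct / total
precision = tp / (tp + fp)                   # of predicted positives, how many were right
recall = tp / (tp + fn)                      # of actual positives, how many were found

# --- Regression metrics (continuous values) ---
value_true = np.array([200.0, 250.0, 300.0])
value_pred = np.array([210.0, 240.0, 330.0])
errors = value_pred - value_true

mae = float(np.mean(np.abs(errors)))   # mean absolute error
mse = float(np.mean(errors ** 2))      # mean squared error
rmse = float(np.sqrt(mse))             # root mean squared error, same units as the label

print(accuracy, precision, recall, mae, mse, rmse)
```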

Unsupervised learning Evaluation

  1. Harder to evaluate

  2. We never had the correct labels to compare the results to.

  3. You can use things like cluster homogeneity or the Rand index to evaluate your unsupervised learning model. Autoencoders are another type of unsupervised model.


NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

NumPy (or Numpy) is a linear algebra library for Python. It is so important for data science with Python because almost all of the libraries in the PyData ecosystem rely on NumPy as one of their main building blocks.

NumPy arrays are the main way to use the library. They come in two flavors:

Vectors and matrices.

Vectors are 1-d arrays and matrices are 2-d arrays.

NumPy is not just more efficient; it is also more convenient. You get a lot of vector and matrix operations for free, which sometimes allow one to avoid unnecessary work. And they are also efficiently implemented.
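A quick sketch of the two array flavors and the kind of vectorized operations NumPy provides. The values are arbitrary and chosen only for illustration.

```python
import numpy as np

# A vector is a 1-d array; a matrix is a 2-d array.
vector = np.array([1, 2, 3])
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(vector.ndim, vector.shape)   # number of dimensions and shape of the vector
print(matrix.ndim, matrix.shape)   # number of dimensions and shape of the matrix

# Element-wise arithmetic without writing a loop.
doubled = vector * 2

# Matrix-vector product: each row of the matrix dotted with the vector.
product = matrix @ vector

print(doubled, product)
```

The same operations written with plain Python lists would need explicit loops, which is both slower and more verbose.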


Pandas is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series.

Neural Networks

What is a neural network?

AI may have come on in leaps and bounds in the last few years, but we’re still some way from truly intelligent machines – machines that can reason and make decisions like humans. Artificial neural networks (ANNs for short) may provide the answer to this.

Human brains are made up of connected networks of neurons. ANNs seek to simulate these networks and get computers to act like interconnected brain cells, so that they can learn and make decisions in a more humanlike manner.

Different parts of the human brain are responsible for processing different pieces of information, and these parts of the brain are arranged hierarchically, or in layers. In this way, as information comes into the brain, each level of neurons processes the information, provides insight, and passes the information to the next, more senior layer. For example, your brain may process the delicious smell of pizza wafting from a street café in multiple stages: ‘I smell pizza,’ (that’s your data input) … ‘I love pizza!’ (thought) … ‘I’m going to get me some of that pizza’ (decision making) … ‘Oh, but I promised to cut out junk food’ (memory) … ‘Surely one slice won’t hurt?’ (reasoning) ‘I’m doing it!’ (action).

It’s this layered approach to processing information and making decisions that ANNs are trying to simulate. In its simplest form, an ANN can have only three layers of neurons: the input layer (where the data enters the system), the hidden layer (where the information is processed) and the output layer (where the system decides what to do based on the data). But ANNs can get much more complex than that, and include multiple hidden layers. Whether it’s three layers or more, information flows from one layer to another, just like in the human brain.

What is deep learning?

Deep learning represents the very cutting edge of artificial intelligence (AI). In classical machine learning, people usually decide which features the computer should learn from; with deep learning, the network learns useful features from the raw data by itself.

This is all possible thanks to layers of ANNs. Remember that I said an ANN in its simplest form has only three layers? Well an ANN that is made up of more than three layers – i.e. an input layer, an output layer and multiple hidden layers – is called a ‘deep neural network’, and this is what underpins deep learning. A deep learning system is self-teaching, learning as it goes by filtering information through multiple hidden layers, in a similar way to humans.

As you can see, the two are closely connected in that one relies on the other to function. Without neural networks, there would be no deep learning.

  1. Neurons and Activation Functions

  2. Cost Functions

  3. Gradient Descent

  4. Backpropagation

The single component of a NN is a single neuron.

Artificial neural networks are based on biology. They mimic biological neurons with an artificial neuron known as a perceptron.

In machine learning, the perceptron is an algorithm for supervised learning of binary classifiers. A binary classifier is a function which can decide whether or not an input, represented by a vector of numbers, belongs to some specific class.

Biological Neuron:

Dendrites --> Body --> Axons

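An artificial neuron also has inputs and outputs. This simple model of an artificial neuron is known as the perceptron.

Mathematical representation of the perceptron/neuron: the inputs x are multiplied by weights w, a bias b is added (z = wx + b), and the result is passed through an activation function.

Here the activation function is a simple function that outputs 1 or 0.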
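A single perceptron with a 1-or-0 step activation can be sketched as follows. The weights and bias are hand-picked (not learned) so that the neuron behaves like a logical AND of two inputs; that choice, and the function names `step` and `perceptron`, are assumptions made for this example.

```python
import numpy as np


def step(z):
    """Step activation: output 1 if z is positive, otherwise 0."""
    return 1 if z > 0 else 0


def perceptron(x, w, b):
    """Weighted sum of the inputs plus bias, passed through the step activation."""
    z = np.dot(w, x) + b
    return step(z)


# Hand-picked parameters: the neuron only fires when both inputs are 1.
w = np.array([1.0, 1.0])
b = -1.5

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [perceptron(np.array(x, dtype=float), w, b) for x in inputs]
print(outputs)
```

In a real network the weights and bias would be found by training rather than set by hand.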

Neural Networks Activation Functions

First create a NN by connecting many perceptrons together.

  1. Input layer - real values of the data.

  2. Hidden layers - the layers between input and output (2 hidden layers in this example). Note, 3 or more is "Deep". As you go forward to more layers, the level of abstraction increases.

  3. Output layer - final estimate of the output.

There can be many more levels of abstractions and types of NN for different end goals:

Activation Functions:

z = wx+b

  1. Sigmoid function 1/(1+e^(-z)), which squashes z into the range 0 to 1.

  2. Hyperbolic tangent tanh(z), with output between -1 and 1.

  3. Rectified Linear Unit (ReLU) max(0,z)

Deep learning libraries have tanh and ReLU built in for us.

Changing the activation function can be beneficial depending on problem.
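The three activation functions listed above are short enough to write out directly in NumPy. This is a sketch for intuition; deep learning libraries provide their own optimized versions.

```python
import numpy as np


def sigmoid(z):
    """Squashes any real z into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def tanh(z):
    """Hyperbolic tangent: squashes z into the range (-1, 1)."""
    return np.tanh(z)


def relu(z):
    """Rectified Linear Unit: max(0, z), applied element-wise."""
    return np.maximum(0.0, z)


# z would normally come from z = wx + b; here we just use sample values.
z = np.array([-2.0, 0.0, 3.0])

print(sigmoid(z))
print(tanh(z))
print(relu(z))
```

Sigmoid and tanh flatten out for large positive or negative z, while ReLU stays linear for positive z, which is one reason the choice of activation can matter for a given problem.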

Cost Functions

A cost function allows us to measure how well these neurons are performing. How far off are we from the expected value?

Two variables:

y represents the true value

a represents the neuron's prediction

In terms of weights and bias:

z = wx + b

Pass z into the activation function: f(z) = a
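Putting y, a, z, w and b together, a cost can be computed as in the sketch below. The notes do not say which cost function is meant, so the quadratic cost (mean of (y - a)^2) and the sigmoid activation used here are assumptions, as are the sample values and the function name `quadratic_cost`.

```python
import numpy as np


def sigmoid(z):
    """Activation function f(z) used to turn z into the prediction a."""
    return 1.0 / (1.0 + np.exp(-z))


def quadratic_cost(y, a):
    """Mean squared difference between true values y and predictions a."""
    return float(np.mean((y - a) ** 2))


# Sample inputs and their true values.
x = np.array([1.0, 2.0])
y = np.array([1.0, 0.0])

# Untrained neuron: weight and bias both start at zero.
w, b = 0.0, 0.0

z = w * x + b      # weighted input
a = sigmoid(z)     # neuron's prediction, f(z) = a
cost = quadratic_cost(y, a)
print(cost)
```

A lower cost means the predictions a are closer to the true values y; training adjusts w and b to reduce it.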

Last note:

“Finance is of minuscule importance in the grand scheme of things,” he said. “There are far more interesting problems that can be solved with machine learning, from healthcare, to drug discovery, automated driving, robotics, and energy. The roles of doctors, lawyers, flow traders, middle management, accountants, economists, may be redefined or augmented with machine learning algorithms, a single professional worker in five years may have the productivity output of an entire team at present.” - Google Brain
