Machine Learning for dummies 101

Edgardo Trujillo
6 min read · Apr 9, 2021

Can you explain to me how a neural network works?

YES and NO. Yes, but it is not one of those topics you can learn in a 15-minute read, so this will be a YES (overview only).

Machine learning is the science of creating software that can learn from previous experience, and an artificial neural network (ANN) is an imitation game.

Human knowledge Point Detection

Sanity check: do you understand all the words in this image? Can you add a few more words to this soup?

  • If YES, then you won’t need to read this article.
  • If NO, please continue.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

In the movie The Imitation Game, Alan Turing realizes that the only way he can break a cipher generated by a machine is with another machine.

This is the machine that helped Alan break the German cipher messages.

And some scientists realized that the only way we humans can make a computer do certain tasks is by imitating nature, in this case imitating biological neurons.

What can we do with ML?

  • Can your bank app or ATM read the amounts on your checks?
  • This code can detect lanes in the street.
  • This code can transfer artistic styles to photos “Neural Style”.
  • This code can search for a face in a photo.

All of these tasks and more are possible thanks to machine learning and neural networks... A new programming paradigm in which you don't code a discrete value (1 for TRUE or 0 for FALSE); you code with probabilities: "I believe this is the face of JOHN DOE, and I'm 94% sure."
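
As a small taste of that paradigm, here is a minimal Python sketch of answering with probabilities instead of a plain TRUE or FALSE (the names and scores below are invented for illustration):

    import numpy as np

    # Instead of returning TRUE/FALSE, the model returns a probability for each possible answer.
    # Raw scores for ["JANE", "BOB", "JOHN DOE"]; made up just for this example.
    scores = np.array([2.0, 1.0, 5.0])
    probabilities = np.exp(scores) / np.exp(scores).sum()   # softmax

    print(probabilities)   # the last entry is roughly 0.94 -> "I'm 94% sure this is JOHN DOE"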

How does a neural network learn?

A calculus trick in which you don’t have to search through all the possible numbers to find the correct solution; you take a “shortcut” to the solution, doing what is called backpropagation.
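
Here is a toy sketch of that shortcut: plain gradient descent on a single made-up weight. A real network applies the same kind of update to every weight at once.

    # Toy problem (invented for illustration): learn w so that w * 3.0 is close to 6.0.
    x, target = 3.0, 6.0
    w = 0.5                 # a random starting weight
    learning_rate = 0.05

    for step in range(50):
        prediction = w * x
        error = prediction - target
        gradient = 2 * error * x        # derivative of the squared error with respect to w
        w -= learning_rate * gradient   # step "downhill" instead of trying every value

    print(round(w, 3))   # close to 2.0, the weight that solves the toy problem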

Why do I see different terminology in different guides?

I’m not sure about it, but machine learning is a science that joins multiple disciplines, so I guess that’s the real reason behind it; the wording will drive you a little bit crazy!

  • Transfer function = Activation Function, Squash Function
  • Learning Rate = ETA, alpha, epsilon
  • Neurons = Nodes
  • Summation Junction = Net Input, neuron activation, pre-activation, dot product, inner product, X = W * I
  • Observation = Record, Sample, Instances
  • Features = attributes, measurements, dimensions
  • Loss function = cost function, objective function, error function
  • Class labels = targets

Be careful: it is easy to get lost when reading different guides from different authors.

You say that ML is a mixture of multiple disciplines. Which ones?

Statistics, Multi-variable Calculus, Algebra, Programming, Data science.

Why MNIST? (Modified NIST data-set)

MNIST is like the Hello World! of programming: a well-understood data-set that you can always use to carry out experiments and play with.

Call yourself a mechanic, and you are expected to know how to change a car battery. Call yourself a data scientist, and you are expected to know, at a minimum, how to work with the MNIST data-set and others like it.
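
For instance, a minimal sketch of pulling MNIST down with scikit-learn (the name 'mnist_784' is how OpenML hosts it; the first call downloads the data, so it needs an internet connection):

    from sklearn.datasets import fetch_openml

    # 70,000 handwritten digits, each one a 28x28 gray-scale image flattened to 784 numbers.
    mnist = fetch_openml('mnist_784', version=1, as_frame=False)
    images, labels = mnist.data, mnist.target

    print(images.shape)   # (70000, 784)
    print(labels[:10])    # the digit each image represents, as strings like '5'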

What is scikit-learn?

It’s a collection of machine learning algorithms, like Naive Bayes, Decision Trees, Support Vector Machines, and the like. You can use Naive Bayes, for example, to create an email SPAM filter; it’s just that neural networks usually outperform those classical machine learning algorithms.
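
A toy sketch of that SPAM-filter idea, with a tiny invented training set just to show the scikit-learn calls:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # Four made-up emails and their labels, just enough to fit a classifier.
    emails = ["win money now", "cheap pills offer", "meeting at noon", "lunch with the team"]
    labels = ["spam", "spam", "ham", "ham"]

    vectorizer = CountVectorizer()              # turn each email into word counts
    features = vectorizer.fit_transform(emails)

    classifier = MultinomialNB()
    classifier.fit(features, labels)

    test = vectorizer.transform(["free money offer"])
    print(classifier.predict(test))             # ['spam']
    print(classifier.predict_proba(test))       # and the probabilities behind that answer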

Why the Python language?

When all the hype around neural networks began, it was the best tool for the job. That’s it. Python has great support for matrix operations through the numpy library, and neural networks are matrix operations. I have seen other languages trying to catch up, and yes, you can build neural networks in other languages, but the community support is in Python.
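
A small taste of what that looks like; the numbers are made up, but the point is that one line of numpy handles all of a layer's links at once:

    import numpy as np

    # Two neurons with three links each (made-up weights), and three input values.
    weights = np.array([[0.2, 0.8, 0.5],
                        [0.9, 0.1, 0.4]])
    inputs = np.array([0.5, 0.1, 0.9])

    print(np.dot(weights, inputs))   # the net input of each neuron, X = W * I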

What’s all the hype with the GPU?

Big ANNs take time to train, and the GPU cuts that time. You don’t want to wait 7 days to obtain a result, right? You want it now! So Google developed their own hardware (the Tensor Processing Unit) for this purpose. The metric they use is processing power per watt.

  • scikit-learn won’t give you support for the GPU.
  • TensorFlow will give you support for multiple GPUs, not just one (see the quick check below).
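
If you are curious whether TensorFlow can actually see a GPU on your machine, a quick check looks like this (a sketch, assuming TensorFlow 2.x):

    import tensorflow as tf

    # List every GPU TensorFlow can use; an empty list means training will run on the CPU.
    gpus = tf.config.list_physical_devices('GPU')
    print(len(gpus), "GPU(s) available:", gpus)

    # With more than one GPU, tf.distribute.MirroredStrategy() spreads the training across them.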

Image representation of a neural network?

Each link has a “synaptic weight” that carries a number, usually from 0.01 to 0.99, and the goal is to search for the numbers that will give you the correct answer at the output layer with respect to the input layer.

  • The first layer doesn’t carry an activation function, but the hidden and final layers will have one.
  • The output layer carries a loss function.
  • You can have multiple hidden layers.
  • The bias is only for the input and hidden layers, not for the output layer. Sometimes there’s no need to use a bias in the ANN.
  • The value of the bias is a constant, while the values of the bias links are the ones that change with training.
  • The job of the activation function is to generate a non-linear boundary in the decision map.
  • In a real scenario NOT all the links will be used by the ANN; some weights will become zero, meaning that there’s really no active link between one neuron and the other, even though we represent the link in the graphic above.
  • In the case of Sigmoid, the output will always be between 0 and 1.
  • The rule of thumb is to always start with the ReLU activation function.
  • Yes, you can mix activation functions in an ANN.
w_ih = weights from input to hidden, w_ho = weights from hidden to output.
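
Putting those pieces together, here is a minimal forward-pass sketch that uses the same w_ih / w_ho naming; every weight value is a random stand-in that training would adjust:

    import numpy as np

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))   # squashes any value into the range (0, 1)

    inputs = np.array([0.9, 0.1, 0.8])                  # made-up input values
    w_ih = np.random.uniform(0.01, 0.99, size=(3, 3))   # links from input to hidden
    w_ho = np.random.uniform(0.01, 0.99, size=(3, 3))   # links from hidden to output

    hidden = sigmoid(np.dot(w_ih, inputs))   # net input, then activation, for the hidden layer
    output = sigmoid(np.dot(w_ho, hidden))   # same trick again for the output layer

    print(output)   # three numbers, each between 0 and 1 thanks to the sigmoid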

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

Let’s take for example the number 8, in this black and white image.

The neural network sees the number 8 as a matrix filled with numbers, where 0 is white and 255 is black.
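
A tiny sketch of that idea, using an invented, very low-resolution "8" just to show the matrix:

    import numpy as np

    # 0 is white, 255 is black; this 5x5 grid is a crude stand-in for a real 28x28 image.
    eight = np.array([[  0, 255, 255, 255,   0],
                      [  0, 255,   0, 255,   0],
                      [  0, 255, 255, 255,   0],
                      [  0, 255,   0, 255,   0],
                      [  0, 255, 255, 255,   0]])

    print(eight.shape)    # (5, 5) -> 25 pixels, each one just a number
    print(eight / 255.0)  # the same matrix scaled to 0-1 before feeding it to a network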

What stuff can be done using Neural Networks?

  • Semantic image segmentation
  • Lane identification
  • Breaking a captcha

And yes, I know that machine learning is not the same as a neural network, or the same as a deep neural network or multi-layer perceptron, but for the purpose of this article it is. See you next time, and leave a clap if you want more articles like this one.
