fbpx
Sigmoidal
  • Home
  • LinkedIn
  • About me
  • Contact
No Result
View All Result
  • Português
  • Home
  • LinkedIn
  • About me
  • Contact
No Result
View All Result
Sigmoidal
No Result
View All Result

Building a Deep Learning Neural Network using TensorFlow

Carlos Melo by Carlos Melo
March 31, 2023
in Deep Learning
0
252
SHARES
8.4k
VIEWS
Share on LinkedInShare on FacebookShare on Whatsapp

In this tutorial, you will learn how to implement Multilayer Neural Networks using Python and the Deep Learning library Keras, one of the most popular libraries currently.

Keras is a Neural Network library that can run with TensorFlow (not only) and was developed to enable easy and fast prototyping. Multilayer Neural Networks are those in which neurons are structured in two or more layers (layers) of processing (since at least there will be 1 input layer and 1 output layer).

.

Implementing a complete Neural Network architecture from scratch is a Herculean task, requiring a more solid understanding of Programming, Linear Algebra, and Statistics – not to mention that the computational performance of your implementation will hardly match the performance of famous libraries in the community.

To show that with a few lines of code, it is possible to implement a simple Network, let’s take the well-known MNIST classification problem and see the performance of our algorithm when submitted to this dataset with 70,000 images!

If you want, all the code will be available on my Github!

Neural Network Architecture

Neural Networks are modeled as a set of neurons connected as an acyclic graph. What does this mean in practice? It means that the outputs of some neurons will be the inputs of other neurons.

The most commonly found Neural Networks out there are those organized into distinct layers (layers), each layer containing a set of neurons. The most commonly found layer type is the fully-connected layer. In this type, neurons between two adjacent layers connect two by two.

This type of architecture is also known as Feedforward Neural Network because only a neuron from the layer $l_i$ is allowed to connect to a neuron from the layer $l_{I+1}, as illustrated in the figure below.

What is MNIST

MNIST is a dataset containing thousands of handwritten images of digits 0-9. The challenge in this dataset is, given any image, to apply the corresponding label (correctly classify the image). MNIST is so studied and used by the community that it acts as a benchmark to compare different image recognition algorithms.

MnistExamples

The complete dataset consists of 70,000 images, each 28 x 28 pixels in size. The figure above shows some random examples from the dataset for each of the possible digits. It should be noted that the images are already normalized and centered. Since the images are in grayscale, i.e., they have only one channel, the value corresponding to each pixel in the images should vary within the range [0, 255].

Multilayer Neural Networks are those in which neurons are structured in two or more layers (layers) of processing.
Ian Goodfellow

MNIST in Python

As it is so widely used, the MNIST dataset is already available within the scikit-learn library and can be imported directly by Python with fetch_mldata("MNIST Original"). To illustrate how to import the complete MNIST and extract some basic information, let’s run the code below:

# Importing the necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from keras.datasets import mnist

# Loading the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Checking the shape of the data
print("Train Images Shape:", train_images.shape)
print("Train Labels Shape:", train_labels.shape)
print("Test Images Shape:", test_images.shape)
print("Test Labels Shape:", test_labels.shape)

# Visualizing an example image
example_image = train_images[0]
plt.imshow(example_image, cmap='gray')
plt.title(f"Label: {train_labels[0]}")
plt.show()

Above, we can see that the array containing the images indeed has 70,000 rows (one row for each image) and 784 columns (all the 28 x 28 pixels of the image). We can also see one of the randomly chosen images. In the case of this digit, our algorithm would have been successful if it could correctly classify the digit as ‘4’.

Implementing our Neural Network with Python and TensorFlow (Keras)

After a brief introduction to Neural Networks, let’s implement a Feedforward Neural Network for the MNIST classification problem. Create a new file in your favorite IDE, named rede_neural_keras.py, and follow the steps of the code below.

# import necessary packages
from sklearn.datasets import fetch_mldata
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.preprocessing import LabelBinarizer
from keras.models import Sequential
from keras.layers.core import Dense
from keras.optimizers import SGD
import numpy as np
import matplotlib.pyplot as plt

Above, we have imported all the necessary packages to create a simple Neural Network with the Keras library. If you encountered an error while trying to import the packages, or do not have a dedicated Python virtual environment for working with Deep Learning/Computer Vision, I recommend looking for a tutorial based on your Operating System.

# import MNIST
print("[INFO] importing MNIST...")
dataset = fetch_mldata("MNIST Original")

# normalize all pixels, so that the values are
# in the range [0, 1.0]
data = dataset.data.astype("float") / 255.0
labels = dataset.target

After importing the set of images, I will divide it into a training set (75%) and a test set (25%), a well-known practice in the Data Science universe. But be careful! The training and test sets MUST BE INDEPENDENT to avoid various problems, including overfitting.

Although it seems complicated, this can be done with just one line of code, thanks to the scikit-learn library, which can be done easily with the train_test_split method. At this stage of preparing our data, we will also need to convert the labels (which are represented by integers) to binary vector format. To illustrate what a binary vector is, see the example below, which indicates the label 4.

4 = [0,0,0,0,1,0,0,0,0,0]

In this vector, the value 1 is assigned to the index corresponding to the label and the value 0 to the others. This operation, known as one-hot encoding, can also be done easily with the LabelBinarizer class.

# split the dataset into train (75%) and test (25%)
(trainX, testX, trainY, testY) = train_test_split(data, dataset.target)

# convert labels from integers to vectors
lb = LabelBinarizer()
trainY = lb.fit_transform(trainY)
testY = lb.transform(testY)

Great! With the dataset imported and processed correctly, we can finally define the architecture of our Neural Network with Keras. Arbitrarily, I defined that the Neural Network will have 4 layers:

  1. Our first layer (l0) will receive as input the values ​​related to each pixel of the images. So, since each image has a size equal to 28 x 28 pixels, l0 will have 784 neurons.
  2. The hidden layers l1 and l2 will arbitrarily have 128 and 64 neurons.
  3. Finally, the last layer, l3, will have the number of neurons corresponding to the number of classes in our classification problem: 10 (remember, there are 10 possible digits).
# define the Neural Network architecture using Keras
# 784 (input) => 128 (hidden) => 64 (hidden) => 10 (output)
model = Sequential() model.add(Dense(128, input_shape=(784,), activation="sigmoid")) model.add(Dense(64, activation="sigmoid")) model.add(Dense(10, activation="softmax"))

Within the concept of feedforward architecture, our Neural Network is instantiated by the Sequential class, which means that each layer will be “stacked” on top of another, with the output of one being the input of the next. In our example, all layers are of the fully-connected layer type.

The hidden layers will be activated by the sigmoid function, which takes the real values of the neurons as input and places them within the range [0, 1]. For the last layer, which needs to reflect the probabilities for each of the possible classes, the softmax function will be used.

To train our model, I will use the most important algorithm for Neural Networks: Stochastic Gradient Descent (SGD). I want to make a dedicated post about SGD in the future (mathematics + code), given its importance! But for now, let’s use the algorithm already available in our libraries. The learning rate of SGD will be set to 0.01, and the loss function will be categorical_crossentropy since the number of output classes is greater than two.

# train the model using SGD (Stochastic Gradient Descent)
print("[INFO] training the neural network...")
model.compile(optimizer=SGD(0.01), loss="categorical_crossentropy",
metrics=["accuracy"])

H = model.fit(trainX, trainY, batch_size=128, epochs=10, verbose=2,
validation_data=(testX, testY))

The power of Deep Learning comes primarily from a single, very important algorithm: Stochastic Gradient Descent (SGD).

By calling model.fit, the training of the neural network begins. After a time that varies depending on your machine, the weights of each node are optimized, and the network can be considered trained.

To evaluate the performance of the algorithm, we call the model.predict method to generate predictions on top of the test dataset. The model’s challenge is to make predictions for the 17,500 images that make up the test set, assigning a label from 0-9 to each one:

# evaluate the Neural Network
print("[INFO] evaluating the neural network...")
predictions = model.predict(testX, batch_size=128)
print(classification_report(testY.argmax(axis=1), predictions.argmax(axis=1)))

Finally, after obtaining the performance report, we will want to plot the accuracy and loss over the iterations. Visually analyzing allows us to easily identify situations of overfitting, for example:

# plot loss and accuracy for the 'train' and 'test' datasets
plt.style.use("ggplot") plt.figure() plt.plot(np.arange(0,100), H.history["loss"], label="train_loss") plt.plot(np.arange(0,100), H.history["val_loss"], label="val_loss") plt.plot(np.arange(0,100), H.history["acc"], label="train_acc") plt.plot(np.arange(0,100), H.history["val_acc"], label="val_acc") plt.title("Training Loss and Accuracy") plt.xlabel("Epoch #") plt.ylabel("Loss/Accuracy") plt.legend() plt.show()

Executing the Neural Network

With the code ready, you just need to execute the command below to see our Neural Network built on top of the Keras library in full operation:

$ python neural_network_keras.py

As a result, the classification_report shows that after 100 epochs, the network achieved an accuracy of 92%, which is a good result for this type of architecture. Just out of curiosity, Convolutional Neural Networks have the potential to reach up to 99% accuracy (!):

Obviously, there are many improvements that can be made to enhance the performance of our network, but even a simple architecture already delivers excellent performance.

Looking at the graph below, note how the curves related to the training and validation datasets are practically overlapping. This is a great indication that there were no overfitting problems during the training phase.

Summary

Well, reaching the end of the post, the basic concepts of Neural Networks were introduced, as well as the MNIST dataset, widely used to benchmark algorithms. By testing the performance of a neural network with 4 layers (input+2 hidden layers + output), we managed to achieve 92% accuracy in the predictions made.

The implementation was done using Keras, showing that with just a few lines of code, it is possible to build great classification models.

Finally, I would like to say that I intend to produce articles not only focused on code writing/implementation, but also delving deeper into the theoretical and mathematical concepts behind machine learning algorithms and methods.

I strongly believe that we only evolve when we get hands-on and go deeper into something. This is what I am doing for myself, and I hope to share these learnings here on the blog.

Share18Share101Send
Previous Post

Matrix Transformations and Coordinate Systems with Python

Next Post

ORB-SLAM 3: A Tool for 3D Mapping and Localization

Carlos Melo

Carlos Melo

Computer Vision Engineer with a degree in Aeronautical Sciences from the Air Force Academy (AFA), Master in Aerospace Engineering from the Technological Institute of Aeronautics (ITA), and founder of Sigmoidal.

Related Posts

How to Train YOLOv9 on Custom Dataset
Computer Vision

How to Train YOLOv9 on Custom Dataset – A Complete Tutorial

by Carlos Melo
February 29, 2024
YOLOv9 para detecção de Objetos
Blog

YOLOv9: A Step-by-Step Tutorial for Object Detection

by Carlos Melo
February 26, 2024
Next Post
ORB-SLAM 3: A Tool for 3D Mapping and Localization

ORB-SLAM 3: A Tool for 3D Mapping and Localization

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

  • Trending
  • Comments
  • Latest
Estimativa de Pose Humana com MediaPipe

Real-time Human Pose Estimation using MediaPipe

September 11, 2023
ORB-SLAM 3: A Tool for 3D Mapping and Localization

ORB-SLAM 3: A Tool for 3D Mapping and Localization

April 10, 2023

Build a Surveillance System with Computer Vision and Deep Learning

1
ORB-SLAM 3: A Tool for 3D Mapping and Localization

ORB-SLAM 3: A Tool for 3D Mapping and Localization

1
Point Cloud Processing with Open3D and Python

Point Cloud Processing with Open3D and Python

1

Fundamentals of Image Formation

0
Como equalizar histograma de imagens com OpenCV e Python

Histogram Equalization with OpenCV and Python

July 16, 2024
How to Train YOLOv9 on Custom Dataset

How to Train YOLOv9 on Custom Dataset – A Complete Tutorial

February 29, 2024
YOLOv9 para detecção de Objetos

YOLOv9: A Step-by-Step Tutorial for Object Detection

February 26, 2024
Depth Anything - Estimativa de Profundidade Monocular

Depth Estimation on Single Camera with Depth Anything

February 23, 2024

Seguir

  • Cada passo te aproxima do que realmente importa. Quer continuar avançando?

🔘 [ ] Agora não
🔘 [ ] Seguir em frente 🚀
  • 🇺🇸 Green Card por Habilidade Extraordinária em Data Science e Machine Learning

Após nossa mudança para os EUA, muitas pessoas me perguntaram como consegui o Green Card tão rapidamente. Por isso, decidi compartilhar um pouco dessa jornada.

O EB-1A é um dos vistos mais seletivos para imigração, sendo conhecido como “The Einstein Visa”, já que o próprio Albert Einstein obteve sua residência permanente através desse processo em 1933.

Apesar do apelido ser um exagero moderno, é fato que esse é um dos vistos mais difíceis de conquistar. Seus critérios rigorosos permitem a obtenção do Green Card sem a necessidade de uma oferta de emprego.

Para isso, o aplicante precisa comprovar, por meio de evidências, que está entre os poucos profissionais de sua área que alcançaram e se mantêm no topo, demonstrando um histórico sólido de conquistas e reconhecimento.

O EB-1A valoriza não apenas um único feito, mas uma trajetória consistente de excelência e liderança, destacando o conjunto de realizações ao longo da carreira.

No meu caso específico, após escrever uma petição com mais de 1.300 páginas contendo todas as evidências necessárias, tive minha solicitação aprovada pelo USCIS, órgão responsável pela imigração nos Estados Unidos.

Fui reconhecido como um indivíduo com habilidade extraordinária em Data Science e Machine Learning, capaz de contribuir em áreas de importância nacional, trazendo benefícios substanciais para os EUA.

Para quem sempre me perguntou sobre o processo de imigração e como funciona o EB-1A, espero que esse resumo ajude a esclarecer um pouco mais. Se tiver dúvidas, estou à disposição para compartilhar mais sobre essa experiência! #machinelearning #datascience
  • 🚀Domine a tecnologia que está revolucionando o mundo.

A Pós-Graduação em Visão Computacional & Deep Learning prepara você para atuar nos campos mais avançados da Inteligência Artificial - de carros autônomos a robôs industriais e drones.

🧠 CARGA HORÁRIA: 400h
💻 MODALIDADE: EAD
📅 INÍCIO DAS AULAS: 29 de maio

Garanta sua vaga agora e impulsione sua carreira com uma formação prática, focada no mercado de trabalho.

Matricule-se já!

#deeplearning #machinelearning #visãocomputacional
  • Green Card aprovado! 🥳 Despedida do Brasil e rumo à nova vida nos 🇺🇸 com a família!
  • Haverá sinais… aprovado na petição do visto EB1A, visto reservado para pessoas com habilidades extraordinárias!

Texas, we are coming! 🤠
  • O que EU TENHO EM COMUM COM O TOM CRUISE??

Clama, não tem nenhuma “semana” aberta. Mas como@é quinta-feira (dia de TBT), olha o que eu resgatei!

Diretamente do TÚNEL DO TEMPO: Carlos Melo &Tom Cruise!
  • Bate e Volta DA ITÁLIA PARA A SUÍÇA 🇨🇭🇮🇹

Aproveitei o dia de folga após o Congresso Internacional de Astronáutica (IAC 2024) e fiz uma viagem “bate e volta” para a belíssima cidade de Lugano, Suíça.

Assista ao vlog e escreve nos comentários se essa não é a cidade mais linda que você já viu!

🔗 LINK NOS STORIES
  • Um paraíso de águas transparentes, e que fica no sul da Suíça!🇨🇭 

Conheça o Lago de Lugano, cercado pelos Alpes Suíços. 

#suiça #lugano #switzerland #datascience
  • Sim, você PRECISA de uma PÓS-GRADUAÇÃO em DATA SCIENCE.
  • 🇨🇭Deixei minha bagagem em um locker no aeroporto de Milão, e vim aproveitar esta última semana nos Alpes suíços!
  • Assista à cobertura completa no YT! Link nos stories 🚀
  • Traje espacial feito pela @axiom.space em parceria com a @prada 

Esse traje será usados pelos astronautas na lua.
para acompanhar as novidades do maior evento sobre espaço do mundo, veja os Stories!

#space #nasa #astronaut #rocket
  • INTERNATIONAL ASTRONAUTICAL CONGRESS - 🇮🇹IAC 2024🇮🇹

Veja a cobertura completa do evento nos DESTAQUES do meu perfil.

Esse é o maior evento de ESPAÇO do mundo! Eu e a @bnp.space estamos representando o Brasil nele 🇧🇷

#iac #space #nasa #spacex
  • 🚀 @bnp.space is building the Next Generation of Sustainable Rocket Fuel.

Join us in transforming the Aerospace Sector with technological and sustainable innovations.
  • 🚀👨‍🚀 Machine Learning para Aplicações Espaciais

Participei do maior congresso de Astronáutica do mundo, e trouxe as novidades e oportunidade da área de dados e Machine Learning para você!

#iac #nasa #spacex
  • 🚀👨‍🚀ACOMPANHE NOS STORIES

Congresso Internacional de Astronáutica (IAC 2024), Milão 🇮🇹
Instagram Youtube LinkedIn Twitter
Sigmoidal

O melhor conteúdo técnico de Data Science, com projetos práticos e exemplos do mundo real.

Seguir no Instagram

Categories

  • Aerospace Engineering
  • Blog
  • Carreira
  • Computer Vision
  • Data Science
  • Deep Learning
  • Featured
  • Iniciantes
  • Machine Learning
  • Posts

Navegar por Tags

3d 3d machine learning 3d vision apollo 13 bayer filter camera calibration career cientista de dados clahe computer vision custom dataset Data Clustering data science deep learning depth anything depth estimation detecção de objetos digital image processing histogram histogram equalization image formation job keras lens lente machine learning machine learning engineering nasa object detection open3d opencv pinhole profissão projeto python redes neurais roboflow rocket scikit-learn space tensorflow tutorial visão computacional yolov8 yolov9

© 2024 Sigmoidal - Aprenda Data Science, Visão Computacional e Python na prática.

Welcome Back!

Login to your account below

Forgotten Password?

Retrieve your password

Please enter your username or email address to reset your password.

Log In

Add New Playlist

No Result
View All Result
  • Home
  • Cursos
  • Pós-Graduação
  • Blog
  • Sobre Mim
  • Contato
  • Português

© 2024 Sigmoidal - Aprenda Data Science, Visão Computacional e Python na prática.