fbpx
Sigmoidal
  • Home
  • LinkedIn
  • About me
  • Contact
No Result
View All Result
  • Português
  • Home
  • LinkedIn
  • About me
  • Contact
No Result
View All Result
Sigmoidal
No Result
View All Result

Building a Deep Learning Neural Network using TensorFlow

Carlos Melo by Carlos Melo
March 31, 2023
in Deep Learning
0
259
SHARES
8.6k
VIEWS
Share on LinkedInShare on FacebookShare on Whatsapp

In this tutorial, you will learn how to implement Multilayer Neural Networks using Python and the Deep Learning library Keras, one of the most popular libraries currently.

Keras is a Neural Network library that can run with TensorFlow (not only) and was developed to enable easy and fast prototyping. Multilayer Neural Networks are those in which neurons are structured in two or more layers (layers) of processing (since at least there will be 1 input layer and 1 output layer).

.

Implementing a complete Neural Network architecture from scratch is a Herculean task, requiring a more solid understanding of Programming, Linear Algebra, and Statistics – not to mention that the computational performance of your implementation will hardly match the performance of famous libraries in the community.

To show that with a few lines of code, it is possible to implement a simple Network, let’s take the well-known MNIST classification problem and see the performance of our algorithm when submitted to this dataset with 70,000 images!

If you want, all the code will be available on my Github!

Neural Network Architecture

Neural Networks are modeled as a set of neurons connected as an acyclic graph. What does this mean in practice? It means that the outputs of some neurons will be the inputs of other neurons.

The most commonly found Neural Networks out there are those organized into distinct layers (layers), each layer containing a set of neurons. The most commonly found layer type is the fully-connected layer. In this type, neurons between two adjacent layers connect two by two.

This type of architecture is also known as Feedforward Neural Network because only a neuron from the layer $l_i$ is allowed to connect to a neuron from the layer $l_{I+1}, as illustrated in the figure below.

What is MNIST

MNIST is a dataset containing thousands of handwritten images of digits 0-9. The challenge in this dataset is, given any image, to apply the corresponding label (correctly classify the image). MNIST is so studied and used by the community that it acts as a benchmark to compare different image recognition algorithms.

MnistExamples

The complete dataset consists of 70,000 images, each 28 x 28 pixels in size. The figure above shows some random examples from the dataset for each of the possible digits. It should be noted that the images are already normalized and centered. Since the images are in grayscale, i.e., they have only one channel, the value corresponding to each pixel in the images should vary within the range [0, 255].

Multilayer Neural Networks are those in which neurons are structured in two or more layers (layers) of processing.
Ian Goodfellow

MNIST in Python

As it is so widely used, the MNIST dataset is already available within the scikit-learn library and can be imported directly by Python with fetch_mldata("MNIST Original"). To illustrate how to import the complete MNIST and extract some basic information, let’s run the code below:

# Importing the necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from keras.datasets import mnist

# Loading the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Checking the shape of the data
print("Train Images Shape:", train_images.shape)
print("Train Labels Shape:", train_labels.shape)
print("Test Images Shape:", test_images.shape)
print("Test Labels Shape:", test_labels.shape)

# Visualizing an example image
example_image = train_images[0]
plt.imshow(example_image, cmap='gray')
plt.title(f"Label: {train_labels[0]}")
plt.show()

Above, we can see that the array containing the images indeed has 70,000 rows (one row for each image) and 784 columns (all the 28 x 28 pixels of the image). We can also see one of the randomly chosen images. In the case of this digit, our algorithm would have been successful if it could correctly classify the digit as ‘4’.

Implementing our Neural Network with Python and TensorFlow (Keras)

After a brief introduction to Neural Networks, let’s implement a Feedforward Neural Network for the MNIST classification problem. Create a new file in your favorite IDE, named rede_neural_keras.py, and follow the steps of the code below.

# import necessary packages
from sklearn.datasets import fetch_mldata
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.preprocessing import LabelBinarizer
from keras.models import Sequential
from keras.layers.core import Dense
from keras.optimizers import SGD
import numpy as np
import matplotlib.pyplot as plt

Above, we have imported all the necessary packages to create a simple Neural Network with the Keras library. If you encountered an error while trying to import the packages, or do not have a dedicated Python virtual environment for working with Deep Learning/Computer Vision, I recommend looking for a tutorial based on your Operating System.

# import MNIST
print("[INFO] importing MNIST...")
dataset = fetch_mldata("MNIST Original")

# normalize all pixels, so that the values are
# in the range [0, 1.0]
data = dataset.data.astype("float") / 255.0
labels = dataset.target

After importing the set of images, I will divide it into a training set (75%) and a test set (25%), a well-known practice in the Data Science universe. But be careful! The training and test sets MUST BE INDEPENDENT to avoid various problems, including overfitting.

Although it seems complicated, this can be done with just one line of code, thanks to the scikit-learn library, which can be done easily with the train_test_split method. At this stage of preparing our data, we will also need to convert the labels (which are represented by integers) to binary vector format. To illustrate what a binary vector is, see the example below, which indicates the label 4.

4 = [0,0,0,0,1,0,0,0,0,0]

In this vector, the value 1 is assigned to the index corresponding to the label and the value 0 to the others. This operation, known as one-hot encoding, can also be done easily with the LabelBinarizer class.

# split the dataset into train (75%) and test (25%)
(trainX, testX, trainY, testY) = train_test_split(data, dataset.target)

# convert labels from integers to vectors
lb = LabelBinarizer()
trainY = lb.fit_transform(trainY)
testY = lb.transform(testY)

Great! With the dataset imported and processed correctly, we can finally define the architecture of our Neural Network with Keras. Arbitrarily, I defined that the Neural Network will have 4 layers:

  1. Our first layer (l0) will receive as input the values ​​related to each pixel of the images. So, since each image has a size equal to 28 x 28 pixels, l0 will have 784 neurons.
  2. The hidden layers l1 and l2 will arbitrarily have 128 and 64 neurons.
  3. Finally, the last layer, l3, will have the number of neurons corresponding to the number of classes in our classification problem: 10 (remember, there are 10 possible digits).
# define the Neural Network architecture using Keras
# 784 (input) => 128 (hidden) => 64 (hidden) => 10 (output)
model = Sequential() model.add(Dense(128, input_shape=(784,), activation="sigmoid")) model.add(Dense(64, activation="sigmoid")) model.add(Dense(10, activation="softmax"))

Within the concept of feedforward architecture, our Neural Network is instantiated by the Sequential class, which means that each layer will be “stacked” on top of another, with the output of one being the input of the next. In our example, all layers are of the fully-connected layer type.

The hidden layers will be activated by the sigmoid function, which takes the real values of the neurons as input and places them within the range [0, 1]. For the last layer, which needs to reflect the probabilities for each of the possible classes, the softmax function will be used.

To train our model, I will use the most important algorithm for Neural Networks: Stochastic Gradient Descent (SGD). I want to make a dedicated post about SGD in the future (mathematics + code), given its importance! But for now, let’s use the algorithm already available in our libraries. The learning rate of SGD will be set to 0.01, and the loss function will be categorical_crossentropy since the number of output classes is greater than two.

# train the model using SGD (Stochastic Gradient Descent)
print("[INFO] training the neural network...")
model.compile(optimizer=SGD(0.01), loss="categorical_crossentropy",
metrics=["accuracy"])

H = model.fit(trainX, trainY, batch_size=128, epochs=10, verbose=2,
validation_data=(testX, testY))

The power of Deep Learning comes primarily from a single, very important algorithm: Stochastic Gradient Descent (SGD).

By calling model.fit, the training of the neural network begins. After a time that varies depending on your machine, the weights of each node are optimized, and the network can be considered trained.

To evaluate the performance of the algorithm, we call the model.predict method to generate predictions on top of the test dataset. The model’s challenge is to make predictions for the 17,500 images that make up the test set, assigning a label from 0-9 to each one:

# evaluate the Neural Network
print("[INFO] evaluating the neural network...")
predictions = model.predict(testX, batch_size=128)
print(classification_report(testY.argmax(axis=1), predictions.argmax(axis=1)))

Finally, after obtaining the performance report, we will want to plot the accuracy and loss over the iterations. Visually analyzing allows us to easily identify situations of overfitting, for example:

# plot loss and accuracy for the 'train' and 'test' datasets
plt.style.use("ggplot") plt.figure() plt.plot(np.arange(0,100), H.history["loss"], label="train_loss") plt.plot(np.arange(0,100), H.history["val_loss"], label="val_loss") plt.plot(np.arange(0,100), H.history["acc"], label="train_acc") plt.plot(np.arange(0,100), H.history["val_acc"], label="val_acc") plt.title("Training Loss and Accuracy") plt.xlabel("Epoch #") plt.ylabel("Loss/Accuracy") plt.legend() plt.show()

Executing the Neural Network

With the code ready, you just need to execute the command below to see our Neural Network built on top of the Keras library in full operation:

$ python neural_network_keras.py

As a result, the classification_report shows that after 100 epochs, the network achieved an accuracy of 92%, which is a good result for this type of architecture. Just out of curiosity, Convolutional Neural Networks have the potential to reach up to 99% accuracy (!):

Obviously, there are many improvements that can be made to enhance the performance of our network, but even a simple architecture already delivers excellent performance.

Looking at the graph below, note how the curves related to the training and validation datasets are practically overlapping. This is a great indication that there were no overfitting problems during the training phase.

Summary

Well, reaching the end of the post, the basic concepts of Neural Networks were introduced, as well as the MNIST dataset, widely used to benchmark algorithms. By testing the performance of a neural network with 4 layers (input+2 hidden layers + output), we managed to achieve 92% accuracy in the predictions made.

The implementation was done using Keras, showing that with just a few lines of code, it is possible to build great classification models.

Finally, I would like to say that I intend to produce articles not only focused on code writing/implementation, but also delving deeper into the theoretical and mathematical concepts behind machine learning algorithms and methods.

I strongly believe that we only evolve when we get hands-on and go deeper into something. This is what I am doing for myself, and I hope to share these learnings here on the blog.

Share18Share104Send
Previous Post

Matrix Transformations and Coordinate Systems with Python

Next Post

ORB-SLAM 3: A Tool for 3D Mapping and Localization

Carlos Melo

Carlos Melo

Computer Vision Engineer with a degree in Aeronautical Sciences from the Air Force Academy (AFA), Master in Aerospace Engineering from the Technological Institute of Aeronautics (ITA), and founder of Sigmoidal.

Related Posts

How to Train YOLOv9 on Custom Dataset
Computer Vision

How to Train YOLOv9 on Custom Dataset – A Complete Tutorial

by Carlos Melo
February 29, 2024
YOLOv9 para detecção de Objetos
Blog

YOLOv9: A Step-by-Step Tutorial for Object Detection

by Carlos Melo
February 26, 2024
Next Post
ORB-SLAM 3: A Tool for 3D Mapping and Localization

ORB-SLAM 3: A Tool for 3D Mapping and Localization

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

  • Trending
  • Comments
  • Latest
Estimativa de Pose Humana com MediaPipe

Real-time Human Pose Estimation using MediaPipe

September 11, 2023
ORB-SLAM 3: A Tool for 3D Mapping and Localization

ORB-SLAM 3: A Tool for 3D Mapping and Localization

April 10, 2023

Build a Surveillance System with Computer Vision and Deep Learning

1
ORB-SLAM 3: A Tool for 3D Mapping and Localization

ORB-SLAM 3: A Tool for 3D Mapping and Localization

1
Point Cloud Processing with Open3D and Python

Point Cloud Processing with Open3D and Python

1

Fundamentals of Image Formation

0

What is Sampling and Quantization in Image Processing

June 20, 2025
Como equalizar histograma de imagens com OpenCV e Python

Histogram Equalization with OpenCV and Python

July 16, 2024
How to Train YOLOv9 on Custom Dataset

How to Train YOLOv9 on Custom Dataset – A Complete Tutorial

February 29, 2024
YOLOv9 para detecção de Objetos

YOLOv9: A Step-by-Step Tutorial for Object Detection

February 26, 2024

Seguir

  • Aqui nós 🇺🇸, a placa é sua. Quando você troca o carro,  por exemplo, você mesmo tira a sua placa do carro vendido e instala a parafusa no carro novo.

Por exemplo, hoje eu vi aqui no “Detran” dos Estados Unidos, paguei a trasnferência do title do veículo, e já comprei minha primeira placa. 

Tudo muito fácil e rápido. Foi menos de 1 hora para resolver toda a burocracia! #usa🇺🇸 #usa
  • Como um carro autônomo "enxerga" o mundo ao redor?

Não há olhos nem intuição, apenas sensores e matemática. Cada imagem capturada passa por um processo rigoroso: amostragem espacial, quantização de intensidade e codificação digital. 

Esse é o desafio, representar um objeto 3D do mundo real, em pixels que façam sentido para a Inteligência Artificial.

🚗📷 A visão computacional é a área mais inovadora do mundo!

Comente aqui se você concorda.

#carrosautonomos #inteligenciaartificial #IA #visãocomputacional
  • 👁️🤖Visão Computacional: a área mais inovadora do mundo! Clique no link da bio e se inscreva na PÓS EM VISÃO COMPUTACIONAL E DEEP LEARNING! #machinelearning #datascience #visãocomputacional
  • E aí, Sergião @spacetoday Você tem DADO em casa? 😂😂

A pergunta pode ter ficado sem resposta no dia. Mas afinal, o que são “dados”?

No mundo de Data Science, dados são apenas registros brutos. Números, textos, cliques, sensores, imagens. Sozinhos, eles não dizem nada 

Mas quando aplicamos técnicas de Data Science, esses dados ganham significado. Viram informação.

E quando a informação é bem interpretada, ela se transforma em conhecimento. Conhecimento gera vantagem estratégica 🎲

Hoje, Data Science não é mais opcional. É essencial para qualquer empresa que quer competir de verdade.

#datascience #cientistadedados #machinelearning
  • 🎙️ Corte da minha conversa com o Thiago Nigro, no PrimoCast #224

Falamos sobre por que os dados são considerados o novo petróleo - para mim, dados são o novo bacon!

Expliquei como empresas que dominam a ciência de dados ganham vantagem real no mercado. Não por armazenarem mais dados, mas por saberem o que fazer com eles.

Também conversamos sobre as oportunidades para quem quer entrar na área de tecnologia. Data Science é uma das áreas mais democráticas que existem. Não importa sua idade, formação ou cidade. O que importa é a vontade de aprender.

Se você quiser ver o episódio completo, é só buscar por Primocast 224.

“O que diferencia uma organização de outra não é a capacidade de armazenamento de dados; é a capacidade de seu pessoal extrair conhecimento desses dados.”

#machinelearning #datascience #visãocomputacional #python
  • 📸 Palestra que realizei no palco principal da Campus Party #15, o maior evento de tecnologia da América Latina!

O tema que escolhi foi "Computação Espacial", onde destaquei as inovações no uso de visão computacional para reconstrução 3D e navegação autônoma.

Apresentei técnicas como Structure-from-Motion (SFM), uma técnica capaz de reconstruir cidades inteiras (como Roma) usando apenas fotos publicadas em redes sociais, e ORB-SLAM, usada por drones e robôs para mapeamento em tempo real.

#visãocomputacional #machinelearning #datascience #python
  • ⚠️❗ Não deem ideia para o Haddad! 

A França usou Inteligência Artificial para detectar mais de 20 mil piscinas não declaradas a partir de imagens aéreas.

Com modelos de Deep Learning, o governo identificou quem estava devendo imposto... e arrecadou mais de €10 milhões com isso.

Quer saber como foi feito? Veja no post completo no blog do Sigmoidal: https://sigmoidal.ai/como-a-franca-usou-inteligencia-artificial-para-detectar-20-mil-piscinas/

#datascience #deeplearning #computerVision #IA
  • Como aprender QUALQUER coisa rapidamente?

💡 Comece com projetos reais desde o primeiro dia.
📁 Crie um portfólio enquanto aprende. 
📢 E compartilhe! Poste, escreva, ensine. Mostre o que está fazendo. Documente a jornada, não o resultado.

Dois livros que mudaram meu jogo:
-> Ultra Aprendizado (Scott Young)
-> Uma Vida Intelectual (Sertillanges)

Aprenda em público. Evolua fazendo.

#ultralearning #estudos #carreira
  • Como eu usava VISÃO COMPUTACIONAL no Centro de Operações Espaciais, planejando missões de satélites em situações de desastres naturais.

A visão computacional é uma fronteira fascinante da tecnologia que transforma a forma como entendemos e respondemos a desastres e situações críticas. 

Neste vídeo, eu compartilho um pouco da minha experiência como Engenheiro de Missão de Satélite e especialista em Visão Computacional. 

#VisãoComputacional #DataScience #MachineLearning #Python
  • 🤔 Essa é a MELHOR linguagem de programação, afinal?

Coloque sua opinião nos comentários!

#python #datascience #machinelearning
  • 💘 A história de como conquistei minha esposa... com Python!

Lá em 2011, mandei a real:

“Eu programo em Python.”
O resto é história.
  • Para rotacionar uma matriz 2D em 90°, primeiro inverto a ordem das linhas (reverse). Depois, faço a transposição in-place. Isso troca matrix[i][j] com matrix[j][i], sem criar outra matriz. A complexidade segue sendo O(n²), mas o uso de memória se mantém O(1).

Esse padrão aparece com frequência em entrevistas. Entender bem reverse + transpose te prepara para várias variações em matrizes.

#machinelearning #visaocomputacional #leetcode
  • Na última aula de estrutura de dados, rodei um simulador de labirintos para ensinar como resolver problemas em grids e matrizes.

Mostrei na prática a diferença entre DFS e BFS. Enquanto a DFS usa stacks, a BFS utiliza a estrutura de fila (queue). Cada abordagem tem seu padrão de propagação e uso ideal.

#machinelearning #visaocomputacional #algoritmos
  • 🔴 Live #2 – Matrizes e Grids: Fundamentos e Algoritmos Essenciais

Na segunda aula da série de lives sobre Estruturas de Dados e Algoritmos, o foco será em Matrizes e Grids, estruturas fundamentais em problemas de caminho, busca e representação de dados espaciais.

📌 O que você vai ver:

Fundamentos de matrizes e grids em programação
Algoritmos de busca: DFS e BFS aplicados a grids
Resolução ao vivo de problemas do LeetCode

📅 Terça-feira, 01/07, às 22h no YouTube 
🎥 (link nos Stories)

#algoritmos #estruturasdedados #leetcode #datascience #machinelearning
  • 💡 Quer passar em entrevistas técnicas?
Veja essa estratégia para você estudar estruturas de dados em uma sequência lógica e intuitiva.
⠀
👨‍🏫 NEETCODE.io
⠀
🚀 Marque alguém que também está se preparando!

#EntrevistaTecnica #LeetCode #MachineLearning #Data Science
  • Live #1 – Arrays & Strings: Teoria e Prática para Entrevistas Técnicas

Segunda-feira eu irei começar uma série de lives sobre Estruturas de Dados e Algoritmos. 

No primeiro encontro, falarei sobre um dos tipos de problemas mais cobrados em entrevistas: Arrays e Strings.

Nesta aula, você vai entender a teoria por trás dessas estruturas, aprender os principais padrões de resolução de problemas e aplicar esse conhecimento em exercícios selecionados do LeetCode.

📅 Segunda-feira, 23/06, às 21h no YouTube

🎥 (link nos Stories)

#machinelearning #datascience #cienciadedados #visãocomputacional
Instagram Youtube LinkedIn Twitter
Sigmoidal

O melhor conteúdo técnico de Data Science, com projetos práticos e exemplos do mundo real.

Seguir no Instagram

Categories

  • Aerospace Engineering
  • Blog
  • Carreira
  • Computer Vision
  • Data Science
  • Deep Learning
  • Featured
  • Iniciantes
  • Machine Learning
  • Posts

Navegar por Tags

3d 3d machine learning 3d vision apollo 13 bayer filter camera calibration career cientista de dados clahe computer vision custom dataset data science deep learning depth anything depth estimation detecção de objetos digital image processing histogram histogram equalization image formation job lens lente machine learning machine learning engineering nasa object detection open3d opencv pinhole projeto python quantization redes neurais roboflow rocket salário sampling scikit-learn space tensorflow tutorial visão computacional yolov8 yolov9

© 2024 Sigmoidal - Aprenda Data Science, Visão Computacional e Python na prática.

Welcome Back!

Login to your account below

Forgotten Password?

Retrieve your password

Please enter your username or email address to reset your password.

Log In

Add New Playlist

No Result
View All Result
  • Home
  • Cursos
  • Pós-Graduação
  • Blog
  • Sobre Mim
  • Contato
  • Português

© 2024 Sigmoidal - Aprenda Data Science, Visão Computacional e Python na prática.