Fundamentals of Image Formation

In this lesson, you'll learn the theory behind image formation and digital images. This article is the first in a series called Computer Vision: Algorithms and Applications.

by Carlos Melo
March 22, 2023
in Computer Vision

In a world full of mysteries and wonders, photography stands tall as a phenomenon that captures the ephemeral and the eternal in a single moment. Like a silent dance between light and shadow, it invites our imagination to wander through the corridors of time and space. Through a surprisingly simple process, the capture of light rays through an aperture over a given exposure time, we are led to contemplate photographs that we know will endure.

The philosopher José Ortega y Gasset once reflected on the passion for truth as the noblest and most inexorable pursuit. And undoubtedly, photography is one of the most sublime expressions of this quest for truth, capturing reality in a fragment of time.

Behind this process lies the magic of matrices, projections, coordinate transformations, and mathematical models that, like invisible threads, weave the tapestry between the reality captured by a camera lens and the bright pixels on your screen.

But to understand how it’s possible to mathematically model the visual world, with all its richness of detail, we must first understand why vision is so complex and challenging. In this first article of the series “Computer Vision: Algorithms and Applications,” I want to invite you to discover how machines see an image and how an image is formed.

The Challenges in Computer Vision

Computer vision is a fascinating field that seeks to develop mathematical techniques capable of reproducing the three-dimensional perception of the world around us. In his book "Computer Vision: Algorithms and Applications," Richard Szeliski describes the apparent ease with which we perceive the three-dimensional structure of our surroundings and the richness of detail we can extract from a simple image. Computer vision, however, still struggles to reproduce this level of detail and accuracy.

Szeliski points out that, despite advances in computer vision techniques over the past decades, we still can't make a computer explain an image with the same level of detail as a two-year-old child. Vision is an inverse problem: we try to recover unknown information from data that are insufficient to fully specify the solution. Solving it requires models based on physics and probability, or machine learning with large sets of examples.

Schematic of the physical principle of optical remote sensing: the interaction between surface, solar energy, and sensor.

Modeling the visual world in all its complexity is a greater challenge than, for example, modeling the vocal tract that produces spoken sounds. Computer vision seeks to describe and reconstruct properties such as shape, illumination, and color distribution from one or more images, something humans and animals do with ease, yet algorithms remain prone to errors at it.

How an Image is Formed

Before analyzing and manipulating images, it’s essential to understand the image formation process. As examples of components in the process of producing a given image, Szeliski (2022) cites:

  1. Perspective projection: The way three-dimensional objects are projected onto a two-dimensional image, taking into account the position and orientation of the objects relative to the camera.
  2. Light scattering after hitting the surface: The way light scatters after interacting with the surface of objects, influencing the appearance of colors and shadows in the image.
  3. Lens optics: The process by which light passes through a lens, affecting image formation due to refraction and other optical phenomena.
  4. Bayer color filter array: A color filter pattern used in most digital cameras to capture colors at each pixel, allowing for the reconstruction of the original colors of the image.

Geometrically, the image formation process is quite simple. An object reflects the light that strikes it, and a sensor captures that light over a certain exposure time, forming an image. If it were really that simple, though, with light rays arriving from so many different angles, the sensor couldn't focus on anything and would record only a luminous blur.

To ensure that each part of the scene strikes only one point of the sensor, we can introduce an optical barrier with a small hole that lets only a narrow bundle of light rays pass, reducing blur and producing a sharper image. This hole in the barrier is called an aperture or pinhole, and it is what allows cameras and other image capture devices to form a sharp image.

A photographic camera without a lens is known as a pinhole camera.

This principle of physics, known as the camera obscura, serves as the basis for the construction of any photographic camera. An ideal pinhole camera model has an infinitely small hole to obtain an infinitely sharp image.
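As a preview of the projection math covered in the next article, here is a minimal Python sketch of the ideal pinhole model: a point (X, Y, Z) in front of the camera maps to (fX/Z, fY/Z) on the image plane, where f is the focal length. The function name and numeric values below are illustrative assumptions, not code from the series.

```python
import numpy as np

def pinhole_project(points_3d: np.ndarray, f: float) -> np.ndarray:
    """Project 3D points (X, Y, Z) onto the image plane of an ideal
    pinhole camera with focal length f (camera at the origin,
    optical axis along +Z). Returns 2D coordinates (x, y)."""
    X, Y, Z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
    # Perspective division: a point twice as far away appears half as large
    return np.stack([f * X / Z, f * Y / Z], axis=1)

# Two points at different depths: the farther one projects closer to the center
points = np.array([[1.0, 0.5, 2.0],
                   [1.0, 0.5, 4.0]])
print(pinhole_project(points, f=0.05))  # f = 50 mm, distances in meters
```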

However, the problem with pinhole cameras is that there is a trade-off between sharpness and brightness. The smaller the hole, the sharper the image. But since the amount of light passing through is smaller, it’s necessary to increase the exposure time.

Moreover, if the hole is of the same order of magnitude as the wavelength of light, we will have the effect of diffraction, which ends up distorting the image. In practice, a hole smaller than 0.3 mm will cause interference in light waves, making the image blurry.

The solution to this problem is the use of lenses. A thin converging lens lets the ray passing through its center continue undeflected, while all rays parallel to the optical axis converge at a single point, the focal point.

The Magic of Lenses in Image Formation

Lenses are essential optical elements in image formation, as they allow more light to be captured by the sensor while still maintaining the sharpness of the image. Lenses work by refracting the light that passes through them, directing the light rays to the correct points on the sensor.

In the context of camera calibration, the thin converging lens is used as a simplified model to describe the relationship between the three-dimensional world and the two-dimensional image captured by the camera’s sensor. This theoretical model is useful for understanding the basic principles of geometric optics and simplifying the calculations involved in camera calibration, and it should satisfy two properties:

  1. Rays passing through the Optical Center are not deflected; and
  2. All rays parallel to the Optical Axis converge at the Focal Point.

As we’ll see in the next article, camera calibration involves determining the intrinsic and extrinsic parameters that describe the relationship between the real-world coordinates and the image coordinates. The intrinsic parameters include the focal length, the principal point, and lens distortion, while the extrinsic parameters describe the position and orientation of the camera relative to the world.

Although the thin lens model is a simplification of the actual optical system of a camera, it can be used as a starting point for calibration.
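As a preview of what calibration estimates, the intrinsic parameters are conventionally collected into a 3x3 matrix K. The sketch below uses made-up focal-length and principal-point values, expressed in pixels, purely for illustration.

```python
import numpy as np

# Hypothetical intrinsics: focal lengths and principal point in pixel units
fx, fy = 800.0, 800.0   # focal length expressed in pixels
cx, cy = 320.0, 240.0   # principal point (image center of a 640x480 sensor)

K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

# Projecting a camera-frame point (X, Y, Z) to pixel coordinates:
point = np.array([0.1, -0.05, 2.0])
u, v, w = K @ point
print(u / w, v / w)  # pixel coordinates after perspective division
```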

Focus and Focal Length

Focus is one of the main aspects of image formation with lenses. The focal length, represented by f, is the distance between the center of the lens and the focal point, where light rays parallel to the optical axis converge after passing through the lens.

Thin Lens Equation. Source: Davide Scaramuzza (2022).

The focal length is directly related to the lens's ability to concentrate light and, consequently, influences the sharpness of the image. The thin lens equation is given by:

    \[ \frac{1}{f} = \frac{1}{z} + \frac{1}{e} \]

where z is the distance between the object and the lens, and e is the distance between the formed image and the lens. This equation describes the relationship between the focal length, the object distance, and the formed image distance.
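As a quick worked example, assuming a 50 mm lens focused on an object 2 m away, the equation can be rearranged to find where the sharp image forms:

```python
# Thin lens equation: 1/f = 1/z + 1/e  ->  e = 1 / (1/f - 1/z)
f = 0.050   # focal length: 50 mm
z = 2.0     # object distance: 2 m

e = 1.0 / (1.0 / f - 1.0 / z)
print(f"Image forms {e * 1000:.2f} mm behind the lens")  # ~51.28 mm
```

Note that e is slightly larger than f: only an object at infinity forms its image exactly at the focal point.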

Aperture and Depth of Field

Aperture is another essential aspect of image formation with lenses. The aperture, usually represented by an f-number value, controls the amount of light that passes through the lens. A smaller f-number value indicates a larger aperture, allowing more light in and resulting in brighter images.

Aperture also affects the depth of field, which is the range of distance at which objects appear sharp in the image. A larger aperture (smaller f-number value) results in a shallower depth of field, making only objects close to the focal plane appear sharp, while objects farther away or closer become blurred.

This characteristic can be useful for creating artistic effects, such as highlighting a foreground object and blurring the background.
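Numerically, the f-number N is the ratio of the focal length to the aperture diameter, N = f/D, so halving the f-number doubles the diameter and quadruples the light-gathering area. A small sketch, with values assumed for illustration:

```python
import math

f = 0.050  # a 50 mm lens

for N in (1.4, 2.8, 5.6, 11):
    D = f / N                      # effective aperture diameter
    area = math.pi * (D / 2) ** 2  # light-gathering area
    print(f"f/{N}: diameter = {D * 1000:.1f} mm, area = {area * 1e6:.1f} mm^2")
```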

Focal Length and Angle of View

The lens’s focal length also affects the angle of view, which is the extent of the scene captured by the camera. Lenses with a shorter focal length have a wider angle of view, while lenses with a longer focal length have a narrower angle of view. Wide-angle lenses, for example, have short focal lengths and are capable of capturing a broad view of the scene. Telephoto lenses, on the other hand, have long focal lengths and are suitable for capturing distant objects with greater detail.

Focal Length & Angle of View guide.
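This relationship can be written as AOV = 2 arctan(d / 2f), where d is the sensor dimension and f the focal length. The sketch below assumes a full-frame sensor width of 36 mm for illustration:

```python
import math

def angle_of_view(f_mm: float, sensor_mm: float = 36.0) -> float:
    """Horizontal angle of view in degrees for a given focal length,
    assuming a full-frame sensor width of 36 mm."""
    return math.degrees(2 * math.atan(sensor_mm / (2 * f_mm)))

for f in (24, 50, 200):  # wide-angle, normal, telephoto
    print(f"{f} mm lens: {angle_of_view(f):.1f} degrees")
# 24 mm -> ~73.7 deg, 50 mm -> ~39.6 deg, 200 mm -> ~10.3 deg
```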

By selecting the appropriate lens, it is possible to adjust the composition and framing of the image, as well as control the amount of light entering the sensor and the depth of field. Furthermore, the use of lenses allows for manipulation of perspective and capturing subtle details that would be impossible to record with a pinhole model.

In summary, the lens is a crucial component in image formation, allowing photographers and filmmakers to control and shape light effectively and creatively. With proper knowledge about lens characteristics and their implications in image formation, it is possible to explore the full potential of cameras and other image capturing devices, creating truly stunning and expressive images.

Capture and Representation of Digital Images

Digital cameras use an array of photodiodes (CCD or CMOS) to convert photons (light energy) into electrons, differing from analog cameras that use photographic film to record images. This technology allows capturing and storing images in digital format, simplifying the processing and sharing of photos.

Digital images are organized as a matrix of pixels, where each pixel represents the light intensity at a specific point in the image. A common example of a digital image is an 8-bit image, in which each pixel has an intensity value ranging from 0 to 255. This range of values is a result of using 8 bits to represent intensity, which allows a total of 2^8 = 256 distinct values for each pixel.
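A minimal NumPy sketch of this representation: an 8-bit grayscale image is just a 2D array of unsigned integers, and an RGB image adds a third axis with one value per channel. The array sizes and values here are arbitrary.

```python
import numpy as np

# 8-bit grayscale image: a 4x4 matrix, values from 0 (black) to 255 (white)
gray = np.array([[  0,  64, 128, 255],
                 [ 32,  96, 160, 224],
                 [ 16,  80, 144, 208],
                 [  8,  72, 136, 200]], dtype=np.uint8)
print(gray.shape, gray.dtype)  # (4, 4) uint8

# 8-bit RGB image: height x width x 3, one intensity per channel
rgb = np.zeros((4, 4, 3), dtype=np.uint8)
rgb[0, 0] = (255, 0, 0)        # top-left pixel is pure red
print(rgb[0, 0])               # [255   0   0]
```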

In the RGB model, an intensity value is assigned to each pixel. In 8-bit-per-channel color images, intensity values range from 0 (black) to 255 (white) for each of the red, green, and blue components.

In the figure above, we see an example of how a machine would “see” a Brazilian Air Force aircraft. In this case, each pixel has a vector of values associated with each of the RGB channels.

Digital cameras typically adopt an RGB color detection system, where each color is represented by a specific channel (red, green, and blue). One of the most common methods for capturing these colors is the Bayer pattern, developed by Bryce Bayer in 1976 while working at Kodak. The Bayer pattern consists of an alternating array of RGB filters placed over the pixel array.

It is interesting to note that the number of green filters is twice that of red and blue filters, as the luminance signal is mainly determined by the green values, and the human visual system is much more sensitive to spatial differences in luminance than chrominance. For each pixel, missing color components can be estimated from neighboring values through interpolation – a process known as demosaicing.
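A toy sketch of the idea, assuming an RGGB filter layout (the layout varies by sensor, and real demosaicing algorithms are edge-aware rather than this naive average): green values missing at red and blue sites are estimated from the four green neighbors.

```python
import numpy as np

def demosaic_green_bilinear(raw: np.ndarray) -> np.ndarray:
    """Reconstruct the green channel of an RGGB Bayer mosaic by
    averaging the four green neighbors at red/blue sites.
    A toy example; borders are left uninterpolated."""
    h, w = raw.shape
    g = raw.astype(np.float32)
    green = g.copy()
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # In RGGB, green sits where (row + col) is odd
            if (y + x) % 2 == 0:  # red or blue site: interpolate green
                green[y, x] = (g[y - 1, x] + g[y + 1, x] +
                               g[y, x - 1] + g[y, x + 1]) / 4.0
    return green.astype(np.uint8)

raw = np.random.randint(0, 256, (8, 8), dtype=np.uint8)  # fake sensor data
print(demosaic_green_bilinear(raw))
```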

Bayer Filter Pattern Scheme, showing the interaction between visible light, color filters, microlenses, and sensor in capturing vibrant and detailed colors in digital cameras.

However, it is important to emphasize that this is just a common example. In practice, a digital image can have more bits and more channels. Besides the RGB color space, there are several other color spaces, such as YUV, which can also be used in the representation and processing of digital images.

For example, during the period I worked at the Space Operations Center, I received monochromatic images with radiometric resolution of 10 bits per pixel and hyperspectral images with hundreds of channels for analysis.

Summary

This article presented the fundamentals of image formation, exploring the challenges of computer vision, the optical process of capture, the relevance of lenses, and the representation of digital images.

In the second article of this series, I will teach you how to implement a practical example in Python to convert the coordinates of a real 3D object to a 2D image, and how to perform camera calibration (one of the most important areas in Computer Vision).

 

References

  1. Szeliski, R. (2022). Computer Vision: Algorithms and Applications. Springer.
  2. Gonzalez, R. C., & Woods, R. E. (2018). Digital Image Processing. Pearson Education.