Where Are Pixels? -- a Deep Learning Perspective

Technically, an image is a function that maps a continuous domain, e.g. a box $[0, X] \times [0, Y]$, to intensities such as (R, G, B). To store it in computer memory, an image is discretized into an array array[H][W], where each element array[i][j] is a pixel.
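
As a toy illustration (my own sketch, not code from the article), one common convention is that array[i][j] samples the continuous image at the half-integer pixel center $(j + 0.5, i + 0.5)$:

```python
import numpy as np

def discretize(f, W, H):
    """Sample a continuous image f(x, y) into an H x W array.

    Sketch of the common "half-integer center" convention: array[i][j]
    stores f at the center of the unit square [j, j+1] x [i, i+1],
    i.e. at (x, y) = (j + 0.5, i + 0.5).
    """
    array = np.empty((H, W))
    for i in range(H):      # row index, along the y axis
        for j in range(W):  # column index, along the x axis
            array[i][j] = f(j + 0.5, i + 0.5)
    return array

# Example: a smooth gradient on the box [0, 4] x [0, 2].
print(discretize(lambda x, y: x + y, W=4, H=2))
# [[1. 2. 3. 4.]
#  [2. 3. 4. 5.]]
```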

How does discretization work? How does a discrete pixel relate to the abstract notion of the underlying continuous image? These basic questions play an important role in computer graphics & computer vision algorithms.

This article discusses these low-level details and how they affect our CNN models and deep learning libraries. If you have ever wondered which resize function to use, or whether to add or subtract 0.5 or 1 from some pixel coordinates, you may find answers here. Interestingly, these details have contributed to many accuracy improvements in Detectron and Detectron2.
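
To make the ±0.5 question concrete (a sketch of my own, not code from the article), below are the two coordinate mappings that resize implementations commonly use; in PyTorch they correspond to the align_corners flag of F.interpolate:

```python
def src_coord_half_pixel(x_out, size_in, size_out):
    """align_corners=False style: pixels are unit squares with centers
    at half-integers; map output pixel centers back to input coordinates."""
    return (x_out + 0.5) * size_in / size_out - 0.5

def src_coord_align_corners(x_out, size_in, size_out):
    """align_corners=True style: the centers of the corner pixels of
    the input and output grids are aligned exactly."""
    return x_out * (size_in - 1) / (size_out - 1)

# Upsampling a length-2 signal to length 4: the two conventions sample
# the source at different locations -- exactly the 0.5 question.
print([src_coord_half_pixel(x, 2, 4) for x in range(4)])
# [-0.25, 0.25, 0.75, 1.25]
print([src_coord_align_corners(x, 2, 4) for x in range(4)])
# [0.0, 0.333..., 0.666..., 1.0]
```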

On Environment/Package Management in Python

I'm involved in a few open source projects, and I often help users address their environment / installation issues. A large number of these issues essentially come down to incorrectly or accidentally mixing multiple Python environments together. This post lists a few common pitfalls and misconceptions behind them.
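
As one concrete diagnostic (my own sketch, not from the post), checking that the interpreter and an imported package resolve to the same environment catches most such mixing:

```python
import sys

print(sys.executable)  # the interpreter actually running this script
print(sys.prefix)      # the environment this interpreter belongs to

# Assumes numpy is installed; substitute any package you suspect.
import numpy
print(numpy.__file__)  # where the import was actually resolved from
# If this path does not live under sys.prefix (or a site-packages
# directory you expect), two environments are being mixed.
```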

Unawareness of Deep Learning Mistakes

TL;DR: People are often unaware of the deep learning mistakes they make, because things always appear to work, and there is no expectation of how well they should work. The solution is to try to accurately reproduce the settings & performance of high-quality papers & code.