Python OpenCV - Guide to Image Processing for AI/ML With Examples

In this topic, we’ll cover the Python OpenCV library in complete detail. Computer Vision refers to the field of study which deals with how computers perceive images. It involves feeding images into a computer and then trying to gain high-level intelligence from it using different algorithms.

It works in close coordination with fields like Machine Learning and Artificial Intelligence. Computer Vision is a broad field and is progressing rapidly.

Computer Vision has a variety of real-world applications:

  1. Object detection
  2. Facial Recognition
  3. Self-Driving Cars
  4. Cancer-Detection

One of the popular tasks under the broad field of Computer Vision is Image Processing.

Image processing involves performing some operations on an image, in order to get an enhanced image or to extract some useful information from it.

A major part of object detection is solved using Convolution Neural Networks.

What is a Convolution Neural Network?

A convolutional neural network is a class of deep neural networks that can analyze image data. It is able to draw useful high-level information from image data. These networks can be trained for recognizing objects, facial features, handwriting, and image classification.

A Convolutional Neural Network usually contains a combination of the following layers.

  • Convolutional Layers
  • Pooling Layers
  • Flattening Layers.

Let’s briefly discuss about these layers.

1. Convolution Layer

The convolution layer filters the image with a smaller pixel filter. This decreases the size of the image without losing the relationship between pixels.

2. Pooling Layer

The main job of the pooling layer is to reduce the spatial size of the image after convolution.

A pooling layer reduces the amount of parameters by selecting the maximum, average, or sum values inside the pixels.

Max pooling is the most commonly used pooling technique.

3. Flattening Layer

A flattening layer represents the multi-dimensional pixel vector as a one-dimensional pixel vector.

When it comes to Python, OpenCV is the library that offers the best image processing tools.

In this tutorial, we will learn how to read images into Python using OpenCV. We will also look at some basic image processing operations.

What is OpenCV?

OpenCV is a library of programming functions mainly aimed at real-time computer vision.

Apart from importing and saving images, OpenCV also provides image processing operations such as edge detection, segmentation, Morphological operations and lots more. We will cover some of these operations in this tutorial.

Before we move any further, let’s install OpenCV onto our system.

1. Installing OpenCV

To install OpenCV use the pip command as shown below :

Install OpenCV

Once you are done with the installation, you can get started with importing an image using OpenCV.

2. How to read images using Python OpenCV?

Let’s select a sample picture that we can import using OpenCV.

The Beatles

The Beatles

We are going to use this very popular image of ‘The Beatles‘.

To read this image using OpenCV use :

This will store the image in the variable ‘img‘. Let’s see what happens when we print this variable.


Beatles In Grayscale

Beatles In Pixel Values

We get a matrix as the output because this is how your computer perceives an image.

For a computer an image is just a collection of pixel values.

A digital image is stored as a combination of pixels in a machine. Each pixel further contains a different number of channels. If it a grayscale image, it has only one channel, whereas a colored image contains three channels: red, green, and blue. Each channel of each pixel has a value between 0 and 255.

These pixel values together make the image, which we then perceive as ‘The Beatles‘.

Let’s learn some image processing operations now.

3. Convert an Image to Grayscale using OpenCV

In this section we will convert our sample image to grayscale and the display it.

This piece of code will first convert the image into grayscale. The line of code responsible for doing that is :

It will then print the image matrix and display the resulting image.

The code for displaying any image is:

Output :

Beatles In Grayscale

Beatles In Grayscale

Saving the resulting image

You can also save the resulting image for later use. The code for doing that is:

4. Detecting Edges using OpenCV

Edge Detection is an important operation under object detection. OpenCV makes it very easy for us to detect edges in our images.

We will perform edge detection using the canny edge detector. Canny Edge detection requires a maximum value and a minimum value to carry out edge detection.

Any edges with intensity gradient more than maxVal are sure to be edges and those below minVal are sure to be non-edges and are hence discarded.

You can play around with these two values to increase or decrease the sensitivity of your edge detector.

Here’s the code for detecting edges in your images.

Output :

Python OpenCV

Beatles Edges


This tutorial was an introduction to Computer Vision and OpenCV in Python. We learned how to read and save images using OpenCV. We also covered some basic image processing operations that you can perform using OpenCV. To know more about OpenCV, refer to its official website.

By admin

Leave a Reply

%d bloggers like this: