In this tutorial, we’ll introduce the Python OpenCV library. Computer Vision is the field of study that deals with how computers interpret images. It involves feeding images into a computer and then using algorithms to extract high-level information from them.
It works in close coordination with fields like Machine Learning and Artificial Intelligence. Computer Vision is a broad field and is progressing rapidly.
Computer Vision has a variety of real-world applications:
- Object detection
- Facial Recognition
- Self-Driving Cars
- Cancer Detection
One of the popular tasks under the broad field of Computer Vision is Image Processing.
Image processing involves performing some operations on an image, in order to get an enhanced image or to extract some useful information from it.
A major part of object detection is solved using Convolutional Neural Networks.
What is a Convolutional Neural Network?
A convolutional neural network is a class of deep neural networks that can analyze image data. It is able to draw useful high-level information from image data. These networks can be trained for tasks such as object recognition, facial feature detection, handwriting recognition, and image classification.
A Convolutional Neural Network usually contains a combination of the following layers:
- Convolutional Layers
- Pooling Layers
- Flattening Layers
Let’s briefly discuss these layers.
1. Convolution Layer
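The convolution layer slides a small filter (kernel) over the image and computes a weighted sum of the pixels under it at each position. Without padding, the result is slightly smaller than the input, but it keeps the spatial relationship between pixels.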
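To make the filtering step concrete, here is a minimal pure-Python sketch of a convolution with no padding and a stride of 1 (strictly speaking a cross-correlation, which is what most deep learning libraries compute). The function name `convolve2d`, the 4x4 sample image, and the 2x2 kernel are illustrative assumptions, not part of OpenCV or any other library.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    img_h, img_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    out_h, out_w = img_h - k_h + 1, img_w - k_w + 1
    output = []
    for row in range(out_h):
        out_row = []
        for col in range(out_w):
            # Weighted sum of the pixels currently under the kernel
            total = 0
            for i in range(k_h):
                for j in range(k_w):
                    total += image[row + i][col + j] * kernel[i][j]
            out_row.append(total)
        output.append(out_row)
    return output


# Hypothetical 4x4 single-channel image and 2x2 kernel
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 1],
    [7, 8, 9, 2],
    [1, 0, 1, 3],
]
kernel = [
    [1, 0],
    [0, -1],
]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[-4, -4, 2], [-4, -4, 4], [7, 7, 6]]
```

The 4x4 input becomes a 3x3 feature map, which is the size reduction described above.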
2. Pooling Layer
The main job of the pooling layer is to reduce the spatial size of the feature map produced by the convolution.
A pooling layer reduces the number of values passed on to later layers by replacing each small window of the feature map with its maximum, average, or sum.
Max pooling is the most commonly used pooling technique.
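As a rough illustration of max pooling, the sketch below takes the largest value from each non-overlapping 2x2 window. The function name `max_pool2d` and the sample feature map are assumptions made for this example only.

```python
def max_pool2d(feature_map, size=2):
    """Take the maximum of each non-overlapping size x size window."""
    pooled = []
    for row in range(0, len(feature_map) - size + 1, size):
        pooled_row = []
        for col in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[row + i][col + j]
                for i in range(size)
                for j in range(size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 feature map
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 0],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]

pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 4], [7, 9]]
```

Swapping `max(window)` for `sum(window)` or `sum(window) / len(window)` would give sum or average pooling instead.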
3. Flattening Layer
A flattening layer reshapes the multi-dimensional output of the previous layers into a one-dimensional vector, so that it can be passed on to fully connected layers.
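Flattening can be sketched in a few lines of plain Python. The helper `flatten` and the two small 2x2 maps below are hypothetical and only show the idea of turning nested values into a single list.

```python
def flatten(values):
    """Flatten nested lists of any depth into one flat list, in row order."""
    flat = []
    for item in values:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


# Two hypothetical 2x2 pooled maps, one per channel
pooled_maps = [
    [[6, 4], [7, 9]],
    [[1, 0], [2, 5]],
]

vector = flatten(pooled_maps)
print(vector)       # [6, 4, 7, 9, 1, 0, 2, 5]
print(len(vector))  # 8
```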
When it comes to Python, OpenCV is one of the most widely used libraries for image processing.
In this tutorial, we will learn how to read images into Python using OpenCV. We will also look at some basic image processing operations.
What is OpenCV?
OpenCV is a library of programming functions mainly aimed at real-time computer vision.
Apart from reading and saving images, OpenCV also provides image processing operations such as edge detection, segmentation, morphological operations, and many more. We will cover some of these operations in this tutorial.
Before we move any further, let’s install OpenCV onto our system.
1. Installing OpenCV
To install OpenCV, use the pip command shown below:
```shell
pip install opencv-python
```
Once you are done with the installation, you can get started with importing an image using OpenCV.
2. How to read images using Python OpenCV?
Let’s select a sample picture that we can import using OpenCV.
We are going to use this very popular image of ‘The Beatles’.
To read this image using OpenCV, use:
```python
import cv2

img = cv2.imread('beatles.jpg')
```
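This will store the image in the variable ‘img’. If the file cannot be found or decoded, cv2.imread returns None instead of raising an error, so it is worth checking the path if later steps fail. Let’s see what happens when we print this variable.
```python
import cv2

img = cv2.imread('beatles.jpg')
print(img)
```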
Output:
We get a matrix as the output because this is how your computer perceives an image.
For a computer, an image is just a collection of pixel values.
A digital image is stored as a grid of pixels. Each pixel has one or more channels. If it is a grayscale image, each pixel has only one channel, whereas a color image has three channels: red, green, and blue (OpenCV stores them in blue, green, red order). In a typical 8-bit image, each channel of each pixel has a value between 0 and 255.
These pixel values together make the image, which we then perceive as ‘The Beatles’.
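The idea of pixels and channels can be sketched without loading a real file. The tiny 2x2 image below is made up for illustration and uses nested lists rather than the array that OpenCV actually returns. Each pixel is written in blue, green, red order.

```python
# A hypothetical 2x2 color image; each pixel is [blue, green, red]
image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
]

height = len(image)
width = len(image[0])
channels = len(image[0][0])
print(height, width, channels)  # 2 2 3

# Read the three channel values of the pixel at row 1, column 0
blue, green, red = image[1][0]
print(blue, green, red)  # 0 0 255

# Every channel value lies in the 8-bit range
all_values = [value for row in image for pixel in row for value in pixel]
print(min(all_values), max(all_values))  # 0 255
```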
Let’s learn some image processing operations now.
3. Convert an Image to Grayscale using OpenCV
In this section, we will convert our sample image to grayscale and then display it.
```python
import cv2

img = cv2.imread('beatles.jpg')
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Print the pixel matrix, then show the image until a key is pressed
print(gray_image)
cv2.imshow('image', gray_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
```
This piece of code will first convert the image into grayscale. The line of code responsible for doing that is:
```python
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
```
It will then print the image matrix and display the resulting image.
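To see what the grayscale conversion does to each pixel, here is a small pure-Python sketch. It uses the weighted sum 0.299 R + 0.587 G + 0.114 B that OpenCV documents for this conversion. The function name `bgr_to_gray` and the sample image are assumptions for illustration, and OpenCV’s own implementation may round slightly differently.

```python
def bgr_to_gray(pixel):
    """Convert one [blue, green, red] pixel to a single 8-bit gray value."""
    blue, green, red = pixel
    # Green contributes the most and blue the least
    luminance = 0.299 * red + 0.587 * green + 0.114 * blue
    return int(round(luminance))


# Hypothetical 2x2 image: blue, green, red, and white pixels
image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
]

gray_image = [[bgr_to_gray(pixel) for pixel in row] for row in image]
print(gray_image)  # [[29, 150], [76, 255]]
```

Each pixel goes from three channel values to one, which is why the printed grayscale matrix has one number per pixel.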
The code for displaying any image is:
```python
cv2.imshow('image', gray_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
```
Output:
Saving the resulting image
You can also save the resulting image for later use. The code for doing that is:
```python
import cv2

img = cv2.imread('beatles.jpg')
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imwrite('sample_grayscale.jpg', gray_image)
```
4. Detecting Edges using OpenCV
Edge Detection is an important operation under object detection. OpenCV makes it very easy for us to detect edges in our images.
We will perform edge detection using the Canny edge detector. Canny edge detection requires a maximum value (maxVal) and a minimum value (minVal) to carry out edge detection.
Any edges with an intensity gradient greater than maxVal are sure to be edges, and those below minVal are sure to be non-edges and are hence discarded. Edges that fall between the two values are kept only if they are connected to a sure edge.
You can play around with these two values to increase or decrease the sensitivity of your edge detector.
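The effect of the two thresholds can be sketched on a single row of gradient values. This is a simplified one-dimensional illustration, not OpenCV’s implementation: the real detector works in two dimensions and runs extra steps before thresholding. The function name `hysteresis_1d` and the sample gradients are assumptions.

```python
def hysteresis_1d(gradients, min_val, max_val):
    """Return a list of booleans marking which positions are kept as edges."""
    strong = [g > max_val for g in gradients]
    candidate = [g >= min_val for g in gradients]  # strong or in-between
    keep = strong[:]
    # Grow outwards from the sure edges through neighbouring candidates
    stack = [i for i, is_strong in enumerate(strong) if is_strong]
    while stack:
        i = stack.pop()
        for neighbour in (i - 1, i + 1):
            in_range = 0 <= neighbour < len(gradients)
            if in_range and candidate[neighbour] and not keep[neighbour]:
                keep[neighbour] = True
                stack.append(neighbour)
    return keep


# Hypothetical gradient magnitudes along one row of an image
gradients = [10, 120, 350, 200, 40, 150, 30, 310]

edges = hysteresis_1d(gradients, min_val=50, max_val=300)
print(edges)  # [False, True, True, True, False, False, False, True]

# Raising min_val makes the detector less sensitive
stricter = hysteresis_1d(gradients, min_val=130, max_val=300)
print(stricter)  # [False, False, True, True, False, False, False, True]
```

The value 150 lies between the two thresholds but is dropped in both runs, because neither of its neighbours links it to a sure edge.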
Here’s the code for detecting edges in your images.
```python
import cv2

img = cv2.imread('beatles.jpg')
# 50 is the minimum threshold (minVal), 300 is the maximum (maxVal)
edges = cv2.Canny(img, 50, 300)
cv2.imshow('image', edges)
cv2.waitKey(0)
cv2.destroyAllWindows()
```
Output:
Conclusion
This tutorial was an introduction to Computer Vision and OpenCV in Python. We learned how to read and save images using OpenCV. We also covered some basic image processing operations that you can perform using OpenCV. To learn more about OpenCV, refer to its official website.