Raspberry Pi Computer Vision Programming - Sample Chapter
Raspberry Pi Computer Vision Programming
Raspberry Pi was developed as a low-cost single-board computer with the
intention of promoting computer science education in schools. It also represents
a welcome return to a simple and fun yet effective way to learn computer science
and programming.
You can use Raspberry Pi to learn and implement concepts in computer vision.
With a $35 Raspberry Pi computer and a USB webcam, anyone can afford
to become a pro in computer vision in no time and build a real-life computer
vision application to impress friends and colleagues.
Colorspaces, Transformations, and Thresholds
In our previous chapter, we saw how to perform basic mathematical and logical
operations on images. We also saw how to use these operations to create a film-style
smooth image transitioning effect. In this chapter, we will continue to explore a few
more intriguing computer vision concepts and their applications in the real world.
We will explore the following topics:
- Colorspaces and their conversion
- Tracking objects based on color
- Image transformations
- Thresholding an image
If you remember, in Chapter 2, Working with Images, Webcams, and GUI, we discovered
that OpenCV loads images in BGR format and matplotlib uses the RGB format for
images. So, before displaying an image with matplotlib, we need to convert an image
from BGR to RGB colorspace.
Take a look at the following code. The program reads the image in color mode using
cv2.imread(), which imports the image in the BGR colorspace. Then, it converts it
to RGB using cv2.cvtColor(), and finally, it uses matplotlib to display the image:
import cv2
import matplotlib.pyplot as plt
img = cv2.imread('/home/pi/book/test_set/4.2.07.tiff', 1)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img), plt.title('COLOR IMAGE')
plt.xticks([]), plt.yticks([])
plt.show()
Another way to convert an image from BGR to RGB is to first split the image into
its three separate channels (B, G, and R) and then merge them in the reverse
(R, G, and B) order. However, this approach is slower, as the split and merge
operations are computationally costly. So, for the remainder of this book, we will
use the first method. The following code shows the second method:
import cv2
import matplotlib.pyplot as plt
img = cv2.imread('/home/pi/book/test_set/4.2.07.tiff', 1)
b, g, r = cv2.split(img)
img = cv2.merge((r, g, b))
plt.imshow(img), plt.title('COLOR IMAGE')
plt.xticks([]), plt.yticks([])
plt.show()
The output of both programs is the same, as shown in the following image:
If you need to know the colorspace conversion flags, then the following snippet of
code will assist you in finding the list of available flags for your current OpenCV
installation:
import cv2
j = 0
for filename in dir(cv2):
    if filename.startswith('COLOR_'):
        print filename
        j = j + 1
print 'There are ' + str(j) + ' Colorspace Conversion flags in OpenCV'
The last few lines of the output will be as follows (I am not including the
complete output due to space limitations):
...
COLOR_YUV420P2BGRA
COLOR_YUV420P2GRAY
COLOR_YUV420P2RGB
COLOR_YUV420P2RGBA
COLOR_YUV420SP2BGR
COLOR_YUV420SP2BGRA
COLOR_YUV420SP2GRAY
COLOR_YUV420SP2RGB
COLOR_YUV420SP2RGBA
There are 176 Colorspace Conversion flags in OpenCV
The following code converts a color from BGR to HSV and prints it:
>>> import cv2
>>> import numpy as np
>>> c = cv2.cvtColor(np.uint8([[[255,0,0]]]), cv2.COLOR_BGR2HSV)
>>> print c
[[[120 255 255]]]
The preceding snippet of code prints the HSV value of the color blue represented
in BGR.
Hue, Saturation, Value, or HSV is a color model that describes colors (hue or tint)
in terms of their shade (saturation or amount of gray) and their brightness (value or
luminance). Hue is expressed as a number representing hues of red, yellow, green,
cyan, blue, and magenta. Saturation is the amount of gray in the color. Value works
in conjunction with saturation and describes the brightness or intensity of the color.
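Before tracking a color, we need suitable lower and upper HSV bounds for it. A common trick, shown in the following sketch, is to convert a sample of the target color to HSV and take a band around its hue; the +/-20 hue margin and the saturation and value floors of 50 are my own illustrative choices, not values prescribed by the original text:
import cv2
import numpy as np

# Convert a pure-green BGR sample to HSV to find its hue
green = np.uint8([[[0, 255, 0]]])
hsv_green = cv2.cvtColor(green, cv2.COLOR_BGR2HSV)
hue = int(hsv_green[0][0][0])  # 60 for pure green

# Take a band of +/-20 around the hue for use with cv2.inRange()
lower = np.array([hue - 20, 50, 50])
upper = np.array([hue + 20, 255, 255])
print lower, upper  # [40 50 50] [80 255 255]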
Tracking in real time based on color
The cv2.inRange() function checks whether each pixel of an image falls within
a given range of values and returns a binary mask, in which the white pixels
correspond to the in-range region. We can then use cv2.bitwise_and() with this
binary mask to extract the color range we're interested in. Take a look at the
following code to understand this concept:
import numpy as np
import cv2

cam = cv2.VideoCapture(0)
while True:
    ret, frame = cam.read()
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    image_mask = cv2.inRange(hsv, np.array([40,50,50]),
                             np.array([80,255,255]))
    output = cv2.bitwise_and(frame, frame, mask=image_mask)
    cv2.imshow('Original', frame)
    cv2.imshow('Output', output)
    if cv2.waitKey(1) == 27:
        break
cv2.destroyAllWindows()
cam.release()
We're tracking green-colored objects in this program. The output should be
similar to the following image. I used green tea bag tags as the test objects.
The mask image is not included in the preceding image. You can see it yourself by
adding cv2.imshow('Image Mask',image_mask) to the code. It would be a binary
(pure black and white) image.
We can also track multiple colors by tweaking this code a bit. We need to modify
the preceding code by creating a mask for another color range. Then, we can use
cv2.add() to get the combined mask for two distinct color ranges, as follows:
blue = cv2.inRange(hsv, np.array([100,50,50]), np.array([140,255,255]))
green = cv2.inRange(hsv, np.array([40,50,50]), np.array([80,255,255]))
image_mask = cv2.add(blue, green)
output = cv2.bitwise_and(frame, frame, mask=image_mask)
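Put together, a complete two-color tracker might look like the following sketch, which simply drops the combined mask into the earlier webcam loop (the window names are illustrative):
import numpy as np
import cv2

cam = cv2.VideoCapture(0)
while True:
    ret, frame = cam.read()
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Separate masks for the blue and green ranges
    blue = cv2.inRange(hsv, np.array([100,50,50]), np.array([140,255,255]))
    green = cv2.inRange(hsv, np.array([40,50,50]), np.array([80,255,255]))
    # Combine both masks and extract the matching pixels
    image_mask = cv2.add(blue, green)
    output = cv2.bitwise_and(frame, frame, mask=image_mask)
    cv2.imshow('Original', frame)
    cv2.imshow('Output', output)
    if cv2.waitKey(1) == 27:  # Esc key
        break
cv2.destroyAllWindows()
cam.release()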
Image transformations
In this section, we will see the various transformations on an image, and how to
implement them in OpenCV.
Scaling
Scaling is the resizing of the image, which can be accomplished by the cv2.resize()
function. It takes image, scaling factor, and interpolation method as inputs.
The interpolation method parameter can have any one of the following values:
- cv2.INTER_NEAREST: nearest-neighbour interpolation
- cv2.INTER_LINEAR: bilinear interpolation (this is the default)
- cv2.INTER_AREA: resampling using the pixel area relation (preferred for downscaling)
- cv2.INTER_CUBIC: bicubic interpolation over a 4x4 pixel neighbourhood (preferred for upscaling, but slow)
- cv2.INTER_LANCZOS4: Lanczos interpolation over an 8x8 pixel neighbourhood
The following example shows the usage for upscaling and downscaling:
import cv2
img = cv2.imread('/home/pi/book/test_set/house.tiff', 1)
upscale = cv2.resize(img, None, fx=1.5, fy=1.5,
                     interpolation=cv2.INTER_CUBIC)
downscale = cv2.resize(img, None, fx=0.5, fy=0.5,
                       interpolation=cv2.INTER_AREA)
cv2.imshow('upscale', upscale)
cv2.waitKey(0)
cv2.imshow('downscale', downscale)
cv2.waitKey(0)
cv2.destroyAllWindows()
In the preceding code, we upscale the image along the x and y axes by a factor
of 1.5 and downscale it along the x and y axes by a factor of 0.5. Run the code
and see the output for yourself.
Translation, rotation, and affine transformation
The cv2.warpAffine() function takes an image, a 2x3 transformation matrix, and
the size of the output image as parameters, and applies the transformation to
the image. For translation, the matrix is [[1, 0, tx], [0, 1, ty]], where
(tx, ty) is the shift. The following code shifts the location of the image
by (-50, 50):
import numpy as np
import cv2
import matplotlib.pyplot as plt
img = cv2.imread('/home/pi/book/test_set/house.tiff', 1)
input = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
rows, cols, channels = input.shape
T = np.float32([[1, 0, -50], [0, 1, 50]])
output = cv2.warpAffine(input, T, (cols, rows))
plt.imshow(output), plt.title('Shifted Image')
plt.show()
Some parts of the image will be cropped as the size of the output is the same
as the input.
Similarly, we can use cv2.warpAffine() to apply scaled rotation to an image. For
this, we need to define a rotation matrix with the use of cv2.getRotationMatrix2D(),
which accepts the center of the rotation, the angle of anti-clockwise rotation
(in degrees), and the scale as parameters, and provides a rotation matrix,
which can be specified as the parameter to cv2.warpAffine().
The following example rotates the image by 45 degrees with the center of the image
as the center of rotation, and scales it down to 50% of the original image:
import cv2
import matplotlib.pyplot as plt
img = cv2.imread('/home/pi/book/test_set/house.tiff', 1)
input = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
rows, cols, channels = img.shape
R = cv2.getRotationMatrix2D((cols/2, rows/2), 45, 0.5)
output = cv2.warpAffine(input, R, (cols, rows))
plt.imshow(output), plt.title('Rotated and Downscaled Image')
plt.show()
We can create animation and visual effects by changing the rotation angle at
regular intervals and displaying the result in a continuous loop until the Esc
key is pressed. The following is the code for this (check the output yourself):
import cv2
from time import sleep

image = cv2.imread('/home/pi/book/test_set/house.tiff', 1)
rows, cols, channels = image.shape
angle = 0
while(1):
    if angle == 360:
        angle = 0
    M = cv2.getRotationMatrix2D((cols/2, rows/2), angle, 1)
    rotated = cv2.warpAffine(image, M, (cols, rows))
    cv2.imshow('Rotating Image', rotated)
    angle = angle + 1
    sleep(0.2)
    if cv2.waitKey(1) == 27:
        break
cv2.destroyAllWindows()
We can also apply an affine transformation, in which we provide three
non-collinear points from the input image and the corresponding three points in
the output image. cv2.getAffineTransform() computes the 2x3 transformation
matrix from these point pairs, which we then pass to cv2.warpAffine(). All the
parallel lines in the input image remain parallel in the output image, as the
following example shows:
import cv2
import numpy as np
from matplotlib import pyplot as plt
image = cv2.imread('/home/pi/book/test_set/2.1.11.tiff', 1)
# Changing the colorspace from BGR to RGB
input = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
rows, cols, channels = input.shape
points1 = np.float32([[100,100],[300,100],[100,300]])
points2 = np.float32([[200,150],[400,150],[100,300]])
A = cv2.getAffineTransform(points1, points2)
output = cv2.warpAffine(input, A, (cols, rows))
plt.subplot(121), plt.imshow(input), plt.title('Input')
plt.subplot(122), plt.imshow(output), plt.title('Affine Output')
plt.show()
Perspective transformation
In perspective transformation, we provide four points from the input image and
the corresponding four points in the output image. The condition is that no
three of these points should be collinear (that is, they must not lie on the
same line). As in affine transformation, a straight line will remain straight
in perspective transformation. However, the parallelism between lines will not
necessarily be preserved. A real-life example of perspective transformation is
the zoom and angled-zoom functionality in imaging software. The degree and angle
of the zoom depend on the transformation matrix, which is defined by the set of
four input and four output points. Let's see an example of simple zoom
functionality with the following code, where we use
cv2.getPerspectiveTransform() to generate the transformation matrix and
cv2.warpPerspective() to get the transformed output:
import cv2
import numpy as np
from matplotlib import pyplot as plt
image = cv2.imread('/home/pi/book/test_set/ruler.512.tiff', 1)
# Changing the colorspace from BGR to RGB
input = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
rows, cols, channels = input.shape
points1 = np.float32([[0,0],[400,0],[0,400],[400,400]])
points2 = np.float32([[0,0],[300,0],[0,300],[300,300]])
P = cv2.getPerspectiveTransform(points1, points2)
output = cv2.warpPerspective(input, P, (300,300))
plt.subplot(121), plt.imshow(input), plt.title('Input')
plt.subplot(122), plt.imshow(output), plt.title('Perspective Transform')
plt.show()
Try passing various combinations of the parameters to see how the resultant
image changes. In the preceding example, parallelism between the lines is
preserved because of the particular combination of parameters we used. You
might want to try different combinations to see that the parallelism between
the lines is not preserved.
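For instance, the following sketch (the corner coordinates are arbitrary illustrative values of my own) maps the square input region onto a trapezoid, so the parallel vertical edges of the input converge in the output:
import cv2
import numpy as np
from matplotlib import pyplot as plt

image = cv2.imread('/home/pi/book/test_set/ruler.512.tiff', 1)
input = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Map a square onto a trapezoid; the top edge is squeezed inwards
points1 = np.float32([[0,0],[400,0],[0,400],[400,400]])
points2 = np.float32([[100,0],[300,0],[0,300],[400,300]])
P = cv2.getPerspectiveTransform(points1, points2)
output = cv2.warpPerspective(input, P, (400,300))
plt.subplot(121), plt.imshow(input), plt.title('Input')
plt.subplot(122), plt.imshow(output), plt.title('Trapezoidal Output')
plt.show()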
Thresholding image
Thresholding is the simplest way to segment images. Although thresholding
methods and algorithms are available for color images, thresholding works best
on grayscale images. Thresholding usually (but not always) converts a grayscale
image into a binary image, in which each pixel has one of only two possible
values: white or black. Thresholding an image is usually the first step in many
image processing applications.
The way thresholding works is very simple. We define a threshold value. If the
grayscale intensity of a pixel is greater than the threshold, we assign the
pixel one value (for example, white); otherwise, we assign it the other value
(for example, black). For instance, with a threshold of 127, the pixel
intensities 50, 120, and 200 become 0, 0, and 255, respectively. This is the
simplest form of thresholding, and there are many other variations of this
method, which we will see now.
In OpenCV, the cv2.threshold() function is used to threshold images. It takes
a grayscale image, the threshold value, maxVal, and the threshold method as
parameters, and returns the thresholded image as output. The maxVal parameter
is the value assigned to a pixel if its intensity is greater (or, in some
methods, less) than the threshold. There are five threshold methods available
in OpenCV; the simplest form of thresholding, which we just saw, is
cv2.THRESH_BINARY.
Let's see the mathematical representation of all the threshold methods. If
src(x, y) is the intensity of the input pixel at (x, y) and dst(x, y) is the
corresponding output pixel, the threshold methods operate as follows:
- cv2.THRESH_BINARY: dst(x, y) = maxVal if src(x, y) > thresh, else 0
- cv2.THRESH_BINARY_INV: dst(x, y) = 0 if src(x, y) > thresh, else maxVal
- cv2.THRESH_TRUNC: dst(x, y) = thresh if src(x, y) > thresh, else src(x, y)
- cv2.THRESH_TOZERO: dst(x, y) = src(x, y) if src(x, y) > thresh, else 0
- cv2.THRESH_TOZERO_INV: dst(x, y) = 0 if src(x, y) > thresh, else src(x, y)
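To see these methods side by side, here is a minimal sketch; the grayscale read of 4.2.07.tiff, the threshold of 127, and the maxVal of 255 are my own illustrative choices rather than values from the original text:
import cv2
from matplotlib import pyplot as plt

# Read the image in grayscale mode
img = cv2.imread('/home/pi/book/test_set/4.2.07.tiff', 0)
th = 127
max_val = 255

ret, o1 = cv2.threshold(img, th, max_val, cv2.THRESH_BINARY)
ret, o2 = cv2.threshold(img, th, max_val, cv2.THRESH_BINARY_INV)
ret, o3 = cv2.threshold(img, th, max_val, cv2.THRESH_TRUNC)
ret, o4 = cv2.threshold(img, th, max_val, cv2.THRESH_TOZERO)
ret, o5 = cv2.threshold(img, th, max_val, cv2.THRESH_TOZERO_INV)

titles = ['Original', 'Binary', 'Binary Inv', 'Trunc', 'To Zero', 'To Zero Inv']
outputs = [img, o1, o2, o3, o4, o5]
for i in range(6):
    plt.subplot(2, 3, i + 1), plt.imshow(outputs[i], cmap='gray')
    plt.title(titles[i]), plt.xticks([]), plt.yticks([])
plt.show()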
Otsu's method
Otsu's method for thresholding automatically determines the threshold value
for images that have two peaks in their histogram (bimodal histograms). This
usually means that the image has background and foreground pixels, and Otsu's
method is the best way to separate these two sets of pixels automatically
without specifying a threshold value.
Otsu's method is not the best choice for images that do not fit the
background-plus-foreground model, and it may produce improper output if applied
to them. It is applied in addition to one of the other methods, with the
threshold value passed as 0. Try implementing the following code:
ret, output = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
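For a self-contained version, the following sketch reads a test image in grayscale mode (the image choice is my own; Otsu's method works best when the image is actually bimodal) and prints the threshold that Otsu's method computes:
import cv2
from matplotlib import pyplot as plt

# Otsu's method needs a single-channel (grayscale) image
image = cv2.imread('/home/pi/book/test_set/4.2.07.tiff', 0)
# Pass 0 as the threshold; the value Otsu's method actually
# computes is returned in ret
ret, output = cv2.threshold(image, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print 'Otsu threshold: ' + str(ret)
plt.imshow(output, cmap='gray'), plt.title('Otsu Output')
plt.xticks([]), plt.yticks([])
plt.show()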
Exercise
Explore cv2.adaptiveThreshold(), which is used for the adaptive thresholding
of images with uneven lighting conditions (where some parts of the image are
more illuminated than others).
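As a hint for this exercise, a call might look like the following sketch; the Gaussian method, the 11x11 block size, and the constant 2 are common illustrative choices of my own rather than prescribed values:
import cv2

img = cv2.imread('/home/pi/book/test_set/4.2.07.tiff', 0)
# The threshold is computed per 11x11 neighbourhood, minus a constant of 2
output = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY, 11, 2)
cv2.imshow('Adaptive Output', output)
cv2.waitKey(0)
cv2.destroyAllWindows()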
Summary
In this chapter, we explored colorspaces and its applications in image tracking
with one color and multiple colors. Then, we applied transformations on images.
Finally, we saw how to threshold an image.
In the next chapter, we will go over the basics of noise and filtering the noise,
as well as smoothening/blurring images with Low Pass Filters.