Computer Vision

5. COMPUTER VISION
Let us experience the Computer Vision (CV) domain with the help of the following game:
Emoji Scavenger Hunt: https://emojiscavengerhunt.withgoogle.com
Go to the link and play Emoji Scavenger Hunt. The challenge is to find 8 items within the
time limit.
LO: Define CV, its applications and the OpenCV library.
Computer Vision:
The Computer Vision domain of Artificial Intelligence enables machines to see
through images or visual data, and to process and analyse them using algorithms
and methods in order to understand real-world phenomena.

Applications of Computer Vision:
1. Facial Recognition: With the advent of smart cities and smart homes, Computer
Vision plays a vital role in making homes smarter. Security is the most important
application here: facial recognition can identify guests or maintain a log of
visitors. It is also used in schools for attendance systems based on facial
recognition of students.
2. Face Filters: Modern apps like Instagram and Snapchat have many features
based on computer vision, and face filters are one of them. Through the camera,
the algorithm identifies the facial dynamics of the person and applies the
selected filter.
3. Google’s Search by Image: Most searches on Google’s search engine use
textual data, but it also has an interesting feature that returns search results
from an image. This uses Computer Vision: it analyses various features of the
input image, compares them with a database of images, and gives us the search
result.
4. Computer Vision in Retail: Retail has been one of the fastest growing
fields and uses Computer Vision to make the user experience more fruitful.
Retailers can use Computer Vision techniques to track customers’ movements
through stores, analyse navigational routes and detect walking patterns.
Inventory management is another such application. By analysing security camera
images, a Computer Vision algorithm can generate a very accurate estimate of the
items available in the store. It can also analyse the use of shelf space to
identify suboptimal configurations and suggest better item placement.
5. Self-Driving Cars: Computer Vision is the fundamental technology behind
autonomous vehicles. Most leading car manufacturers in the world are reaping
the benefits of investing in artificial intelligence to develop on-road versions
of hands-free technology. This involves identifying objects, working out
navigational routes and monitoring the environment, all at the same time.
6. Medical Imaging: For the last few decades, computer-supported medical imaging
applications have been a trustworthy help for physicians. They not only create
and analyse images, but also assist doctors with their interpretation. Such
applications are used to read and convert 2D scan images into interactive 3D
models that enable medical professionals to gain a detailed understanding of a
patient’s health condition.
7. Google Translate App: To read signs in a foreign language, all you need to do
is point your phone’s camera at the words and let the Google Translate app tell
you what they mean in your preferred language almost instantly. It uses optical
character recognition to read the image and augmented reality to overlay an
accurate translation, making it a convenient tool that uses Computer Vision.
DA: Mind map the applications of CV

HW: Research and find the Python library that supports CV.
LO: Understand CV Tasks and basics of images.
Computer Vision Tasks:
The various applications of Computer Vision are built on a small number of tasks. Each task extracts
certain information from the input image, which can be used directly for prediction or as the base
for further analysis.
Video: https://www.youtube.com/watch?v=taC5pMCm70U (watch for 5.50 minutes)
The tasks used in a computer vision application are:

• Classification: Image classification is the task of assigning an input image one label from a fixed set
of categories. This is one of the core problems in CV that, despite its simplicity, has a large variety of practical
applications.
• Classification + Localisation: This task involves both identifying what object is present in the image and
identifying where in the image that object is located. It is used only for single objects.
• Object Detection: Object detection is the process of finding instances of real-world objects such as faces,
bicycles, and buildings in images or videos. Object detection algorithms typically use extracted features and
learning algorithms to recognise instances of an object category. It is commonly used in applications such as
image retrieval and automated vehicle parking systems.
• Instance Segmentation: Instance segmentation is the process of detecting instances of objects, giving
each a category, and then labelling each pixel on that basis. A segmentation algorithm takes an
image as input and outputs a collection of regions (or segments).
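To make the difference between the four tasks concrete, here is a minimal sketch of what each task might return for one small image. It uses plain Python only; the labels, box coordinates, the 4x6 mask and the helper name pixels_per_instance are all made up for illustration and do not come from any particular library.

```python
# Hypothetical outputs of the four CV tasks for one 4x6 image.
# Every label, coordinate and id below is invented for illustration.

# Classification: one label for the whole image.
classification = "cat"

# Classification + Localisation: one label plus one bounding box,
# written as (left, top, right, bottom) in pixel coordinates.
localisation = {"label": "cat", "box": (1, 0, 3, 2)}

# Object Detection: a list of labelled boxes, one per object found.
detections = [
    {"label": "cat", "box": (1, 0, 3, 2)},
    {"label": "dog", "box": (3, 2, 5, 3)},
]

# Instance Segmentation: a label for every pixel.
# 0 = background, 1 = first object (cat), 2 = second object (dog).
mask = [
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 2, 2, 2],
    [0, 0, 0, 2, 2, 2],
]


def pixels_per_instance(mask):
    """Count how many pixels carry each instance id in the mask."""
    counts = {}
    for row in mask:
        for instance_id in row:
            counts[instance_id] = counts.get(instance_id, 0) + 1
    return counts


print(pixels_per_instance(mask))
```

Notice how the output grows richer from task to task: a single label, then a label with one box, then several boxes, and finally one value for every pixel.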
Basics of Images:
• Basics of Pixels: The word “pixel” means a picture element. Every photograph, in digital form, is made up
of pixels. They are the smallest unit of information that makes up a picture. Usually round or square, they
are typically arranged in a 2-dimensional grid. In the image below, one portion has been magnified many
times over so that you can see its individual composition in pixels. As you can see, the pixels approximate
the actual image. The more pixels you have, the more closely the image resembles the original.
• Resolution: The number of pixels in an image is sometimes called the resolution. When the term is used
to describe pixel count, one convention is to express resolution as the width by the height, for example a
monitor resolution of 1280×1024. This means there are 1280 pixels from one side to the other, and 1024
from top to bottom. Another convention is to express the number of pixels as a single number, like a
5 megapixel camera (a megapixel is a million pixels). This means the pixels along the width multiplied by
the pixels along the height of the image taken by the camera equals 5 million pixels. In the case of our
1280×1024 monitor, it could also be expressed as 1280 x 1024 = 1,310,720 pixels, or 1.31 megapixels.
• Pixel value: Each pixel of an image stored in a computer has a pixel value
which describes how bright that pixel is and/or what colour it should be. The most common pixel
format is the byte image, where this number is stored as an 8-bit integer, giving a range of possible
values from 0 to 255. Typically, 0 is taken as no colour (black) and 255 is taken as full
colour (white). Why is the maximum value 255? In computer systems, data is stored in the form
of ones and zeros, which we call the binary system. Each bit can be either a
zero or a one. Each pixel of the image uses 1 byte, which is equivalent to 8 bits of data. Since each
bit has two possible values, 8 bits give 2^8 = 256 possible values, which
start at 0 and end at 255.
• Grayscale Images: Grayscale images have a range of shades of gray without apparent colour. The darkest
possible shade is black, which is the total absence of colour, or a pixel value of 0. The lightest possible
shade is white, which is the total presence of colour, or a pixel value of 255. Intermediate shades of gray
are represented by equal brightness levels of the three primary colours. In a grayscale image each pixel is
1 byte in size, and the image is a single plane: a 2D array of pixels. The size of a grayscale image is defined
as the Height x Width of that image.
Let us look at an image to understand grayscale images.
Here is an example of a grayscale image. As you can check, the pixel values are within the range 0 to 255.
Computers store the images we see in the form of these numbers.
• RGB Images: All the images that we see around are coloured images. These images are made up of three
primary colours Red, Green and Blue. All the colours that are present can be made by combining different
intensities of red, green and blue.
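The arithmetic in the points above can be checked with a few lines of Python. This is only a sketch: the 1280×1024 resolution is the one used in the text, while the tiny 3x4 grayscale grid and all variable names are made up for illustration. A plain list of lists stands in for the image; real programs would normally use an image library instead.

```python
# Resolution: width x height gives the total pixel count.
width, height = 1280, 1024
pixel_count = width * height
megapixels = pixel_count / 1_000_000

# Pixel value: 8 bits give 2**8 distinct values, numbered 0 to 255.
bits_per_pixel = 8
num_values = 2 ** bits_per_pixel
max_value = num_values - 1

# Grayscale image: a single 2D grid (height x width), one value per pixel.
# This 3x4 image is invented for illustration.
gray = [
    [0, 64, 128, 255],
    [32, 96, 160, 224],
    [255, 128, 64, 0],
]
gray_height = len(gray)
gray_width = len(gray[0])
darkest = min(min(row) for row in gray)
lightest = max(max(row) for row in gray)

print(pixel_count, round(megapixels, 2))
print(num_values, max_value)
print(gray_height, gray_width, darkest, lightest)
```

The count of possible values is 256, but because counting starts at 0, the largest value a pixel can hold is 255.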
Practice Task
Go to this online link
https://www.w3schools.com/colors/colors_rgb.asp
Using this online tool, try to answer the following questions.
1) What is the output colour when you put R=G=B=255?
2) What is the output colour when you put R=G=B=0?
3) How does the colour vary when you set any one of the three to 0 and keep varying the other two?
4) How does the output colour change when all three colours are varied in the same proportion?
5) What is the RGB value of your favourite colour from the colour palette?
Were you able to answer all the questions? If yes, then you have understood how every colour we see around us is
made.
Now the question arises,
How do computers store RGB images?
Every RGB image is stored in the form of three different channels (also called planes): the R channel, the G channel and the B channel.
Each plane holds one value per pixel, and each value ranges from 0 to 255. The three planes, when
combined, form a colour image. This means that in an RGB image, each pixel has a set of three different values
which together give colour to that particular pixel.
As you can see, each colour image is stored in the form of three different channels, each having a different
intensity. All three channels combine to form the colour we see. If we split the above image
into three different channels, namely Red (R), Green (G) and Blue (B), the individual layers hold the
intensity of that colour at each pixel. When stored in memory, these individual layers look
like the image on the extreme right. Each layer looks like a grayscale image because each pixel has an
intensity value from 0 to 255 and, as studied earlier, 0 is considered black (no presence of colour) and 255
white (full presence of colour). These three individual values, when combined, form the colour
of each pixel. Therefore, each pixel in an RGB image has three values that together form its complete colour.
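Splitting an RGB image into its three channels and joining them again can be sketched in plain Python. The 2x2 image and the helper names split_channels and merge_channels are hypothetical, chosen only for illustration; image libraries provide their own ways of doing this.

```python
# A made-up 2x2 RGB image: each pixel is an (R, G, B) triple, 0 to 255.
rgb_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]


def split_channels(image):
    """Return three 2D grids holding the R, G and B values separately."""
    red = [[pixel[0] for pixel in row] for row in image]
    green = [[pixel[1] for pixel in row] for row in image]
    blue = [[pixel[2] for pixel in row] for row in image]
    return red, green, blue


def merge_channels(red, green, blue):
    """Recombine three channel grids into one grid of (R, G, B) pixels."""
    return [
        [(r, g, b) for r, g, b in zip(r_row, g_row, b_row)]
        for r_row, g_row, b_row in zip(red, green, blue)
    ]


red, green, blue = split_channels(rgb_image)
print(red)
print(merge_channels(red, green, blue) == rgb_image)
```

Each of the three grids has the same shape as a grayscale image, which is why a single channel viewed on its own appears in shades of gray.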
Home Work:
Go to www.piskelapp.com and create your own pixel art.
Try to make a GIF of your pixel art using the online app.
CV - Practical
Try the tasks posted for hands-on practice.
