Week5_Computer_Vision
of Artificial Intelligence
Human Vision System
Robot Vision System
Image Formation
Simple Image Feature
Edge
Optical Flow: Whenever there is relative movement between the camera and one or more
objects in the scene, the resulting apparent motion in the image is called optical flow.
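To make the idea of apparent motion concrete, here is a minimal sketch that recovers a single global shift between two tiny frames by brute-force block matching. Block matching is only one simple way to estimate motion and is swapped in here for illustration; the slides do not name a specific method. The function `estimate_shift`, the `max_shift` parameter, and both frames are hypothetical.

```python
def estimate_shift(frame_a, frame_b, max_shift=2):
    """Return the (dy, dx) shift that best maps frame_a onto frame_b.

    Brute-force search that minimises the mean squared difference over
    the overlapping region. A toy stand-in for a real optical-flow
    estimator, which would return one motion vector per pixel.
    """
    height, width = len(frame_a), len(frame_a[0])
    best_shift, best_error = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            error, count = 0.0, 0
            for y in range(height):
                for x in range(width):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        error += (frame_a[y][x] - frame_b[ny][nx]) ** 2
                        count += 1
            if count and error / count < best_error:
                best_error, best_shift = error / count, (dy, dx)
    return best_shift


# Hypothetical frames: a bright 2x2 blob that moves one pixel to the right.
frame_a = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
frame_b = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

shift = estimate_shift(frame_a, frame_b)  # (dy, dx) = (0, 1)
```

The recovered shift describes the motion in the image only; whether the camera or the object moved cannot be told from the two frames alone.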
Classifying Images
Detecting Objects
The 3D World
Binocular stereopsis
Using Computer Vision
Visual question-answering
Making pictures
Navigation
Image Analysis
§ Overview of Image Analysis
§ Collecting and Representing Image
§ Image Recognition
§ Bag-of-Visual-Words model
§ Deep Convolutional Neural Networks
Overview of Image Analysis
§ Image analysis
§ Refers to the representation, processing, and modelling of visual data to
derive useful insights
§ Suffers from the semantic gap
§ Visual data (image, video, …) is unstructured
§ Semantic gap
§ The gap between the high-level concepts used by humans and the low-level
features used by computers
Overview of Image Analysis
§ Image recognition (in a narrow sense)
§ Image classification
§ Object detection, localisation, tracking
§ Scene segmentation and reconstruction
§ Image search and retrieval
Overview of Image Analysis
§ Image classification
http://twd20g.blogspot.com.au/2011/12/this-work-presents-novel-system-that.html
https://www.3dflow.net/elementsCV/S4.xhtml
Image Analysis Steps
§ Collection and labelling
§ Collect representative images for a given task and label them with the
ground truth
§ Image representation
§ Select and/or design appropriate image representations (invariant and
discriminative)
§ Image analysis techniques
§ Apply and/or design appropriate analysis techniques for the given tasks
(classification, detection, tracking, segmentation, etc.)
Representing Image
§ Why is representing images difficult?
§ Scale, rotation, illumination, occlusion, background clutter, deformation, …
§ Invariant and Discriminative representation
Cat:
Representing Image
§ Traditional representation (before year 2000)
§ Hand-crafted, global features
§ Intensity, colour, texture, shape, structure, etc.
http://www.robots.ox.ac.uk/~vgg/software/
Image courtesy of David Lowe, IJCV04
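As an illustration of a hand-crafted global feature, the sketch below computes a normalised intensity histogram and checks two properties the slides ask of a representation: it is invariant to a rotation of the image, yet still discriminates between images of different brightness. The helper names `intensity_histogram` and `rotate_90`, the bin count, and the tiny images are all assumptions made for this example.

```python
def intensity_histogram(image, bins=4, max_value=256):
    """Normalised histogram of pixel intensities (a global feature)."""
    counts = [0] * bins
    for row in image:
        for pixel in row:
            counts[pixel * bins // max_value] += 1
    total = sum(counts)
    return [count / total for count in counts]


def rotate_90(image):
    """Rotate a 2-D list image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


# Hypothetical 3x3 greyscale image with 8-bit intensities.
image = [
    [0, 64, 128],
    [192, 255, 0],
    [10, 70, 200],
]
darker = [[pixel // 4 for pixel in row] for row in image]

original_feature = intensity_histogram(image)
rotated_feature = intensity_histogram(rotate_90(image))
darker_feature = intensity_histogram(darker)
```

Because the histogram discards pixel positions, `original_feature` and `rotated_feature` are identical, while `darker_feature` differs. The same property is a weakness: two images with different spatial layouts but the same intensity distribution are indistinguishable.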
Deep Learning Model
Convolutional Neural Networks (CNNs)
§ A special multi-stage architecture inspired by the visual system
§ Higher stages compute more global, more invariant features
Deep Learning Model
https://www.datasciencecentral.com/lenet-5-a-classic-cnn-architecture/
Convolution
Filter
§ The stride is 1.
§ The height and width are changed as:
$W_{\text{out}} = \dfrac{W_{\text{in}} - W_{\text{filter}}}{\text{Stride}} + 1 = (5 - 3)/1 + 1 = 3$
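The output-size relation above can be checked with a few lines of code. This is a minimal sketch assuming no padding; the function name `conv_output_size` is hypothetical, and the use of floor division when the stride does not divide evenly is an assumption, not something the slide specifies.

```python
def conv_output_size(input_size, filter_size, stride=1):
    """Output height or width of a convolution without padding."""
    if filter_size > input_size:
        raise ValueError("filter must not be larger than the input")
    # Floor division is an assumption for strides that do not divide evenly.
    return (input_size - filter_size) // stride + 1


slide_example = conv_output_size(5, 3, stride=1)  # (5 - 3) / 1 + 1 = 3
```

The same function applies independently to the height and the width, so a 5×5 input with a 3×3 filter and stride 1 gives a 3×3 output.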
Convolution
Kernel 1
…
Kernel N
Feature map 1
…
Feature map N
$H \times W \times C_{\text{in}}$ (input), $H \times W \times C_{\text{out}}$ (feature maps)
Convolutional Neural Networks
§ Multi-stage Architecture
Convolution
Non-linearity
Pooling
Convolutional Neural Networks
Convolution
- A set of filters is convolved with the input
- Weights are shared across the input space (translation equivariance)
Input
Filters
Feature Map
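A minimal sketch of the input, filters, and feature map relationship follows. It slides each filter over the input without padding and reuses the same weights at every position, which is the weight sharing mentioned above. Like most deep-learning libraries, it computes cross-correlation (the kernel is not flipped). The function `conv2d_valid`, the two edge filters, and the 5×5 input are assumptions chosen for illustration.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide one kernel over a 2-D image without padding."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            # The same kernel weights are reused at every (i, j) position.
            for u in range(k_h):
                for v in range(k_w):
                    total += image[i * stride + u][j * stride + v] * kernel[u][v]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 5x5 input containing a vertical step edge.
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Two hypothetical filters: one responds to vertical edges, one to horizontal.
filters = [
    [[1, 0, -1], [1, 0, -1], [1, 0, -1]],
    [[1, 1, 1], [0, 0, 0], [-1, -1, -1]],
]

# One feature map per filter, each of size 3x3.
feature_maps = [conv2d_valid(image, kernel) for kernel in filters]
```

The first feature map responds where the vertical edge lies and the second is zero everywhere, since the input has no horizontal edge. In a CNN the filter weights are learned rather than fixed by hand as they are here.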
Convolutional Neural Networks
Non-linearity
Spatial pooling
§ Non-overlapping / overlapping regions
§ Max or sum
§ Invariance to small transformations
Max pooling
Sum/Average pooling
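The pooling variants listed above can be sketched in one small function. The name `pool2d`, its parameters, and the 4×4 feature map are assumptions for illustration; setting `stride` equal to `size` gives non-overlapping regions, and a smaller stride gives overlapping ones.

```python
def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Pool a 2-D feature map over size x size windows."""
    output = []
    for i in range(0, len(feature_map) - size + 1, stride):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, stride):
            window = [
                feature_map[i + u][j + v]
                for u in range(size)
                for v in range(size)
            ]
            if mode == "max":
                row.append(max(window))
            elif mode == "sum":
                row.append(sum(window))
            elif mode == "average":
                row.append(sum(window) / len(window))
            else:
                raise ValueError("mode must be 'max', 'sum' or 'average'")
        output.append(row)
    return output


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 2],
    [2, 0, 3, 4],
]

max_pooled = pool2d(feature_map, mode="max")          # [[4, 2], [2, 5]]
sum_pooled = pool2d(feature_map, mode="sum")          # [[10, 4], [3, 14]]
avg_pooled = pool2d(feature_map, mode="average")      # [[2.5, 1.0], [0.75, 3.5]]
overlapping = pool2d(feature_map, size=2, stride=1)   # 3x3 output

# Moving the strongest response by one pixel inside its window
# leaves the max-pooled output unchanged.
shifted = [row[:] for row in feature_map]
shifted[0][0], shifted[1][0] = shifted[1][0], shifted[0][0]
```

Here `pool2d(shifted)` equals `max_pooled`, which is the invariance to small transformations noted in the list: the output records that a strong response occurred in a region, not exactly where.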
Deep Learning Model
CNNs: ImageNet Breakthrough
Object detection (Source: Rich feature hierarchies for accurate object detection and semantic
segmentation, CVPR 2014)
Face Recognition (Source: DeepFace: Closing the Gap to Human-Level Performance in Face Verification,
CVPR 2014)
Deep Learning Model
Fine-tune