
Lecture 1


CPSC 425: Computer Vision

Image Credit: Devi Parikh

Lecture 1: Introduction and Course Logistics

Course logistics
Times: Mon, Wed, Fri 12:00pm-12:50pm

Instructor: Jim Little

E-mail: little@cs.ubc.ca

Course webpage: http://www.cs.ubc.ca/~little/425.html


Discussion: Piazza - enroll through Canvas or the web page link.
On-line Etiquette
Location: Online (Zoom)

Keep your microphones muted, unless you are asking a question

Raise your hand (in Zoom) if you want to ask a question. I will call on you (possibly
not immediately); then you can unmute, ask it, and mute again.

If you don’t have a microphone, you can ask a question in Chat, but I prefer spoken
questions.

About me …
I have been working in Computer Vision for the last ~40 years

Professor: 1988 -
Research Scientist: 1985-1988
PhD, MSc: 1978 - 1985
Research Associate: 1975-1978
Research Analyst: 1972 - 1975

Course logistics
Instructor: Jim Little
E-mail: little@cs.ubc.ca
Office: ICICS 117

TAs:
Rayat Hossain (rayat137@cs.ubc.ca)
Won Bae (whbae@cs.ubc.ca)
Bicheng Xu (bichengx@cs.ubc.ca)
Tanzila Rahman (trahman8@cs.ubc.ca)
Gabriel Huang (Gabrie20@cs.ubc.ca)

Course logistics
Lectures will be on Zoom. Lectures will be recorded and made available on Canvas.
Contact me if you don’t have access to Canvas (little@cs.ubc.ca).

TA and Office hours: Zoom

Course webpage: http://www.cs.ubc.ca/~lsigal/teaching.html


Discussion: piazza.com/ubc.ca/winterterm12018/cpsc425
Course logistics
Use Piazza for any questions related to material and assignments in the course.

If you have a question, I can guarantee you that at least 10 students in the course
have an identical question.

Course logistics
I will use Canvas for assignment submission and grading.

Course logistics
I will use Canvas and the course webpage to distribute assignments (Canvas)
and lecture slides (both).

I will post slides before each lecture, so you can take notes over
them if you wish.

Course logistics
Lectures (Live: Zoom; Recorded: Canvas; Slides: Canvas & Web Page)
Office and TA hours (Zoom)
Assignments (Instructions: Web Page & Canvas; Hand-in: Canvas)
Assigned Readings (Web Page)
Schedule (Web Page)
Questions & Assignment Support (Piazza)

Topics Covered

– Image Processing (Linear Filtering, Convolution)


– Filters as Templates
– Image Feature Detection (Edges & Corners)
– Texture & Colour
– Image Feature Description (SIFT)
– Model Fitting (RANSAC, The Hough Transform)
– Camera Models, Stereo Geometry
– Motion and Optical Flow
– Clustering and Image Segmentation
– Learning and Image Classification
– Deep Learning Introduction

Course Origins
CPSC 425 was originally developed by David Lowe, then Bob Woodham, then Fred
Tung and Leon Sigal, and has evolved over the years.
Previously taught by:
— 2021-2022 Term 1 by Jim Little
— 2020-2021 Term 2 by Jim Little
— 2019-2020 Term 2 by Leon Sigal
— 2019-2020 Term 1 by Jim Little
— 2018-2019 Term 2 by Leon Sigal
— 2018-2019 Term 1&2 by Leon Sigal
— 2017-2018 Term 2 by Leon Sigal
— 2016-2017 Term 2 by Jim Little
— 2015-2016 Term 2 by Fred Tung
— 2014-2015 Term 2 by Jim Little

Course Origins

The course is a very broad, but relatively shallow introduction to a very diverse
and complex field that draws material from geometry, statistics, AI, machine
learning, computer graphics, psychology and many others.
— This means we will cover many topics and different algorithms.
— I will give you as much background and connective tissue as I can
… but, there is no “linear” way to learn the material we will cover
… I will not be able to go into depth on some of the topics

How to do Well in the Course?

— It is easy to think that the material is easy and the course requires no studying

— Part of your job should be going over the slides and carefully analyzing not just
what is on them, but the underlying assumptions, algorithmic steps and so on
— Don’t strive for “template matching”; strive for true “understanding”

— Some topics we will cover are theoretic and fundamental (e.g., geometry)
— Others are algorithmic (i.e., you make certain assumptions about the world, these
assumptions may not always hold, but will be useful in building algorithms that
ultimately perform well on a prescribed task)
— Computer vision is more of an experimental science - ultimately we are looking at
performance to determine whether our algorithmic choices are successful.
Grading Criteria

Online Quizzes: 10%

Programming Assignments: 45%

6 graded and 1 ungraded (optional) assignment

Midterm Exam (TBD): 15%

Final Exam (TBD): 30%
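
The weights above combine into a final score as a weighted sum. The following is a minimal sketch of that arithmetic, not an official grade calculator: the function name and the input format (percentages from 0 to 100) are assumptions made for illustration, and the six graded assignments are assumed to count equally.

```python
# Weights taken from the Grading Criteria slide (percent of the final score).
WEIGHTS = {"quizzes": 10.0, "assignments": 45.0, "midterm": 15.0, "final": 30.0}

# Six graded assignments, assumed to share the 45% equally.
NUM_GRADED_ASSIGNMENTS = 6
PER_ASSIGNMENT_WEIGHT = WEIGHTS["assignments"] / NUM_GRADED_ASSIGNMENTS


def final_score(quizzes, assignments, midterm, final):
    """Return the weighted course score in percent.

    Every input is a percentage in [0, 100]. `assignments` is a list with
    one percentage per graded assignment (hypothetical input format).
    """
    if len(assignments) != NUM_GRADED_ASSIGNMENTS:
        raise ValueError("expected one score per graded assignment")
    components = {
        "quizzes": quizzes,
        "assignments": sum(assignments) / len(assignments),
        "midterm": midterm,
        "final": final,
    }
    # Each component contributes (weight x score / 100) percentage points.
    return sum(WEIGHTS[name] * components[name] / 100.0 for name in WEIGHTS)


if __name__ == "__main__":
    print(PER_ASSIGNMENT_WEIGHT)
    print(final_score(80, [90] * 6, 70, 60))
```

Under these assumptions each graded assignment is worth 45 / 6 = 7.5 percentage points of the final score.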


You do NOT need to pass the final to pass the course

Quizzes

Will be made available on Canvas for a 24 hour window

The number of quizzes has not been determined, and each quiz may have a
different number of questions / points.

Quizzes are designed to get you to think more deeply about what we are
covering and to keep you on track with the material.

Assignments
Due dates (tentative) are already posted (so you can plan ahead)
There will be 7 assignments in total (6 marked)
— Approximately 1 every 2 weeks (two are 1.5 weeks)
— You will hand these in by 11:59pm on the due date (read hand-in instructions and late
policy on course webpage)

You will use Python with the following libraries: Python Imaging Library (PIL),
NumPy, Matplotlib, SciPy, Scikit-Learn

— Assignment 0 (which is ungraded) will introduce you to this.
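
Before starting Assignment 0 it can help to confirm that the listed libraries are installed. The sketch below is one possible check, not part of the course materials: the mapping from library name to import name is an assumption about a typical installation (PIL is usually provided by the Pillow package, Scikit-Learn is imported as `sklearn`), and `check_environment` is a hypothetical helper name.

```python
import importlib.util

# Library name (as listed on the slide) -> assumed Python import name.
REQUIRED_MODULES = {
    "Python Imaging Library (PIL)": "PIL",
    "NumPy": "numpy",
    "Matplotlib": "matplotlib",
    "SciPy": "scipy",
    "Scikit-Learn": "sklearn",
}


def check_environment():
    """Return {library name: bool} saying whether each module can be found.

    find_spec only locates the module; it does not import it, so the check
    is fast and has no side effects.
    """
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in REQUIRED_MODULES.items()
    }


if __name__ == "__main__":
    for name, available in check_environment().items():
        print(f"{name}: {'found' if available else 'MISSING'}")
```

Any library reported as missing would need to be installed before the assignments can be run.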

Assignments contribute 45% to your final score (equally distributed)


Midterm Exam

[ Tentatively ] on TBD
— Online, during the lecture period
— Closed book, no notes allowed

Multiple choice, true / false and short answer questions


— Aimed to test your “understanding” of the content of the course

The Midterm exam will contribute 15% to your final score

Final Exam

The Final exam is held during the regular examination period and is scheduled by the
Registrar’s Office

Similar to the midterm but longer and with more extensive short/medium answer
questions

The Final exam will contribute 30% to your final score

Textbooks
The course uses the following textbook, which is recommended (but not required):

Computer Vision: A Modern Approach (2nd edition)
By: D. Forsyth & J. Ponce
Publisher: Pearson
Pub. Date: 2012

Computer Vision: Algorithms and Applications
By: R. Szeliski
Publisher: Springer
Pub. Date: 2010
Can be freely downloaded as a PDF from SpringerLink,
through UBC Library Website (must login using CWL).

Readings

You will be assigned readings.


— Sometimes you will be assigned readings from other sources

Skim the reading before coming to the lecture – read in detail after.
— Reading assignments will be posted on the course webpage
— They will also be mentioned in class

Computer Vision!
How important is Vision?

To answer this question, we need to go back to about

… 543 million years B.C.

Vision is really fundamental to life and evolution

What is Computer Vision?
Computer vision, broadly speaking, is a research field that aims to enable computers to
process and interpret visual data, as sighted humans can.

Image Credit: https://www.deviantart.com/infinitecreations/art/BioMech-Eye-168367549

What do you see?

Slide Credit: Jitendra Malik (UC Berkeley)

What would we like a computer to infer?
Will person B put some money into person C’s cup?

Slide Credit: Jitendra Malik (UC Berkeley)

What is Computer Vision?

[Figure labels: Sensing Device; Interpreting Device; Image (or video); Interpretation: blue sky, trees, fountains, UBC, …]

Image Credit: https://www.flickr.com/photos/flamephoenix1991/8376271918

Computer vision … the beginning …

“spend the summer linking a camera to a computer and getting the computer
to describe what it saw”

- Marvin Minsky (1966), MIT
Turing Award (1969)

… >50 years later

Slide Credit: Devi Parikh (GA Tech)

Computer vision … the beginning …

Gerald Sussman, MIT

“You’ll notice that Sussman never worked in vision again!” – Berthold Horn

Slide Credit: Devi Parikh (GA Tech)

Can computers match (or beat) human vision?

• We’ve been at it for 50 years
• How good is human vision?
• Yes and No (mostly NO)
• The shading example shows that human vision does not “see” what’s
in the image. Rather, we see the scene and “explain” what is in the
image using prior knowledge and experience.

Computer Vision Problems
1. Computing properties of the 3D world from visual data (measurement)

Slide Credit: Kristen Grauman (UT Austin)

1. Vision for Measurement

[Figure panels: Real-time stereo; Structure from motion; Tracking. Image credits: NASA Mars Rover; Snavely et al.; Demirdjian et al.; Wang et al.]

Slide Credit: Kristen Grauman (UT Austin)

Computer Vision Problems
1. Computing properties of the 3D world from visual data (measurement)

Ill-posed problem: the real world is much more complex than what we can measure in images: 3D -> 2D

It is (literally) impossible to invert the image formation process
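
One way to see why the 3D -> 2D mapping cannot be inverted is that depth is lost in projection. The sketch below assumes an ideal pinhole camera with focal length f, which maps (X, Y, Z) to (f·X/Z, f·Y/Z). The camera model, the function name and the example points are illustrative assumptions, not taken from the slide.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (X, Y, Z) onto the image plane of an ideal pinhole camera.

    Assumes the camera sits at the origin looking down the +Z axis.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * X / Z, focal_length * Y / Z)


# Two different 3D points on the same viewing ray through the camera centre.
near = (1.0, 2.0, 4.0)
far = tuple(2.5 * coordinate for coordinate in near)  # 2.5x farther away

if __name__ == "__main__":
    print(project(near))
    print(project(far))
```

Both points land on the same image location, so a single image cannot tell them apart; recovering 3D structure requires extra assumptions or additional views.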

Slide Credit: Kristen Grauman (UT Austin)

Computer Vision Problems
1. Computing properties of the 3D world from visual data (measurement)

2. Algorithms and representations to allow a machine to recognize objects,
people, scenes, and activities (perception and interpretation)

Slide Credit: Kristen Grauman (UT Austin)

2. Vision for Perception and Interpretation

[Figure: annotated amusement park photo. Image labels: sky, amusement park, Cedar Point, The Wicked Twister, ride, Ferris wheel, 12 E, Lake Erie, water, tree, people waiting in line, people sitting on ride, umbrellas, maxair, carousel, deck, bench, pedestrians. Label categories: Objects, Activities, Scenes, Locations, Text / writing, Faces, Gestures, Motions, Emotions.]

Slide Credit: Kristen Grauman (UT Austin)

Computer Vision Problems
1. Computing properties of the 3D world from visual data (measurement)

2. Algorithms and representations to allow a machine to recognize objects,
people, scenes, and activities (perception and interpretation)

It is computationally intensive / expensive

Slide Credit: Kristen Grauman (UT Austin)

2. Vision for Perception and Interpretation
~55% of the cerebral cortex in humans (13 billion neurons) is devoted to vision.
More of the human brain is devoted to vision than to anything else.

Computer Vision Problems
1. Computing properties of the 3D world from visual data (measurement)

2. Algorithms and representations to allow a machine to recognize objects,
people, scenes, and activities (perception and interpretation)

It is computationally intensive / expensive

We do not (fully) understand the processing mechanisms involved

Slide Credit: Kristen Grauman (UT Austin)

Computer Vision Problems
1. Computing properties of the 3D world from visual data (measurement)

2. Algorithms and representations to allow a machine to recognize objects,
people, scenes, and activities (perception and interpretation)

3. Algorithms to mine, search, and interact with visual data (search and
organization)

Slide Credit: Kristen Grauman (UT Austin)

3. Search and Organization

[Figure labels: Query; Image or video archives; Relevant content]

Slide Credit: Kristen Grauman (UT Austin)

Computer Vision Problems
1. Computing properties of the 3D world from visual data (measurement)

2. Algorithms and representations to allow a machine to recognize objects,
people, scenes, and activities (perception and interpretation)

3. Algorithms to mine, search, and interact with visual data (search and
organization)

Scale is enormous, explosion of visual content

Slide Credit: Kristen Grauman (UT Austin)

3. Search and Organization

Snapchat: 31.7 Million / hour
WhatsApp: 29.2 Million / hour
Facebook: 14.6 Million / hour
Instagram: 2.9 Million / hour
Flickr: 0.2 Million / hour
[unlabeled logo]: 18K hours / hour

> 85% of all web content is multimedia content of visual form

*from iStock by GettyImages
*based on article by Kimberlee Morrison in Social Times (2015)

Computer Vision Problems
1. Computing properties of the 3D world from visual data (measurement)

2. Algorithms and representations to allow a machine to recognize objects,
people, scenes, and activities (perception and interpretation)

3. Algorithms to mine, search, and interact with visual data (search and
organization)

4. Algorithms for manipulation or creation of image or video content
(visual imagination)

Slide Credit: Kristen Grauman (UT Austin)

4. Visual Imagination

He et al. ECCV 2018

Zhao et al. ECCV 2018

Can computers match (or beat) human vision?

• Yes and No (mostly NO)

• Let’s see some examples of state-of-the-art and where it is used

Face Detection
Technology available in any digital camera now
(one of the first big commercial successes of vision algorithms)

Smile Detection

Sony Cyber-shot® T70 Digital Still Camera

Face Recognition

Apple’s iPhoto

Facebook

http://www.apple.com/ilife/iphoto/

Slide Credit: Devi Parikh (GA Tech) and Fei-Fei Li (Stanford)


Vision for Biometrics

“How the Afghan Girl was Identified by Her Iris Patterns” (read the story; Wikipedia)

Slide Credit: James Hays (GA Tech)

Vision for Biometrics
iPhone X Face ID

Face recognition systems are now part of widely used technologies

How it works and how to fool it:
https://www.youtube.com/watch?v=FhbMLmsCax0

Fingerprint scanners on many new laptops, other devices

Image Credit: James Hays (GA Tech)

Object Recognition (in supermarkets)

https://www.youtube.com/watch?v=NrmMk1Myrxc
Object Recognition (in mobile devices)

Nokia’s Point & Find

https://www.youtube.com/watch?v=8SdwVCUJ0QE

https://en.wikipedia.org/wiki/Nokia_Point_&_Find
3D Urban Modeling and Virtual Tourism

[ Agarwal, Furukawa, Snavely, Curless, Seitz, Szeliski, 2010 ]


Visual Special Effects (VFX): Shape and Motion Capture

Slide Credit: Stephen Seitz (University of Washington)


Vision in Sports

Sportvision first down line


Nice explanation on www.howstuffworks.com
http://www.sportvision.com/video.html
Slide Credit: Stephen Seitz (University of Washington)
Automotive Safety and Smart Cars

Slide Credit: Amnon Shashua


Interactive Games: Kinect

Vision for Robotics, Space Exploration

NASA's Mars Exploration Rover Spirit captured this westward view from
atop a low plateau where Spirit spent the closing months of 2007.
Vision systems (JPL) used for several tasks
• Panorama stitching
• 3D terrain modeling
• Obstacle detection, position tracking
• For more, read “Computer Vision on Mars” by Matthies et al.

Vision for Medical Imaging

3D imaging: MRI, CT
Image guided surgery: Grimson et al., MIT

Slide Credit: James Hays (GA Tech)

Captioning and Visual Question Answering

[ Vinyals et al., 2015 ]

Demo: http://vqa.cloudcv.org
Demo: http://demo.visualdialog.org
[ Seo et al., NIPS 2017 ]

Related Disciplines

[Figure: Computer Vision (scope of CPSC 425) shown among related fields: Artificial Intelligence (AI), Robotics, Machine Learning, Human Computer Interaction, Graphics, Image Processing, Medical Imaging, Computational Photography, Neuroscience, Optics. Other labels in the diagram: Geometric Reasoning, Recognition.]

Slide Credit: James Hays (GA Tech)

Prepare for the Next Lecture

Readings:
— Next Lecture: Forsyth & Ponce (2nd ed.) 1.1.1 — 1.1.3
(optional – Secs. 1.1 and 1.2 from Szeliski)
Reminders:
— Start working on Assignment 0 (ungraded) "due" Wed. Jan 19
— Assignment 1 out Jan 17, due Jan 28 (tentative)
— [optional] Watch TED talk by Prof. Fei-Fei Li (18 minutes)
https://www.youtube.com/watch?v=40riCqvRoMs

STOP HERE
Related Disciplines: Vision and Graphics

[Figure: Vision maps Images to a Model; Graphics maps a Model to Images]

Inverse problems: analysis and synthesis

(it is sometimes useful to think about computer vision as inverse graphics)

Slide Credit: Kristen Grauman (UT Austin)

Why Study Computer Vision?

It is one of the most exciting areas of research in computer science

Among the fastest growing technologies in the industry today

Wired’s 100 Most Influential People in the World

CVPR Attendance
