Mastering OpenCV 3 - Second Edition
About this ebook
- Updated for OpenCV 3, this book covers new features that will help you unlock the full potential of OpenCV 3
- Written by a team of 7 experts, each chapter explores a new aspect of OpenCV to help you make amazing computer-vision aware applications
- Project-based approach with each chapter being a complete tutorial, showing you how to apply OpenCV to solve complete problems
This book is for those who have a basic knowledge of OpenCV and are competent C++ programmers. You need to have an understanding of some of the more theoretical/mathematical concepts, as we move quite quickly throughout the book.
Daniel Lélis Baggio
Daniel Lélis Baggio started his work in computer vision through medical image processing at InCor (Instituto do Coração - Heart Institute) in São Paulo, where he worked with intravascular ultrasound image segmentation. Since then, he has focused on GPGPU and ported the segmentation algorithm to work with NVIDIA's CUDA. He has also dived into six degrees of freedom head tracking with a natural user interface group through a project called ehci (http://code.google.com/p/ehci/). He now works for the Brazilian Air Force.
Title Page
Mastering OpenCV 3
Second Edition
Get hands-on with practical Computer Vision using OpenCV 3
Daniel Lélis Baggio
Shervin Emami
David Millán Escrivá
Khvedchenia Ievgen
Jason Saragih
Roy Shilkrot
BIRMINGHAM - MUMBAI
Copyright
Mastering OpenCV 3
Second Edition
Copyright © 2017 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: December 2012
Second edition: April 2017
Production reference: 1260417
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-78646-717-1
www.packtpub.com
Credits
About the Authors
Daniel Lélis Baggio started his work in computer vision through medical image processing at InCor (Instituto do Coração – Heart Institute) in São Paulo, where he worked with intravascular ultrasound image segmentation. Since then, he has focused on GPGPU and ported the segmentation algorithm to work with NVIDIA's CUDA. He has also dived into six degrees of freedom head tracking with a natural user interface group through a project called ehci (http://code.google.com/p/ehci). He now works for the Brazilian Air Force.
Shervin Emami, born in Iran, taught himself electronics and hobby robotics during his early teens in Australia. While building his first robot at the age of 15, he learned how RAM and CPUs work. He was so amazed by the concept that he soon designed and built a whole Z80 motherboard to control his robot, and wrote all the software purely in binary machine code using two push buttons for 0s and 1s.
After learning that computers can be programmed in much easier ways such as assembly language and even high-level compilers, Shervin became hooked on computer programming and has been programming desktops, robots, and smartphones nearly every day since then. During his late teens, he created Draw3D (http://draw3d.shervinemami.info/), a 3D modeler with 30,000 lines of optimized C and assembly code that rendered 3D graphics faster than all the commercial alternatives of the time, but he lost interest in graphics programming when 3D hardware acceleration became available.
At university, Shervin took a class on computer vision and became greatly interested in it. So, for his first thesis in 2003, he created a real-time face detection program based on Eigenfaces, using OpenCV (beta 3) for the camera input. For his master's thesis in 2005, he created a visual navigation system for several mobile robots using OpenCV (v0.96).
From 2008, he worked as a freelance computer vision developer in Abu Dhabi and the Philippines, using OpenCV for a large number of short-term commercial projects that included:
Detecting faces using Haar or Eigenfaces
Recognizing faces using Neural Networks, EHMM, or Eigenfaces
Detecting the 3D position and orientation of a face from a single photo using AAM and POSIT
Rotating a face in 3D using only a single photo
Face preprocessing and artificial lighting using any 3D direction from a single photo
Gender recognition
Facial expression recognition
Skin detection
Iris detection
Pupil detection
Eye-gaze tracking
Visual-saliency tracking
Histogram matching
Body-size detection
Shirt and bikini detection
Money recognition
Video stabilization
Face recognition on iPhone
Food recognition on iPhone
Marker-based augmented reality on iPhone (the second-fastest iPhone augmented reality app at the time)
OpenCV was putting food on the table for Shervin's family, so he began giving back to OpenCV through regular advice on the forums and by posting free OpenCV tutorials on his website (http://www.shervinemami.info/openCV.html). In 2011, he contacted the owners of other free OpenCV websites to write this book. He also began working on computer vision optimization for mobile devices at NVIDIA, working closely with the official OpenCV developers to produce an optimized version of OpenCV for Android. In 2012, he also joined the Khronos OpenVL committee for standardizing the hardware acceleration of computer vision for mobile devices, on which OpenCV will be based in the future.
David Millán Escrivá was 8 years old when he wrote his first program on an 8086 PC in the BASIC language, which enabled the 2D plotting of basic equations. In 2005, he finished his studies in IT at the Universitat Politécnica de Valencia with honors in human-computer interaction supported by computer vision with OpenCV (v0.96). His final project was based on this subject, and he published it at the Spanish HCI congress. He participated in Blender, an open source 3D software project, and worked on his first commercial movie, Plumíferos: Aventuras voladoras, as a computer graphics software developer.
David now has more than 10 years of experience in IT, including computer vision, computer graphics, and pattern recognition. He has worked on different projects and start-ups, applying his knowledge of computer vision, optical character recognition, and augmented reality. He is the author of the DamilesBlog (http://blog.damiles.com), where he publishes research articles and tutorials about OpenCV, computer vision in general, and optical character recognition algorithms. David has reviewed the book gnuPlot Cookbook, Packt Publishing, written by Lee Phillips.
Khvedchenia Ievgen is a computer vision expert from Ukraine. He started his career with research and development of a camera-based driver assistance system for Harman International. He then began working as a computer vision consultant for ESG. Nowadays, he is a self-employed developer focusing on the development of augmented reality applications. Ievgen is the author of the Computer Vision Talks blog (http://computer-vision-talks.com), where he publishes research articles and tutorials pertaining to computer vision and augmented reality.
Jason Saragih received his BE in mechatronics (with honors) and PhD in computer science from the Australian National University, Canberra, Australia, in 2004 and 2008, respectively. From 2008 to 2010, he was a postdoctoral fellow at the Robotics Institute of Carnegie Mellon University, Pittsburgh, PA. From 2010 to 2012, he worked at the Commonwealth Scientific and Industrial Research Organization (CSIRO) as a research scientist. He is currently a senior research scientist at Visual Features, an Australian tech start-up company.
Dr. Saragih has made a number of contributions to the field of computer vision, specifically on the topic of deformable model registration and modeling. He is the author of two nonprofit open source libraries that are widely used in the scientific community: DeMoLib and FaceTracker, both of which make use of generic computer vision libraries, including OpenCV.
Roy Shilkrot is a researcher and professional in the area of computer vision and computer graphics. He obtained a BSc in computer science from Tel-Aviv-Yaffo Academic College, and an MSc from Tel-Aviv University. He is currently a PhD candidate at the Media Laboratory of the Massachusetts Institute of Technology (MIT) in Cambridge.
Roy has over seven years of experience as a software engineer in start-up companies and enterprises. Before joining the MIT Media Lab as a research assistant, he worked as a technology strategist in the Innovation Laboratory of Comverse, a telecom solutions provider. He also dabbled in consultancy, and worked as an intern for Microsoft Research in Redmond.
About the Reviewer
Vinícius Godoy is a professor at PUCPR and the owner of the game development website Ponto V!. He has a Master's degree in Computer Vision and Image Processing (PUCPR), a specialization degree in game development (Universidade Positivo), and an undergraduate degree in Technology in Informatics - Networking (UFPR). He is also one of the authors of the book OpenCV by Example, Packt Publishing, and is currently working on his doctoral thesis on medical imaging at PUCPR.
He has worked in software development for more than 20 years. His previous professional experience includes the design and programming of a multithreaded framework for PBX tests at Siemens; coordination of the Aurelio Dictionary Software 100 years edition project, including its mobile versions for Android, iOS, and Windows Phone; coordination of an augmented reality educational activity for Positivo's educational table Mesa Alfabeto, presented at CEBIT; and the IT management of a BPMS company called Sinax.
www.PacktPub.com
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com, and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at service@packtpub.com for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www.packtpub.com/mapt
Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.
Why subscribe?
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Customer Feedback
Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1786467178.
If you'd like to join our team of regular reviewers, you can e-mail us at customerreviews@packtpub.com. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!
Table of Contents
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
Cartoonifier and Skin Changer for Raspberry Pi
Accessing the webcam
Main camera processing loop for a desktop app
Generating a black and white sketch
Generating a color painting and a cartoon
Generating an evil mode using edge filters
Generating an alien mode using skin detection
Skin detection algorithm
Showing the user where to put their face
Implementation of the skin color changer
Summary
Exploring Structure from Motion Using OpenCV
Structure from Motion concepts
Estimating the camera motion from a pair of images
Point matching using rich feature descriptors
Finding camera matrices
Choosing the image pair to use first
Reconstructing the scene
Reconstruction from many views
Refinement of the reconstruction
Using the example code
Summary
References
Number Plate Recognition using SVM and Neural Network
Introduction to ANPR
ANPR algorithm
Plate detection
Segmentation
Classification
Plate recognition
OCR segmentation
Feature extraction
OCR classification
Evaluation
Summary
Non-Rigid Face Tracking
Overview
Utilities
Object-oriented design
Data collection - image and video annotation
Training data types
Annotation tool
Pre-annotated data (the MUCT dataset)
Geometrical constraints
Procrustes analysis
Linear shape models
A combined local-global representation
Training and visualization
Facial feature detectors
Correlation-based patch models
Learning discriminative patch models
Generative versus discriminative patch models
Accounting for global geometric transformations
Training and visualization
Face detection and initialization
Face tracking
Face tracker implementation
Training and visualization
Generic versus person-specific models
Summary
References
3D Head Pose Estimation Using AAM and POSIT
Active Appearance Models overview
Active Shape Models
Getting the feel of PCA
Triangulation
Triangle texture warping
Model Instantiation - playing with the AAM
AAM search and fitting
POSIT
Diving into POSIT
POSIT and head model
Tracking from webcam or video file
Summary
References
Face Recognition Using Eigenfaces or Fisherfaces
Introduction to face recognition and face detection
Step 1 - face detection
Implementing face detection using OpenCV
Loading a Haar or LBP detector for object or face detection
Accessing the webcam
Detecting an object using the Haar or LBP Classifier
Grayscale color conversion
Shrinking the camera image
Histogram equalization
Detecting the face
Step 2 - face preprocessing
Eye detection
Eye search regions
Geometrical transformation
Separate histogram equalization for left and right sides
Smoothing
Elliptical mask
Step 3 - Collecting faces and learning from them
Collecting preprocessed faces for training
Training the face recognition system from collected faces
Viewing the learned knowledge
Average face
Eigenvalues, Eigenfaces, and Fisherfaces
Step 4 - face recognition
Face identification - recognizing people from their face
Face verification - validating that it is the claimed person
Finishing touches - saving and loading files
Finishing touches - making a nice and interactive GUI
Drawing the GUI elements
Startup mode
Detection mode
Collection mode
Training mode
Recognition mode
Checking and handling mouse clicks
Summary
References
Preface
Mastering OpenCV 3, Second Edition contains seven chapters, each of which is a tutorial for an entire project from start to finish, based on OpenCV's C++ interface and including the full source code. The author of each chapter was chosen for their well-regarded online contributions to the OpenCV community on that topic, and the book was reviewed by one of the main OpenCV developers. Rather than explaining the basics of OpenCV functions, this book shows how to apply OpenCV to solve whole problems, including several 3D camera projects (augmented reality and 3D Structure from Motion) and several facial analysis projects (such as skin detection, simple face and eye detection, complex facial feature tracking, 3D head orientation estimation, and face recognition). It therefore makes a great companion to the existing OpenCV books.
What this book covers
Chapter 1, Cartoonifier and Skin Changer for Raspberry Pi, contains a complete tutorial and source code for both a desktop application and a Raspberry Pi application that automatically generate a cartoon or painting from a real camera image, with several possible types of cartoons, including a skin color changer.
Chapter 2, Exploring Structure from Motion Using OpenCV, contains an introduction to Structure from Motion (SfM) via an implementation of SfM concepts in OpenCV. The reader will learn how to reconstruct 3D geometry from multiple 2D images and estimate camera positions.
Chapter 3, Number Plate Recognition Using SVM and Neural Networks, includes a complete tutorial and source code to build an automatic number plate recognition application using pattern-recognition algorithms, namely a support vector machine and artificial neural networks. The reader will learn how to train pattern-recognition algorithms and use them to predict whether an image is a number plate or not, and to classify a set of features into a character.
Chapter 4, Non-Rigid Face Tracking, contains a complete tutorial and source code to build a dynamic face tracking system that can model and track the many complex parts of a person's face.
Chapter 5, 3D Head Pose Estimation Using AAM and POSIT, includes all the background required to understand what Active Appearance Models (AAMs) are and how to create them with OpenCV using a set of face frames with different facial expressions. This chapter also explains how to match a given frame using the fitting capabilities offered by AAMs. Then, by applying the POSIT algorithm, one can find the 3D head pose.
Chapter 6, Face Recognition Using Eigenfaces or Fisherfaces, contains a complete tutorial and source code for a real-time face-recognition application that includes basic face and eye detection to handle the rotation of faces and varying lighting conditions in the images.
Chapter 7, Natural Feature Tracking for Augmented Reality, includes a complete tutorial on how to build a marker-based Augmented Reality (AR) application for iPad and iPhone devices with an explanation of each step and source code. It also contains a complete tutorial on how to develop a marker-less augmented reality desktop application with an explanation of what marker-less AR is and the source code.
You can download this chapter from: https://www.packtpub.com/sites/default/files/downloads/NaturalFeatureTrackingforAugmentedReality.pdf.
What you need for this book
You don't need special knowledge of computer vision to read this book, but you should have good C/C++ programming skills and basic experience with OpenCV. Readers without experience in OpenCV may wish to read the book Learning OpenCV for an introduction to the OpenCV features, or OpenCV 2 Cookbook for examples of how to use OpenCV with recommended C/C++ patterns, because this book shows you how to solve real problems, assuming you are already familiar with the basics of OpenCV and C/C++ development.
In addition to C/C++ and OpenCV experience, you will also need a computer and an IDE of your choice (such as Visual Studio, Xcode, Eclipse, or Qt Creator, running on Windows, Mac, or Linux). Some chapters have further requirements, in particular:
To develop an OpenCV program for Raspberry Pi, you will need the Raspberry Pi device, its tools, and basic Raspberry Pi development experience.
To develop an iOS app, you will need an iPhone, iPad, or iPod Touch device, iOS development tools (including an Apple computer, the Xcode IDE, and an Apple Developer Certificate), and basic iOS and Objective-C development experience.
Several desktop projects require a webcam connected to your computer. Any common USB webcam should suffice, but a webcam of at least 1 megapixel may be desirable.
CMake is used in some projects, including OpenCV itself, to build across operating systems and compilers. A basic understanding of build systems is required, and knowledge of cross-platform building is recommended.
An understanding of linear algebra is expected, such as basic vector and matrix operations and eigendecomposition.
Who this book is for
Mastering OpenCV 3, Second Edition is the perfect book for developers with basic OpenCV knowledge who want to create practical computer vision projects, as well as for seasoned OpenCV experts who want to add more computer vision topics to their skill set. It is aimed at senior computer science university students, graduates, researchers, and computer vision experts who wish to solve real problems using the OpenCV C++ interface, through practical step-by-step tutorials.
Conventions
In this book, you will find a number of text styles that distinguish between