How to Use Google Colaboratory for Video Processing
How to use this Google service and the free NVIDIA Tesla K80 GPU to achieve your own goals in training neural networks
Introduction
Did you know that a set of computer algorithms can process a video stream in a way that allows them to detect criminal activity,
control traffic jams, and even automatically detect events in sports broadcasts? Thanks to the application of machine learning (ML),
the idea of acquiring so much data from a simple video doesn’t seem that unrealistic. In this article, we want to share our experience
applying the pre-built logic of a machine learning algorithm for object detection and segmentation to a video.
In particular, we talk about how to configure Google Colaboratory for solving video processing tasks with machine learning. You’ll
learn how to use this Google service and the free NVIDIA Tesla K80 GPU that it provides for achieving your own goals in training
neural networks. This article will be useful for people who are getting familiar with machine learning and considering working with
image recognition and video processing.
Contents:
Image processing with limited hardware resources
Mask_RCNN sample
Conclusion
Image processing with limited hardware resources
From a technical point of view, any video recording consists of a series of still images in a particular format that is compressed with
a video codec. Consequently, object recognition on a video stream comes down to splitting the stream into separate images, or
frames, and applying a pre-trained ML image recognition algorithm to them.
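To make that idea concrete, here's a minimal sketch (our illustration, not code from the repository), assuming a hypothetical input.avi file:

import cv2

# Open the video and pull it apart frame by frame;
# each frame is an ordinary still image (a NumPy array)
vs = cv2.VideoCapture("input.avi")
while True:
    grabbed, frame = vs.read()
    if not grabbed:
        break  # end of the stream
    # a pre-trained image recognition model can be applied to `frame` here
vs.release()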
To do this, we decided to use a neural network from the Mask_R-CNN repository for classifying single images. The repository
contains an implementation of a convolutional neural network in Python 3 using TensorFlow and Keras. Let's see what came out of this
plan.
Mask_RCNN sample
We developed and implemented a simple sample of Mask_RCNN that received a picture as the input and recognized objects in it.
We created a sample on the basis of the demo.ipynb description taken from the Mask_R-CNN repository. Here’s the code of our
sample:
class InferenceConfig(coco.CocoConfig):
    # Set batch size to 1 since we'll be running inference on
    # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

config = InferenceConfig()
config.display()
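The page conversion swallowed the setup code around this class. Based on the demo.ipynb our sample follows, the missing part looks roughly like this; the paths and the truncated class_names list are assumptions from the demo, so adjust them to your setup:

import os
import sys
import random
import skimage.io

# Path to the Mask_R-CNN repository (ours; adjust to your setup)
ROOT_DIR = "/content/drive/My Drive/Colab Notebooks/MRCNN_pure"
sys.path.append(ROOT_DIR)
sys.path.append(os.path.join(ROOT_DIR, "samples/coco/"))

import mrcnn.model as modellib
from mrcnn import utils, visualize
import coco

MODEL_DIR = os.path.join(ROOT_DIR, "logs")
COCO_MODEL_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")
IMAGE_DIR = os.path.join(ROOT_DIR, "images")

# Download the COCO trained weights if they aren't present yet
if not os.path.exists(COCO_MODEL_PATH):
    utils.download_trained_weights(COCO_MODEL_PATH)

# Create the model in inference mode and load the pre-trained weights
model = modellib.MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)
model.load_weights(COCO_MODEL_PATH, by_name=True)

# class_names is the list of the 81 COCO class labels from demo.ipynb
class_names = ['BG', 'person', 'bicycle', 'car']  # truncated here for brevity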
# Run detection
results = model.detect([image], verbose=1)

# Visualize results
r = results[0]
visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'],
                            class_names, r['scores'])
In this sample, /content/drive/My Drive/Colab Notebooks/MRCNN_pure is the path to our repository with Mask_R-CNN. As a result,
we got the following:
This part of the demo code looks through the images folder, randomly selects an image, and loads it into our neural network model
for classification:

# Load a random image from the images folder
file_names = next(os.walk(IMAGE_DIR))[2]
image = skimage.io.imread(os.path.join(IMAGE_DIR, random.choice(file_names)))

# Run detection
results = model.detect([image], verbose=1)
Let's modify the Mask_R-CNN sample to make it recognize all images in the images folder:

# Iterate over every file in the images folder instead of a random one
for file_name in next(os.walk(IMAGE_DIR))[2]:
    image = skimage.io.imread(os.path.join(IMAGE_DIR, file_name))

    # Run detection
    results = model.detect([image], verbose=1)

    # Visualize results
    r = results[0]
    visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'],
                                class_names, r['scores'])
After running the demo code for five minutes, the console displayed the following output:
...
Processing 1 images
image shape: (415, 640, 3) min: 0.00000 max: 255.00000 uint8
molded_images shape: (1, 1024, 1024, 3) min: -123.70000 max: 151.10000 float64
image_metas shape: (1, 93) min: 0.00000 max: 1024.00000 float64
anchors shape: (1, 261888, 4) min: -0.35390 max: 1.29134 float32
Segmentation fault (core dumped)
Initially, we ran the demo code on a computer with an Intel Core i5 and 8GB of RAM without a discrete graphics card. The code
crashed each time in different places, but most often it crashed in the TensorFlow framework during memory allocation. Moreover,
any attempts to run any other software during the image recognition process slowed down the computer to the point of being
useless.
Thus, we faced a serious problem: any experiments in getting familiar with ML required a powerful graphics card and more
hardware resources. Without them, we couldn't run any other tasks while recognizing a large number of images.
This is where Google Colaboratory, a free cloud service that provides an NVIDIA Tesla K80 GPU, comes in. Before explaining how to work with this Google service, we'd like to highlight its other beneficial features:
support for Python 2.7 and Python 3.6 so you can improve your coding skills;
the ability to work with Jupyter notebook so you can create, edit, and share your .ipynb files;
the ability to connect to a Jupyter runtime using your local machine;
many pre-installed libraries including TensorFlow, Keras, and OpenCV, as well as the possibility to interact with your custom
libraries in Google Colaboratory;
upload functionality so you can add your trained model;
integration with GitHub so you can load public GitHub notebooks or save a copy of your Colab file to GitHub;
simple visualization with such popular libraries as matplotlib;
forms that can be used to parameterize code (see the small example after this list);
the ability to store Google Colab notebooks in your Google Drive.
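As a quick illustration of the forms feature (our example, not from the article): a special comment annotation next to an assignment renders as an editable widget in the notebook UI:

# A Colab "form": the #@param annotation renders as an editable widget
# in the notebook, and the variable picks up the chosen value
threshold = 0.7  #@param {type:"slider", min:0, max:1, step:0.05}
print("Using detection threshold:", threshold)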
In order to start using the Google Colab GPU, you just need to open your .ipynb script in Colaboratory, where it runs inside a Docker
container. The Docker container is assigned to you for only 12 hours. All scripts you create are stored by default in your Google
Drive in the Colab Notebooks section, which is created automatically when you first connect to Colaboratory. After those 12 hours
expire, all of your data in the container will be deleted. You can avoid this by mounting your Google Drive in the container and working
with it. Otherwise, the file system of the Docker image will be available only for a limited period of time.
Rename your notebook to whatever you want by clicking on the file name. Now you need to choose your hardware. To do this, just go
to the Edit menu, select Notebook settings, choose GPU as the hardware accelerator, and save the changes by clicking SAVE.
After the new settings are saved, a Docker container with a discrete graphics card will become available to you. You’ll be notified
about this with a “Connected” message in the upper right of the page:
Now you can mount your Google Drive to this container in order to relocate the source code and save the results of your work in
the container. To do this, just copy the code below into the first cell of your notebook and press the Play button (or Shift + Enter).
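The snippet itself was lost in the page conversion; it's the standard google.colab Drive-mounting call:

from google.colab import drive

# Mount Google Drive; Colab will ask you to open an authorization link
drive.mount('/content/drive')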
You’ll get a request for authorization. Click the link, authorize, copy the verification code, paste it in the text box in your .ipynb script,
and press Enter. If authorization is successful, your Google Drive will be mounted at the path /content/drive/My Drive. To browse
the file tree, select Files in the left-hand menu.
Now you have a Docker container with the Tesla K80 GPU, your Google Drive as file storage, and the .ipynb notebook for script
execution.
Then we add our sample code to the .ipynb script. When you do this, don’t forget to change your path to the Mask_RCNN folder
like this:
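The exact line didn't survive the page conversion, but given the repository location mentioned earlier, it presumably looks like this:

# Point the root directory at the Mask_RCNN repository on the mounted Drive
ROOT_DIR = "/content/drive/My Drive/Colab Notebooks/MRCNN_pure"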
If you do everything right, the results of code execution will provide you with an image where all objects are detected and
recognized.
You can also modify the sample code to make it process all of the test images:
class InferenceConfig(coco.CocoConfig):
    # Set batch size to 1 since we'll be running inference on
    # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

config = InferenceConfig()
config.display()

# Process every image in the images folder
for file_name in next(os.walk(IMAGE_DIR))[2]:
    image = skimage.io.imread(os.path.join(IMAGE_DIR, file_name))

    # Run detection
    results = model.detect([image], verbose=1)

    # Visualize results
    r = results[0]
    visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'],
                                class_names, r['scores'])
Using object detection in Google Colab, we received the results with recognized objects quickly, while our computer continued to
perform as usual even during the image recognition process.
For video, we don't need the part of the code that reads and recognizes a single image. Instead, we open the video file as a
stream and move its pointer to the 1,000th frame, as there are no objects to recognize in the intro of the
recording.
import cv2
...
VIDEO_STREAM = "/content/drive/My Drive/Colab Notebooks/Millery.avi"
VIDEO_STREAM_OUT = "/content/drive/My Drive/Colab Notebooks/Result.avi"
...
# initialize the video stream and pointer to output video file
vs = cv2.VideoCapture(VIDEO_STREAM)
writer = None
vs.set(cv2.CAP_PROP_POS_FRAMES, 1000)
Then we process 20,000 frames with our neural network model. The OpenCV VideoCapture object allows us to get images frame by
frame from the video file using the read() method. Each received image is passed to the model.detect() method, and the results are
visualized with the visualize.display_instances() function.
However, we faced a problem: the display_instances() function from the Mask_RCNN repository draws the detected objects on the
image but doesn't return the resulting image. We decided to simplify the display_instances() function and make it return the image
with the displayed objects:

def display_instances(image, boxes, masks, ids, names, scores):
    # Number of instances detected in this frame
    n_instances = boxes.shape[0]
    if not n_instances:
        print('NO INSTANCES TO DISPLAY')
    else:
        assert boxes.shape[0] == masks.shape[-1] == ids.shape[0]
    # ... draw the masks, boxes, and captions on the image here ...
    return image
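For reference, here is a minimal sketch of what the drawing part of such a simplified function might look like, assuming the standard Mask_R-CNN result layout (boxes as [y1, x1, y2, x2] arrays and one boolean mask per instance); this is our illustration, not the repository's code:

import cv2
import numpy as np

def draw_instances(image, boxes, masks, ids, names, scores):
    # Draw every detected instance onto the frame and return the frame
    for i in range(boxes.shape[0]):
        if not np.any(boxes[i]):
            continue  # skip empty detections
        y1, x1, y2, x2 = [int(v) for v in boxes[i]]
        color = (0, 255, 0)
        # Tint the pixels covered by this instance's boolean mask
        image[masks[:, :, i]] = image[masks[:, :, i]] // 2 + np.array(color, dtype=np.uint8) // 2
        # Draw the bounding box and a "label score" caption
        caption = '{} {:.2f}'.format(names[ids[i]], scores[i])
        cv2.rectangle(image, (x1, y1), (x2, y2), color, 2)
        cv2.putText(image, caption, (x1, y1), cv2.FONT_HERSHEY_SIMPLEX, 0.7, color, 2)
    return image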
After processing, the frames should be bound back together into a new video file. We can also do this with the OpenCV library. All
we need to do is instantiate a VideoWriter object from the OpenCV library:

fourcc = cv2.VideoWriter_fourcc(*"XVID")
writer = cv2.VideoWriter(VIDEO_STREAM_OUT, fourcc, 30,
                         (masked_frame.shape[1], masked_frame.shape[0]), True)
using the same frame rate and codec as the input video. We checked the video file type with the help of the ffprobe command:
ffprobe Result.avi
...
Duration: N/A, start: 0.000000, bitrate: N/A
Stream #0:0: Video: mpeg4 (Simple Profile) (XVID / 0x44495658), yuv420p, 640x272 [SAR 1:1 DAR 40:17], 30 fps, 30 tbr, 30 tbn, 30 tbc
At the beginning of the script, we need to specify paths to the target video files for processing: VIDEO_STREAM and
VIDEO_STREAM_OUT.
# Download the COCO trained weights if they aren't available yet
if not os.path.exists(COCO_MODEL_PATH):
    utils.download_trained_weights(COCO_MODEL_PATH)

class InferenceConfig(coco.CocoConfig):
    # Set batch size to 1 since we'll be running inference on
    # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

def display_instances(image, boxes, masks, ids, names, scores):
    # Simplified version that returns the frame with the detections drawn on it
    n_instances = boxes.shape[0]
    if not n_instances:
        print('NO INSTANCES TO DISPLAY')
    else:
        assert boxes.shape[0] == masks.shape[-1] == ids.shape[0]
    # ... draw the masks, boxes, and captions as shown above ...
    return image

config = InferenceConfig()
config.display()
writer = None
vs.set(cv2.CAP_PROP_POS_FRAMES, 1000)

i = 0
while i < 20000:
    # read the next frame from the file
    (grabbed, frame) = vs.read()
    i += 1

    # If the frame was not grabbed, then we have reached the end
    # of the stream
    if not grabbed:
        print("Not grabbed.")
        break

    # Run detection
    results = model.detect([frame], verbose=1)

    # Visualize results
    r = results[0]
    masked_frame = display_instances(frame, r['rois'], r['masks'], r['class_ids'],
                                     class_names, r['scores'])
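The writing step itself doesn't appear on this page; given the VideoWriter snippet shown earlier, the end of the loop body presumably looks something like this (our reconstruction):

    # Lazily create the writer once the first frame's size is known,
    # then append each processed frame to the output file
    if writer is None:
        fourcc = cv2.VideoWriter_fourcc(*"XVID")
        writer = cv2.VideoWriter(VIDEO_STREAM_OUT, fourcc, 30,
                                 (masked_frame.shape[1], masked_frame.shape[0]), True)
    writer.write(masked_frame)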
After successful execution of the script, our video file with the recognized objects will be located at the path specified in
VIDEO_STREAM_OUT. We ran our system on a piece of a movie and received a video file with recognized objects. Check it out
here.
Conclusion
In this article, we've shown you how we took advantage of Google Colab: how to set up a notebook with the free Tesla K80 GPU, mount Google Drive as file storage for the container, run the Mask_R-CNN sample to detect and segment objects in images, and adapt it to process a video stream frame by frame with OpenCV.
License
This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
Apriorit is a software research and development company specializing in cybersecurity and data management technology
engineering. We work for a broad range of clients from Fortune 500 technology leaders to small innovative startups building
unique solutions.
As Apriorit offers integrated research and development services for software projects in such areas as endpoint security, network
security, data security, embedded systems, and virtualization, we have strong kernel and driver development skills, huge system
programming expertise, and are real fans of research projects.
Our specialty is reverse engineering; we apply it for security testing and security-related projects.
A separate department of Apriorit works on large-scale business SaaS solutions, handling tasks from business analysis, data
architecture design, and web development to performance optimization and DevOps.
Semyon Boyko
Team Leader, Apriorit
Ukraine