GPU Coder Support Package for NVIDIA GPUs
User's Guide
R2019b
How to Contact MathWorks
Phone: 508-647-7000
Deployment
2
Build and Run an Executable on NVIDIA Hardware . . . . . . . . . 2-2
Learning Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
Tutorial Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
Example: Vector Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
Create a Live Hardware Connection Object . . . . . . . . . . . . . . . 2-3
Generate CUDA Executable Using GPU Coder . . . . . . . . . . . . 2-4
Run the Executable and Verify the Results . . . . . . . . . . . . . . . 2-7
Read Video Files on NVIDIA Hardware . . . . . . . . . . . . . . . . . . 2-17
Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-17
Create a Live Hardware Connection Object . . . . . . . . . . . . . 2-17
The videoReaderDeploy Entry-Point Function . . . . . . . . . . . . 2-18
Generate CUDA Executable Using GPU Coder . . . . . . . . . . . 2-19
Run the Executable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-20
Specifying the Video File at Runtime . . . . . . . . . . . . . . . . . . 2-21
Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-23
Verification
3
Processor-In-The-Loop Execution from Command Line . . . . . . 3-2
Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
Example: The Mandelbrot Set . . . . . . . . . . . . . . . . . . . . . . . . . 3-3
Create a Live Hardware Connection Object . . . . . . . . . . . . . . . 3-5
Configure the PIL Execution . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6
Generate Code and Run PIL Execution . . . . . . . . . . . . . . . . . . 3-7
Terminate the PIL Execution Process. . . . . . . . . . . . . . . . . . . . 3-7
1 Installation and Setup

Install Support Package for NVIDIA Hardware
The NVIDIA DRIVE and Jetson hardware is also referred to as a board or as target
hardware.
1 On the MATLAB Home tab, in the Environment section, select Add-Ons > Get
Hardware Support Packages.
2 In the Add-On Explorer window, click the support package and then click Install.
On the MATLAB Home tab, in the Environment section, select Help > Check for
Updates.
1 On the MATLAB Home tab, in the Environment section, click Add-Ons > Manage
Add-Ons.
2 In the Add-On Manager window, find and click the support package, and then click
Uninstall.
See Also
More About
• “Install and Setup Prerequisites for NVIDIA Boards” on page 1-4
• “Getting Started with the GPU Coder Support Package for NVIDIA GPUs”
Install and Setup Prerequisites for NVIDIA Boards
Target Requirements
Hardware
The GPU Coder Support Package for NVIDIA GPUs supports the NVIDIA DRIVE and Jetson
ranges of development platforms.
The GPU Coder Support Package for NVIDIA GPUs uses an SSH connection over TCP/IP
to execute commands while building and running the generated CUDA® code on the
DRIVE or Jetson platforms. Connect the target platform to the same network as the host
computer. Alternatively, you can use an Ethernet crossover cable to connect the board
directly to the host computer.
Software
• Use the JetPack or the DriveInstall software to install the OS image, developer tools,
and the libraries required for developing applications on the Jetson or DRIVE
platforms. You can use the Component Manager in the JetPack or the
DriveInstall software to select the components to be installed on the target
hardware. For installation instructions, refer to the NVIDIA board documentation. At a
minimum, you must install:
• CUDA toolkit.
• cuDNN library.
• TensorRT library.
• OpenCV library.
• GStreamer library (v1.0 or higher) for deployment of the videoReader function.
The GPU Coder Support Package for NVIDIA GPUs has been tested with specific versions
of the JetPack and DRIVE SDK software; check the support package documentation for the
tested versions.
For example, on Ubuntu, use the apt-get command to install these libraries.
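A sketch of such an installation (the package names below are illustrative Ubuntu package names for the SDL and GStreamer development libraries; verify the exact names for your distribution release):

```shell
# Illustrative target-side installation of SDL and GStreamer
# development libraries; package names vary by release
sudo apt-get update
sudo apt-get install libsdl1.2-dev \
    libgstreamer1.0-dev gstreamer1.0-plugins-base gstreamer1.0-plugins-good
```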
The GPU Coder Support Package for NVIDIA GPUs uses environment variables to locate the
tools, compilers, and libraries required for code generation. Ensure that the required
environment variables are set on the target.
Ensure that the required environment variables are accessible from non-interactive SSH
logins. For example, you can use the export command at the beginning of the
$HOME/.bashrc shell config file to add the environment variables.
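For instance, the beginning of $HOME/.bashrc might export the CUDA toolkit locations (a sketch using the toolkit's default install paths; adjust the paths, and add any other variables your toolchain requires):

```shell
# Added at the top of ~/.bashrc so that non-interactive SSH logins
# also see these variables. Paths are the CUDA toolkit defaults;
# adjust them to match your installation.
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
```

You can confirm that a non-interactive login sees the variables with a command such as ssh ubuntu@board 'echo $PATH'.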
For reference, the default Ubuntu .bashrc begins with history settings such as the
following; place the export statements above them:

# don't put duplicate lines or lines starting with space in the history.
# See bash(1) for more options
HISTCONTROL=ignoreboth
Input Devices
MathWorks Products
• MATLAB (required).
• MATLAB Coder (required).
• Parallel Computing Toolbox (required).
• Deep Learning Toolbox™ (required for deep learning).
• Image Processing Toolbox™ (recommended).
• Embedded Coder® (recommended).
• Simulink® (recommended).
Third-Party Products
For information on the version numbers for the compiler tools and libraries, see
“Installing Prerequisite Products” (GPU Coder). For information on setting up the
environment variables on the host development computer, see “Setting Up the
Prerequisite Products” (GPU Coder).
Note It is recommended to use the same versions of cuDNN and TensorRT libraries on
the target board and the host computer.
See Also
More About
• “Install Support Package for NVIDIA Hardware” on page 1-2
• “Getting Started with the GPU Coder Support Package for NVIDIA GPUs”
2 Deployment

Build and Run an Executable on NVIDIA Hardware
The GPU Coder Support Package for NVIDIA GPUs uses the GPU Coder product to
generate CUDA code (kernels) from the MATLAB algorithm. These kernels run on any
CUDA-enabled GPU platform. The support package automates deployment of the
generated CUDA code on GPU hardware platforms such as NVIDIA Jetson and DRIVE.
Learning Objectives
In this tutorial, you learn how to:
• Prepare your MATLAB code for CUDA code generation by using the kernelfun
pragma.
• Connect to the NVIDIA target board.
• Generate and deploy CUDA executable on the target board.
• Run the executable on the board and verify the results.
Tutorial Prerequisites
Development Host Requirements
• GPU Coder for code generation. For an overview and tutorials, see the “Getting
Started with GPU Coder” (GPU Coder) page.
• NVIDIA CUDA toolkit on the host.
• Environment variables on the host for the compilers and libraries. For information on
the supported versions of the compilers and libraries, see “Third-party Products” (GPU
Coder). For setting up the environment variables, see “Environment Variables” (GPU
Coder).
To communicate with the NVIDIA hardware, you must create a live hardware connection
object by using the jetson or drive function. To create a live hardware connection
object, provide the host name or IP address, user name, and password of the target
board. For example, to create a live object for the Jetson hardware:
hwobj = jetson('192.168.1.15','ubuntu','ubuntu');
The software checks the hardware, compiler tools, and libraries, verifies the IO server
installation, and gathers peripheral information about the target. This information is
displayed in the command window.
Checking for CUDA availability on the Target...
Checking for NVCC in the target system path...
Checking for CUDNN library availability on the Target...
Checking for TensorRT library availability on the Target...
Checking for Prerequisite libraries is now complete.
Fetching hardware details...
Fetching hardware details is now complete. Displaying details.
Board name : NVIDIA Jetson TX2
CUDA Version : 9.0
cuDNN Version : 7.0
TensorRT Version : 3.0
Available Webcams : UVC Camera (046d:0809)
Available GPUs : NVIDIA Tegra X2
A custom main file, main.cu, initializes the input arguments, calls the myAdd
entry-point function, and writes the result to a binary file:

// Function Declarations
static void argInit_1x100_real_T(real_T result[100]);
static void main_myAdd();

// Function Definitions
static void argInit_1x100_real_T(real_T result[100])
{
  int32_T idx1;
  for (idx1 = 0; idx1 < 100; idx1++) {
    result[idx1] = 0.0; // generated example mains initialize inputs to zero
  }
}

static void main_myAdd()
{
  real_T b[100];
  real_T c[100];
  real_T out[100];
  argInit_1x100_real_T(b);
  argInit_1x100_real_T(c);
  myAdd(b, c, out);
  writeToFile(out); // Write the output to a binary file
}

// Main routine
int32_T main(int32_T, const char * const [])
{
  // Initialize the application.
  myAdd_initialize();
  main_myAdd();
  myAdd_terminate();
  return 0;
}
//main.h
#ifndef MAIN_H
#define MAIN_H
// Include Files
#include <stddef.h>
#include <stdlib.h>
#include "rtwtypes.h"
#include "myAdd_types.h"
// Function Declarations
extern int32_T main(int32_T argc, const char * const argv[]);
#endif
Create a GPU code configuration object for generating an executable. Use the
coder.hardware function to create a configuration object for the DRIVE or Jetson
platform and assign it to the Hardware property of the code configuration object cfg.
Use the BuildDir property to specify the folder for the remote build process on the
target. If the specified build folder does not exist on the target, then the software
creates a folder with the given name. If no value is assigned to
cfg.Hardware.BuildDir, the remote build process happens in the last specified build
folder. If there is no stored build folder value, the build process takes place in the home
folder.
cfg = coder.gpuConfig('exe');
cfg.Hardware = coder.hardware('NVIDIA Jetson');
cfg.Hardware.BuildDir = '~/remoteBuildDir';
cfg.CustomSource = fullfile('main.cu');
To generate CUDA code, use the codegen command and pass the GPU code configuration
object along with the size of the inputs for the myAdd entry-point function. After the code
generation takes place on the host, the generated files are copied over and built on the
target.

codegen('-config',cfg,'myAdd','-args',{1:100,1:100});
pid = runApplication(hwobj,'myAdd');
### Launching the executable on the target...
Executable launched successfully with process ID 26432.
Displaying the simple runtime log for the executable...
Copy the output bin file myAdd.bin to the MATLAB environment on the host and
compare the computed results with the results from MATLAB.
See Also
drive | jetson | killApplication | killProcess | openShell | runApplication |
runExecutable | system
Related Examples
• “Sobel Edge Detection using Webcam on NVIDIA Jetson”
• “Getting Started with the GPU Coder Support Package for NVIDIA GPUs”
• “Deploy and Run Sobel Edge Detection with I/O on NVIDIA Jetson”
More About
• “Build and Run an Executable on NVIDIA Hardware Using GPU Coder App” on page 2-9
Build and Run an Executable on NVIDIA Hardware Using GPU Coder App
The GPU Coder Support Package for NVIDIA GPUs uses the GPU Coder product to
generate CUDA code (kernels) from the MATLAB algorithm. These kernels run on any
CUDA-enabled GPU platform. The support package automates deployment of the
generated CUDA code on GPU hardware platforms such as NVIDIA Jetson and DRIVE.
Learning Objectives
In this tutorial, you learn how to:
• Prepare your MATLAB code for CUDA code generation by using the kernelfun
pragma.
• Create and set up a GPU Coder project.
• Change settings to connect to the NVIDIA target board.
• Generate and deploy CUDA executable on the target board.
• Run the executable on the board and verify the results.
Tutorial Prerequisites
Development Host Requirements
• GPU Coder for code generation. For an overview and tutorials, see the “Getting
Started with GPU Coder” (GPU Coder) page.
• NVIDIA CUDA toolkit on the host.
• Environment variables on the host for the compilers and libraries. For information on
the supported versions of the compilers and libraries, see “Third-party Products” (GPU
Coder). For setting up the environment variables, see “Environment Variables” (GPU
Coder).
//main.cu

// Include Files
#include "myAdd.h"
#include "main.h"
#include "myAdd_terminate.h"
#include "myAdd_initialize.h"
#include <stdio.h>

// Function Declarations
static void argInit_1x100_real_T(real_T result[100]);
static void main_myAdd();

// Function Definitions
static void argInit_1x100_real_T(real_T result[100])
{
  int32_T idx1;
  for (idx1 = 0; idx1 < 100; idx1++) {
    result[idx1] = 0.0; // generated example mains initialize inputs to zero
  }
}

static void main_myAdd()
{
  real_T b[100];
  real_T c[100];
  real_T out[100];
  argInit_1x100_real_T(b);
  argInit_1x100_real_T(c);
  myAdd(b, c, out);
  writeToFile(out); // Write the output to a binary file
}

// Main routine
int32_T main(int32_T, const char * const [])
{
  // Initialize the application.
  myAdd_initialize();
  main_myAdd();
  myAdd_terminate();
  return 0;
}
//main.h
#ifndef MAIN_H
#define MAIN_H
// Include Files
#include <stddef.h>
#include <stdlib.h>
#include "rtwtypes.h"
#include "myAdd_types.h"
// Function Declarations
extern int32_T main(int32_T argc, const char * const argv[]);
#endif
1 The app opens the Select source files page. Select myAdd.m as the entry-point
function. Click Next.
5 Click More Settings. On the Custom Code panel, enter the custom main file
main.cu in the Additional source files field. The custom main file and the
header file must be in the same location as the entry-point file.
6 Under the Hardware panel, enter the device address, user name, password, and
build folder for the board.
7 Close the Settings window and click Generate. The software generates CUDA code
and deploys the executable to the specified folder. Click Next and close the app.
hwobj = jetson;
pid = runApplication(hwobj,'myAdd');
Copy the output bin file myAdd.bin to the MATLAB environment on the host and
compare the computed results with the results from MATLAB.
outputFile = [hwobj.workspaceDir '/myAdd.bin']
getFile(hwobj,outputFile);
Maximum deviation between MATLAB Simulation output and GPU coder output on Target is: 0.000000
See Also
drive | jetson | killApplication | killProcess | openShell | runApplication |
runExecutable | system
Related Examples
• “Sobel Edge Detection using Webcam on NVIDIA Jetson”
• “Getting Started with the GPU Coder Support Package for NVIDIA GPUs”
• “Deploy and Run Sobel Edge Detection with I/O on NVIDIA Jetson”
More About
• “Build and Run an Executable on NVIDIA Hardware” on page 2-2
• “Stop or Restart an Executable Running on NVIDIA Hardware” on page 2-24
• “Run Linux Commands on NVIDIA Hardware” on page 2-26
• “Processor-In-The-Loop Execution from Command Line” on page 3-2
• “Processor-In-The-Loop Execution with the GPU Coder App” on page 3-9
Read Video Files on NVIDIA Hardware
Requirements
1 GPU Coder.
2 GPU Coder Support Package for NVIDIA GPUs.
3 Image Processing Toolbox for the rhinos.avi sample video file used in this
example.
4 NVIDIA CUDA toolkit.
5 GStreamer and SDL libraries on the target.
6 Environment variables for the compilers and libraries on the host and the target. For
more information, see “Third-party Products” (GPU Coder), “Environment Variables”
(GPU Coder), and “Install and Setup Prerequisites for NVIDIA Boards” on page 1-4.
7 NVIDIA Jetson TX2 embedded platform.
To communicate with the NVIDIA hardware, you must create a live hardware connection
object by using the jetson function. To create a live hardware connection object, provide
the host name or IP address, user name, and password of the target board. For example,
to create a live object for the Jetson hardware:
hwobj = jetson('192.168.1.15','ubuntu','ubuntu');
The software checks the hardware, compiler tools, and libraries, verifies the IO server
installation, and gathers peripheral information about the target. This information is
displayed in the command window.

Similarly, to create a live object for the DRIVE hardware:

hwobj = drive('192.168.1.16','nvidia','nvidia');
function videoReaderDeploy()
For code generation, the VideoReader function requires the full path to the video file on
the target hardware. The GPU Coder Support Package for NVIDIA GPUs uses the
GStreamer library API to read the video files on the target platform. The software
supports file (container) formats and codecs that are compatible with GStreamer. For
more information, see https://gstreamer.freedesktop.org/documentation/plugin-development/advanced/media-types.html?gi-language=c.
For other code generation limitations for the VideoReader function, see “Limitations”
on page 2-23.
cfg = coder.gpuConfig('exe');
cfg.Hardware = coder.hardware('NVIDIA Jetson');
cfg.Hardware.BuildDir = '~/remoteBuildDir';
cfg.GenerateExampleMain = 'GenerateCodeAndCompile';
hwobj.putFile('rhinos.avi', hwobj.workspaceDir);
To generate CUDA code, use the codegen command and pass the GPU code configuration
object along with the videoReaderDeploy entry-point function. After the code
generation takes place on the host, the generated files are copied over and built on the
target.

codegen('-config',cfg,'videoReaderDeploy','-report');
pid = runApplication(hwobj,'videoReaderDeploy');
A window opens on the target hardware display showing the edge detected output of the
input video.
Within the entry-point function, the edge detection convolves a kernel with the frame in
the horizontal and vertical directions:

h = conv2(img(:,:,2),kernel,'same');
v = conv2(img(:,:,2),kernel','same');
e = sqrt(h.*h + v.*v);
edgeImg = uint8((e > 100) * 240);
Create a custom main file to handle the variable file name input when running the
executable. A snippet of the code is shown.
//
// Arguments : int32_T argc
// const char * const argv[]
// Return Type : int32_T
//
int32_T main(int32_T, const char * const argv[])
{
//Initialize the application
videoReaderDeploy_initialize();
Modify the code configuration object to include this custom main file.
cfg = coder.gpuConfig('exe');
cfg.Hardware = coder.hardware('NVIDIA Jetson');
cfg.Hardware.BuildDir = '~/remoteBuildDir';
cfg.CustomSource = 'main.cu';
To generate CUDA code, use the codegen command and pass the GPU code configuration
object along with the videoReaderDeploy entry-point function.
vfilename = coder.typeof('a',[1,1024]);
codegen('-config',cfg,'-args',{vfilename},'videoReaderDeploy','-report');
Limitations
• The VideoReader.getFileFormats method is not supported for code generation.
• For the readFrame and read functions, code generation does not support the optional
positional argument native.
See Also
drive | jetson | killApplication | killProcess | openShell | runApplication |
runExecutable | system
Related Examples
• “Deploy and Run Fog Rectification for Video on NVIDIA Jetson”
• “Sobel Edge Detection using Webcam on NVIDIA Jetson”
• “Getting Started with the GPU Coder Support Package for NVIDIA GPUs”
• “Deploy and Run Sobel Edge Detection with I/O on NVIDIA Jetson”
More About
• “Build and Run an Executable on NVIDIA Hardware” on page 2-2
• “Stop or Restart an Executable Running on NVIDIA Hardware” on page 2-24
• “Run Linux Commands on NVIDIA Hardware” on page 2-26
• “Processor-In-The-Loop Execution from Command Line” on page 3-2
• “Processor-In-The-Loop Execution with the GPU Coder App” on page 3-9
Stop or Restart an Executable Running on NVIDIA Hardware
1 Create a connection from the MATLAB software to the NVIDIA hardware. In this
example, the connection is to a Jetson board and is named hwJetson.
hwJetson = jetson('192.168.1.15','ubuntu','ubuntu');
hwJetson =
DeviceAddress: '192.168.1.15'
Port: 22
BoardName: 'NVIDIA Jetson TX2'
CUDAVersion: '9.0'
cuDNNVersion: '7.0'
TensorRTVersion: '3.0'
GpuInfo: [1×1 struct]
webcamlist: []
2 To stop the running executable, use the killApplication function. For example:

killApplication(hwJetson,'myAdd')
3 To restart the stopped executable, or to run multiple instances of the executable, use
the runApplication function. For example:
runApplication(hwJetson,'myAdd')
See Also
drive | jetson | killApplication | killProcess | openShell | runApplication |
runExecutable | system
Related Examples
• “Sobel Edge Detection using Webcam on NVIDIA Jetson”
• “Getting Started with the GPU Coder Support Package for NVIDIA GPUs”
• “Deploy and Run Sobel Edge Detection with I/O on NVIDIA Jetson”
More About
• “Build and Run an Executable on NVIDIA Hardware” on page 2-2
• “Run Linux Commands on NVIDIA Hardware” on page 2-26
• “Processor-In-The-Loop Execution from Command Line” on page 3-2
• “Processor-In-The-Loop Execution with the GPU Coder App” on page 3-9
Run Linux Commands on NVIDIA Hardware
To communicate with the NVIDIA hardware, you must create a live hardware connection
object by using the drive or jetson function. To create a live hardware connection
object, provide the host name or IP address, user name, and password of the target
board. For example, to create a live object for the Jetson hardware:
hwobj = jetson('192.168.1.15','ubuntu','ubuntu');
During creation of the live hardware object, the software checks the hardware and the IO
server installation, and gathers peripheral information about the target. This information
is displayed in the command window.
Similarly, to create a live object for the DRIVE hardware:

hwobj = drive('192.168.1.16','nvidia','nvidia');
The following statement executes a folder-list shell command and returns the resulting
text output at the MATLAB command prompt. You can store the result in a MATLAB
variable to perform further processing. For example, establish who owns the .profile
file under /home/ubuntu:
output = system(hwobj,'ls -al /home/ubuntu');
ret = regexp(output, '\s+[\w-]+\s+\d\s+(\w+)\s+.+\.profile\s+', 'tokens');
ret{1}
You can also achieve the same result using a single shell command.
You cannot execute interactive system commands using the system() method. To
execute interactive commands on the NVIDIA hardware, you must open a terminal
session.
openShell(hwobj)
This command opens a PuTTY terminal that can execute interactive shell commands like
'top'.
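When an interactive session is unnecessary, many such tools also offer a non-interactive mode whose output you can capture through the system method instead; for example, top can emit a single batch-mode snapshot (the flags shown are standard procps options):

```shell
# One batch-mode iteration of top: prints a snapshot and exits,
# so the output can be captured rather than rendered interactively
top -b -n 1 | head -n 5
```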
1. To run a CUDA executable that you previously deployed on the NVIDIA hardware,
execute the following command at the MATLAB command line:
runExecutable(hwobj,'<executable name>')
where the string '<executable name>' is the name of the CUDA executable you want
to run on the NVIDIA hardware.
2. To stop a CUDA executable running on the NVIDIA hardware, execute the following
command at the MATLAB command line:

killApplication(hwobj,'<executable name>')

This command kills the Linux process named '<executable name>.elf' on the
NVIDIA hardware. Alternatively, you can stop the executable by passing its process ID to
the killProcess function.
Manipulate Files

The jetson or drive object provides basic file manipulation capabilities. To transfer a
file from the NVIDIA hardware to your host computer, use the getFile() method.
getFile(hwobj,'/usr/share/pixmaps/debian-logo.png');
img = imread('debian-logo.png');
image(img);
The getFile() method takes an optional second argument that allows you to define the
file destination. To transfer a file from your host computer to the NVIDIA hardware, use
the putFile() method.
putFile(hwobj,'debian-logo.png','/home/ubuntu/debian-logo.png.copy');
system(hwobj,'ls -l /home/ubuntu/debian-logo.png.copy')
You can delete files on your NVIDIA hardware using the deleteFile() command.
deleteFile(hwobj,'/home/ubuntu/debian-logo.png.copy');
system(hwobj,'ls -l /home/ubuntu/debian-logo.png.copy')
The command results in an error indicating that the file cannot be found.
See Also
deleteFile | drive | getFile | jetson | killApplication | openShell |
runApplication | system
Related Examples
• “Sobel Edge Detection using Webcam on NVIDIA Jetson”
• “Getting Started with the GPU Coder Support Package for NVIDIA GPUs”
• “Deploy and Run Sobel Edge Detection with I/O on NVIDIA Jetson”
More About
• “Build and Run an Executable on NVIDIA Hardware” on page 2-2
• “Stop or Restart an Executable Running on NVIDIA Hardware” on page 2-24
• “Processor-In-The-Loop Execution from Command Line” on page 3-2
• “Processor-In-The-Loop Execution with the GPU Coder App” on page 3-9
Open a Secure Shell Command-Line Session with NVIDIA Hardware
hwJetson = jetson;
openShell(hwJetson);
Similarly, you can create a live hardware connection to the DRIVE target and open an
SSH session.
See Also

drive | jetson | killApplication | killProcess | openShell | runApplication |
runExecutable | system
Related Examples
• “Sobel Edge Detection using Webcam on NVIDIA Jetson”
• “Getting Started with the GPU Coder Support Package for NVIDIA GPUs”
• “Deploy and Run Sobel Edge Detection with I/O on NVIDIA Jetson”
More About
• “Build and Run an Executable on NVIDIA Hardware” on page 2-2
• “Stop or Restart an Executable Running on NVIDIA Hardware” on page 2-24
• “Processor-In-The-Loop Execution from Command Line” on page 3-2
• “Processor-In-The-Loop Execution with the GPU Coder App” on page 3-9
3 Verification

Processor-In-The-Loop Execution from Command Line
The PIL verification process is a crucial part of the design cycle to check that the behavior
of the generated code matches the design. PIL verification requires an Embedded Coder
license.
Note When using PIL execution, make sure that the Benchmarking option in GPU Coder
settings is false. Executing PIL with benchmarking results in compilation errors.
Prerequisites
Development Host Requirements
• GPU Coder for code generation. For an overview and tutorials, see the “Getting
Started with GPU Coder” (GPU Coder) page.
• Embedded Coder.
• NVIDIA CUDA toolkit on the host.
• Environment variables on the host for the compilers and libraries. For information on
the supported versions of the compilers and libraries, see “Third-party Products” (GPU
Coder). For setting up the environment variables, see “Environment Variables” (GPU
Coder).
You do not have to be familiar with the algorithm in the example to complete the tutorial.
The Mandelbrot set is the region in the complex plane consisting of the values z0 for
which the trajectories defined by
z_{k+1} = z_k^2 + z_0,  k = 0, 1, …

remain bounded as k → ∞. The overall geometry of the Mandelbrot set is shown in the
figure. This view does not have the resolution to show the richly detailed structure of the
fringe just outside the boundary of the set. At increasing magnifications, the Mandelbrot
set exhibits an elaborate boundary that reveals progressively finer recursive detail.
Algorithm
Create a MATLAB script called mandelbrot_count.m with the following lines of code.
This code is a baseline vectorized MATLAB implementation of the Mandelbrot set.
function count = mandelbrot_count(maxIterations, xGrid, yGrid) %#codegen
% mandelbrot computation
z0 = xGrid + 1i*yGrid;
count = ones(size(z0));
z = z0;
for n = 0:maxIterations
z = z.*z + z0;
inside = abs(z)<=2;
count = count + inside;
end
count = log(count);
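The test abs(z)<=2 is justified by the standard escape-radius argument: once an iterate leaves the disk of radius 2, the sequence must diverge, so the point lies outside the set.

```latex
% Escape-radius bound for the Mandelbrot iteration z_{k+1} = z_k^2 + z_0.
% If |z_k| > 2 and |z_k| >= |z_0|, the reverse triangle inequality gives
\[
  |z_{k+1}| \;=\; |z_k^2 + z_0| \;\ge\; |z_k|^2 - |z_0|
  \;\ge\; |z_k|\,\bigl(|z_k| - 1\bigr) \;>\; |z_k|,
\]
% so the magnitudes grow without bound and z_0 is not in the set.
```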
For this tutorial, pick a set of limits that specify a highly zoomed part of the Mandelbrot
set in the valley between the main cardioid and the p/q bulb to its left. A 1000-by-1000
grid of real parts (x) and imaginary parts (y) is created between these two limits. The
Mandelbrot algorithm is then iterated at each grid location. An iteration count of 500 is
enough to render the image in full resolution. Create a MATLAB script called
mandelbrot_test.m with the following lines of code. The script calls the
mandelbrot_count function and plots the resulting Mandelbrot set.
maxIterations = 500;
gridSize = 1000;
xlim = [-0.748766713922161, -0.748766707771757];
ylim = [ 0.123640844894862, 0.123640851045266];

x = linspace( xlim(1), xlim(2), gridSize );
y = linspace( ylim(1), ylim(2), gridSize );
[xGrid, yGrid] = meshgrid( x, y );
count = mandelbrot_count(maxIterations, xGrid, yGrid);

figure(1)
imagesc( x, y, count );
colormap([jet();flipud( jet() );0 0 0]);
axis off
title('Mandelbrot set');
Create a Live Hardware Connection Object

To communicate with the NVIDIA hardware, create a live hardware connection object by
using the jetson function:

hwobj = jetson('192.168.1.15','ubuntu','ubuntu');
The software checks the hardware, compiler tools, and libraries, verifies the IO server
installation, and gathers peripheral information about the target. This information is
displayed in the command window.
Checking for CUDA availability on the Target...
Checking for 'nvcc' in the target system path...
Checking for cuDNN library availability on the Target...
Checking for TensorRT library availability on the Target...
Checking for prerequisite libraries is complete.
Gathering hardware details...
Gathering hardware details is complete.
Board name : NVIDIA Jetson TX2
CUDA Version : 9.0
cuDNN Version : 7.0
TensorRT Version : 3.0
Available Webcams : Microsoft® LifeCam Cinema(TM)
Available GPUs : NVIDIA Tegra X2
When passed to nvcc, the --fmad=false flag instructs the compiler to disable
floating-point multiply-add (FMAD) optimization. This option is set to prevent numerical
mismatches in the generated code that are caused by architectural differences between
the CPU and the GPU. For more information, see “Numerical Differences Between CPU
and GPU” (GPU Coder).
Verify that the output of this run matches the output from the original
mandelbrot_count.m function.
See Also
drive | getPILPort | getPILTimeout | jetson | setPILPort | setPILTimeout |
webcam
Related Examples
• “Sobel Edge Detection using Webcam on NVIDIA Jetson”
• “Processor-in-the-Loop Execution on NVIDIA Targets using GPU Coder”
More About
• “Build and Run an Executable on NVIDIA Hardware” on page 2-2
• “Stop or Restart an Executable Running on NVIDIA Hardware” on page 2-24
• “Run Linux Commands on NVIDIA Hardware” on page 2-26
• “Processor-In-The-Loop Execution with the GPU Coder App” on page 3-9
• “Execution-Time Profiling for PIL” on page 3-17
Processor-In-The-Loop Execution with the GPU Coder App
The PIL verification process is a crucial part of the design cycle to check that the behavior
of the generated code matches the design. PIL verification requires an Embedded Coder
license.
Note When using PIL execution, make sure that the Benchmarking option in GPU Coder
settings is false. Executing PIL with benchmarking results in compilation errors.
Prerequisites
Development Host Requirements
• GPU Coder for code generation. For an overview and tutorials, see the “Getting
Started with GPU Coder” (GPU Coder) page.
• Embedded Coder.
• NVIDIA CUDA toolkit on the host.
• Environment variables on the host for the compilers and libraries. For information on
the supported versions of the compilers and libraries, see “Third-party Products” (GPU
Coder). For setting up the environment variables, see “Environment Variables” (GPU
Coder).
You do not have to be familiar with the algorithm in the example to complete the tutorial.
The Mandelbrot set is the region in the complex plane consisting of the values z0 for
which the trajectories defined by
z_{k+1} = z_k^2 + z_0,  k = 0, 1, …

remain bounded as k → ∞. The overall geometry of the Mandelbrot set is shown in the
figure. This view does not have the resolution to show the richly detailed structure of the
fringe just outside the boundary of the set. At increasing magnifications, the Mandelbrot
set exhibits an elaborate boundary that reveals progressively finer recursive detail.
Algorithm
Create a MATLAB script called mandelbrot_count.m with the following lines of code.
This code is a baseline vectorized MATLAB implementation of the Mandelbrot set.
function count = mandelbrot_count(maxIterations, xGrid, yGrid) %#codegen
% mandelbrot computation
z0 = xGrid + 1i*yGrid;
count = ones(size(z0));
z = z0;
for n = 0:maxIterations
z = z.*z + z0;
inside = abs(z)<=2;
count = count + inside;
end
count = log(count);
For this tutorial, pick a set of limits that specify a highly zoomed part of the Mandelbrot
set in the valley between the main cardioid and the p/q bulb to its left. A 1000-by-1000
grid of real parts (x) and imaginary parts (y) is created between these two limits. The
Mandelbrot algorithm is then iterated at each grid location. An iteration count of 500 is
enough to render the image in full resolution. Create a MATLAB script called
mandelbrot_test.m with the following lines of code. The script calls the
mandelbrot_count function and plots the resulting Mandelbrot set.
maxIterations = 500;
gridSize = 1000;
xlim = [-0.748766713922161, -0.748766707771757];
ylim = [ 0.123640844894862, 0.123640851045266];

x = linspace( xlim(1), xlim(2), gridSize );
y = linspace( ylim(1), ylim(2), gridSize );
[xGrid, yGrid] = meshgrid( x, y );
count = mandelbrot_count(maxIterations, xGrid, yGrid);

figure(1)
imagesc( x, y, count );
colormap([jet();flipud( jet() );0 0 0]);
axis off
title('Mandelbrot set');
1 The app opens the Select source files page. Select mandelbrot_count.m as the
entry-point function. Click Next.
5 Under the Hardware panel, enter the device address, user name, password, and
build folder for the board.
6 Close the Settings window and click Generate. The software generates CUDA code
for the mandelbrot_count entry-point function.
7 Click Verify Code.
8 In the command field, specify the test file that calls the original MATLAB functions.
For example, mandelbrot_test.
9 To start the PIL execution, click Run Generated Code.
The software:
• Runs the test file, replacing calls to the MATLAB function with calls to the
generated code in the library.
• Displays messages from the PIL execution in the Test Output tab.
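The same PIL run can also be configured from the Command Window instead of the app. The following is a minimal sketch; the hardware name follows the NVIDIA support package convention, and the build folder and argument sizes are assumptions for this example:

```matlab
% Configure PIL execution for the Jetson board from the command line.
cfg = coder.gpuConfig('lib');
cfg.VerificationMode = 'PIL';                    % run generated code on the target
cfg.Hardware = coder.hardware('NVIDIA Jetson');  % assumes a Jetson target
cfg.Hardware.BuildDir = '~/remoteBuildDir';      % illustrative build folder

% Generate code; input sizes must match the test script (1000-by-1000 grid).
codegen -config cfg -args {0, zeros(1000), zeros(1000)} mandelbrot_count
```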
See Also
drive | getPILPort | getPILTimeout | jetson | setPILPort |
setPILTimeout | webcam
Related Examples
• “Sobel Edge Detection using Webcam on NVIDIA Jetson”
• “Processor-in-the-Loop Execution on NVIDIA Targets using GPU Coder”
More About
• “Build and Run an Executable on NVIDIA Hardware” on page 2-2
• “Stop or Restart an Executable Running on NVIDIA Hardware” on page 2-24
• “Run Linux Commands on NVIDIA Hardware” on page 2-26
• “Processor-In-The-Loop Execution from Command Line” on page 3-2
• “Execution-Time Profiling for PIL” on page 3-17
Execution-Time Profiling for PIL
Use the execution-time profile to check whether your code runs within the required time
on your target hardware.
Note PIL execution supports multiple entry-point functions. An entry-point function can
call another entry-point function as a subfunction. However, the software generates
execution-time profiles only for functions that are called at the entry-point level. The
software does not generate execution-time profiles for entry-point functions that are
called as subfunctions by other entry-point functions.
Note When using PIL execution, make sure that the Benchmarking option in the GPU
Coder settings is set to false. Running PIL with benchmarking enabled results in
compilation errors.
1 To open the GPU Coder app, on the MATLAB toolstrip Apps tab, under Code
Generation, click the app icon.
2 To open your project, click Open existing project and select the project.
3 On the Generate Code page, click Verify Code.
4 Select the Enable entry point execution profiling check box.
Or, from the Command Window, specify the CodeExecutionProfiling property of your
coder.gpuConfig object. For example:
cfg.CodeExecutionProfiling = true;
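Putting this option together with a command-line PIL configuration might look like the following sketch; the configuration options besides CodeExecutionProfiling follow the standard PIL setup and are assumptions here:

```matlab
% Create a PIL configuration with execution-time profiling enabled.
cfg = coder.gpuConfig('lib');
cfg.VerificationMode = 'PIL';
cfg.CodeExecutionProfiling = true;  % collect execution-time measurements
```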
The software terminates the execution process and displays a new link.
Execution profiling report: report(getCoderExecutionProfile('mandelbrot_count'))
The execution profiling report provides:
• A summary.
• Information about profiled code sections, including time measurements for each
section.
By default, the report displays time in ticks. You can specify the time unit and numeric
display format. The report displays time in seconds only if the timer is calibrated, that is,
if the number of timer ticks per second is established. For example, if your processor
speed is 2.035 GHz, specify the number of timer ticks per second by using the
TimerTicksPerSecond property. To display time in microseconds (10⁻⁶ seconds), use the
report command.
executionProfile = getCoderExecutionProfile('mandelbrot_count'); % Create workspace variable
executionProfile.TimerTicksPerSecond = 2035 * 1e6;
report(executionProfile, ...
       'Units', 'Seconds', ...
       'ScaleFactor', '1e-06', ...
       'NumericFormat', '%0.3f')
To display measured execution times for a code section, click the Simulation Data
Inspector icon on the corresponding row. You can use the Simulation Data Inspector to
manage and compare plots from various executions.
The following table lists the information provided in the code section profiles.

Column                    Description
Section                   Name of the function from which code is generated.
Maximum Execution Time    Longest time between start and end of the code section.
Average Execution Time    Average time between start and end of the code section.
Maximum Self Time         Maximum execution time, excluding time in child sections.
Average Self Time         Average execution time, excluding time in child sections.
Calls                     Number of calls to the code section.
(code icon)               Icon that you click to display the profiled code section.
(plot icon)               Icon that you click to display measured execution times with
                          the Simulation Data Inspector.
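The information in the table can also be read programmatically from the profile object. This sketch assumes the object returned by getCoderExecutionProfile exposes a Sections array whose elements carry a Name property; check the properties of your profile object before relying on them:

```matlab
% List the names of the profiled code sections (property names assumed).
executionProfile = getCoderExecutionProfile('mandelbrot_count');
sections = executionProfile.Sections;
for k = 1:numel(sections)
    disp(sections(k).Name)
end
```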
See Also
drive | getPILPort | getPILTimeout | jetson | setPILPort |
setPILTimeout | webcam
Related Examples
• “Sobel Edge Detection using Webcam on NVIDIA Jetson”
• “Processor-in-the-Loop Execution on NVIDIA Targets using GPU Coder”
More About
• “Build and Run an Executable on NVIDIA Hardware” on page 2-2
• “Stop or Restart an Executable Running on NVIDIA Hardware” on page 2-24
• “Run Linux Commands on NVIDIA Hardware” on page 2-26
• “Processor-In-The-Loop Execution from Command Line” on page 3-2
• “Processor-In-The-Loop Execution with the GPU Coder App” on page 3-9