Multi-Target C Code Generation from MATLAB
João Bispo, Luís Reis
Faculty of Engineering (FEUP)
University of Porto, Porto, Portugal
{jbispo, ei09030}@fe.up.pt
João M. P. Cardoso
Faculty of Engineering (FEUP)
University of Porto & INESC-TEC, Porto, Portugal
jmpc@fe.up.pt
Abstract
This paper describes our recent work on MATISSE, a framework
for MATLAB to C compilation. We focus on the new optimizations and transformations, as well as on OpenCL generation.
MATISSE is controlled with LARA, an aspect-oriented language,
able to specify transformations to the input MATLAB code (e.g.,
insertion of code for variable initialization and for monitoring)
and to express information concerning types and shapes of variables. We evaluate the compiler with a set of benchmarks when
targeting both an embedded system and a desktop system. The
results show that we were able to achieve a speedup up to 1.8× by
employing information provided by LARA aspects. We also
compare the execution time of the generated C code with the
original code running on MATLAB, and we achieve a geometric
mean speedup of 19×. The geometric mean speedup reduces to
12× when optimizing the MATLAB code with LARA aspects.
Finally, we present a preliminary version of a fully-functioning
pragma-based OpenCL generator, built over the MATISSE
framework.
Categories and Subject Descriptors D.3.4 [Programming Languages]: Processors – Code generation, Compilers, Optimization,
Retargetable compilers D.2.2 [Software Engineering]: Design
Tools and Techniques
General Terms Performance, Experimentation, Languages
Keywords MATLAB-to-C, source-to-source compiler, Aspect-oriented programming, LARA, embedded systems, OpenCL
1. Introduction
MATLAB [1] is a de facto standard high-level programming language and interactive numerical computing environment in many
domains in engineering and science, including embedded computing as it is ubiquitously used by engineers to quickly develop and
evaluate their solutions. MATLAB is dynamically typed, and relies
on interpretation (and/or JIT compilation) as the information
about the types and shapes (i.e., number and size of matrix dimensions) of variables is only known at runtime. Due to advances in
JIT compilation and the use of pre-compiled libraries for the most
intensive functions, the MATLAB runtime environment currently
exhibits acceptable performance.
ARRAY'14, June 11 2014, Edinburgh, United Kingdom
Copyright 2014 ACM 978-1-4503-2937-8/14/06 $15.00.
http://dx.doi.org/10.1145/2627373.2627389
In many embedded system settings, however, the use of a
MATLAB runtime environment is infeasible, either because it is
not available or because of performance and/or resource constraints.
A typical solution is to develop, once the original MATLAB code has
been validated, an implementation written in an imperative language
such as C/C++. This implementation must then in turn be validated
against the output of the MATLAB code, resulting in a lengthy and
error-prone process that further complicates the overall application
development cycle and increases its cost. The existence of two code
specifications (the original prototypical MATLAB code and the
reference C/C++ code) also exacerbates maintenance costs. Another
solution is to rely on the automatic translation of MATLAB to the
target programming language, as provided, e.g., by the MATLAB
Coder [2] and the Embedded Coder [3], which translate MATLAB to C
code. Despite its inherent advantages, this approach typically offers
little support to control and guide the code translation. The code
generation is typically based on directives (GUI-based in the case of
the MATLAB Coder) addressing types, shapes, and target. When
dealing with the myriad of target architectures and toolchains in
embedded systems, this approach presents a low level of flexibility,
e.g., the style of the generated C code might need to be tuned to the
toolchain, as is the case when targeting C-to-hardware compilers.
Instrumentation and code transformations, as well as an approach to
express strategies for code transformations and instrumentation
(e.g., by using a DSL), can be very important during the design
process and may increase productivity. Furthermore, the target
platform may require code generators for specific programming
languages, as is the case of the generation of OpenCL [4] when
dealing with GPGPUs and/or GPU-based FPGA implementations [5].
Our approach relies on a compilation tool named MATISSE [6],
which generates C code directly from MATLAB. It explores the use
of Aspect-Oriented Programming (AOP) [7][8] concepts, through the
LARA language [9][10], as a vehicle to convey information to the
compiler (e.g., types and array shapes) and to express code
specialization and code instrumentation strategies. The compiler uses
the user-provided information, complementing it and checking its
consistency against the information it can derive from its own analysis.
MATISSE is being developed as a modular and flexible compiler
framework, which includes custom Intermediate Representations
(IRs) for the MATLAB and C code, keeping in mind the generation of C code from a higher-level programming language. In
particular, the IR representing the output C code (C-IR) supports
matrix types natively, and can be easily extended to support additional types and language constructs (such as the ones needed to
generate OpenCL). The end result is a synergy between compiler
analysis and the user that allows the compiler to generate very
high-quality code from MATLAB specifications. It is also possible
to generate different versions of the C code, to better target different embedded systems, platforms, and/or toolchains. In this paper
we focus on our recent MATISSE improvements.
The remainder of this paper is organized as follows. Section 2
presents the MATISSE compiler framework. In Section 3 we
describe the C-IR, the internal representation we use to generate C
code. Section 4 explains how the MATLAB code is transformed
into C code and OpenCL with the help of LARA aspects. Section
5 shows some experiments performed using MATISSE. Section 6
describes related work and finally, Section 7 concludes this paper
and describes ongoing work.
2. MATISSE Overview
MATISSE consists of a MATLAB-to-C compiler targeting embedded
systems, and a LARA-controlled MATLAB weaver which allows
transformations over MATLAB code. LARA [9] is a domain-specific
language inspired by AOP concepts [7] and JavaScript semantics and
constructs. LARA uses declarative semantics to allow programmers to
specify strategies for actions over application source code and/or
compiler IRs (e.g., instrument code, extract information, explore
transformations, apply compiler optimizations). LARA aspects are
applied by a target-language-dependent weaver, such as the weaver
for the MATLAB language integrated in MATISSE.
Figure 1 presents the overall flow of MATISSE. The input
MATLAB files are parsed and translated to an abstract syntax tree
(AST) based MATLAB IR. The IR is the input to MWeaver,
which uses LARA aspects to modify and add information (e.g.,
variable types and shapes) to the MATLAB IR. MATISSE is
being developed so that new code generators are easy to integrate.
At the moment, code generators for MATLAB and C are fully working
and an OpenCL code generator is under development.
Figure 1. Overview of the MATISSE compiler framework.
MATISSE generates MATLAB code for validation, testing,
monitoring, and specialization, and C code to be used by
third-party design-flows targeting software/hardware systems.
MATISSE is able to generate customized C code for a particular
target without modifying the original MATLAB code. LARA
aspects enable MATISSE to have fine-grained control over the
generation of C code, and also allow the generation of different
implementations from the same source code. A common example
includes the restructuring of source code and the use of statically
declared array variables to be compliant with the requirements of
most hardware compilers.
MATISSE can be used as a source-to-source code transformation
and instrumentation tool allowing developers to quickly
and reliably generate reference C implementations, a key step in
the deployment of embedded system applications. The transformation
stage of the compiler performs weaving actions such as
insertion of code, definitions of types and shapes, and code
specialization based on default values.
3. A High-Level C-IR
Given the differences between MATLAB and C, we decided to
use a C-specific AST-based IR to represent the C code (C-IR).
This simplifies the generation of C code and allows us to separate
concerns related to C generation (e.g., include files, variable
declarations) that are not considered in the MATLAB IR. This
option also keeps the MATLAB IR clean and independent of the
specificities of the code generation to be applied.
3.1 Variable Types
The C-IR uses the VariableType interface to represent all the
information needed about a data type (e.g., code needed to declare
the variable, how to convert to another type, how to perform an
addition between variables of this type). VariableType allows a
seamless integration of several types in a modular way. For
instance, we were able to add support for OpenCL native types
through an implementation of VariableType, without changing the
C-IR library. Figure 2 shows a subset of the hierarchy that starts
with the VariableType interface. To add a new type, one needs to
create a class that implements the VariableType interface. Scalar
represents a single value, and Matrix represents a multi-dimensional
array of elements of type Scalar. These two classes
add contract methods that provide information specific to these
types, such as number of bits, maximum and minimum values or
signed/unsigned, for Scalar, or matrix shape and element type, for
Matrix. CNative represents the native types of C (e.g., int, float, in
the case of Numeric, and, e.g., int32_t, uint8_t in the case of
StdInt).
Figure 2. Subset of the VariableType hierarchy in the C-IR.
3.2 Matrix Types in C-IR
Currently, the C-IR has two Matrix implementations: StaticMatrix,
which uses statically allocated C arrays (e.g., int A[3]), and
DynamicMatrix, which uses a C structure to represent dynamically
allocated arrays of a given element type. For dynamic arrays,
MATISSE offers the option to use, for each array access, either a
function that performs array bounds checking (for debugging) or
inlined code without checks (for performance).
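The two access modes can be illustrated with a minimal C sketch. The structure layout and every name below (dyn_matrix_f, dyn_matrix_f_get_checked, dyn_matrix_f_get) are hypothetical and chosen only for illustration; they are not the identifiers emitted by MATISSE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a dynamically allocated matrix of floats. */
typedef struct {
    float  *data;    /* linear element storage */
    size_t  length;  /* total number of elements */
} dyn_matrix_f;

/* Allocates zero-initialised storage; returns 0 on success, -1 on failure. */
static int dyn_matrix_f_alloc(dyn_matrix_f *m, size_t length) {
    assert(m != NULL && length > 0);
    m->data = calloc(length, sizeof *m->data);
    m->length = (m->data != NULL) ? length : 0;
    return (m->data != NULL) ? 0 : -1;
}

static void dyn_matrix_f_free(dyn_matrix_f *m) {
    free(m->data);
    m->data = NULL;
    m->length = 0;
}

/* Checked access (debugging): returns -1 when the index is out of bounds,
 * otherwise stores the element in *out and returns 0. */
static int dyn_matrix_f_get_checked(const dyn_matrix_f *m, size_t index,
                                    float *out) {
    if (index >= m->length) {
        return -1;
    }
    *out = m->data[index];
    return 0;
}

/* Unchecked access (performance): a candidate for inlining. */
static inline float dyn_matrix_f_get(const dyn_matrix_f *m, size_t index) {
    return m->data[index];
}
```

In this sketch the checked accessor costs one comparison per access, which is the price paid for detecting out-of-bounds indexes during debugging.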
MATISSE uses linearization of multi-dimensional arrays,
whereby an element of the multi-dimensional array is accessed
through a single pointer variable. Linearization has several
benefits. Firstly, simple element-wise operations are compactly
executed in a single loop rather than in a nested loop structure.
Secondly, the single allocation of storage and corresponding
boundary values enables one out-of-bounds check per array access
rather than one verification per array dimension. Lastly, it reduces
the size of the generated code: when multi-dimensional arrays
are generated without linearization, the code may need to manage
several pointers (when dimensions are not known at compilation
time) in order to allocate the space needed for all dimensions.
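As an illustration of linearization, the following C sketch maps a two-dimensional index to a single offset and performs an element-wise addition with one loop. It assumes column-major storage (the order MATLAB itself uses) and zero-based indexes; the paper does not state the layout chosen by MATISSE, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Zero-based linear offset of element (row, col) in a rows x cols matrix,
 * assuming column-major storage. A single range check covers the access. */
static size_t linear_index(size_t rows, size_t cols, size_t row, size_t col) {
    assert(row < rows && col < cols);
    return col * rows + row;
}

/* Element-wise addition over linearized storage: one loop over all
 * elements, independent of the number of dimensions. */
static void add_elementwise(const double *a, const double *b, double *out,
                            size_t numel) {
    for (size_t i = 0; i < numel; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

With a non-linearized layout the same addition would need one nested loop per dimension, and a dynamically sized matrix would need one pointer level per dimension.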
4. Transforming MATLAB to C
MATLAB is a dynamically typed language. This is in stark contrast with C, which is statically typed and needs the types of all
variables to be declared. When converting MATLAB to efficient C
code, it is necessary to statically determine the types used. This
can be a challenge, as the same MATLAB function can have very
different C implementations, depending on the types of its variables. Fortunately, defining the types of the arguments/parameters
of a MATLAB function is often enough to infer the remaining
types of variables in the function. In the case of static arrays,
usually it is also possible to determine the shape of the arrays at
compile time (i.e., how many dimensions the array has, as well as
the size of each dimension).
MATISSE uses an interface that represents all possible C implementations for a given named MATLAB function and selects
one of them based on the received arguments and types. The
specialization occurs at the level of the function call, instead of
the function. This mechanism allows MATISSE to generate multiple versions of highly specialized C functions. For instance, if
there are multiple function calls to a MATLAB function that accepts a matrix type as input, and it receives different StaticMatrix
types (a type whose shape is known at compile time), it is possible
to generate a function specialized to each specific shape.
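A minimal C sketch of what call-site specialization can produce is shown below: one generic function for arguments whose size is only known at run time, and one function per static shape seen at a call. The names sum_dynamic, sum_f1x4 and sum_f1x8 are hypothetical (the suffix style merely imitates the zeros_f1x1024 helper of Figure 7) and the code is not actual MATISSE output.

```c
#include <assert.h>
#include <stddef.h>

/* Generic version: the number of elements is only known at run time. */
static float sum_dynamic(const float *data, size_t numel) {
    assert(data != NULL || numel == 0);
    float acc = 0.0f;
    for (size_t i = 0; i < numel; ++i) {
        acc += data[i];
    }
    return acc;
}

/* Hypothetical specialization for a call that passes a static 1x4 matrix:
 * the trip count is a compile-time constant. */
static float sum_f1x4(const float data[4]) {
    float acc = 0.0f;
    for (int i = 0; i < 4; ++i) {
        acc += data[i];
    }
    return acc;
}

/* Hypothetical specialization for a call that passes a static 1x8 matrix. */
static float sum_f1x8(const float data[8]) {
    float acc = 0.0f;
    for (int i = 0; i < 8; ++i) {
        acc += data[i];
    }
    return acc;
}
```

Constant loop bounds give the C compiler more room for unrolling and vectorization, at the cost of one function body per distinct shape.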
4.1 Type and Shape Inference Analysis
The type inference uses a simple data-flow analysis approach
[11], where type information is derived by processing each
MATLAB statement, complemented with information provided by
LARA aspects. In most cases only the types of function parameters are required for the compiler to achieve an efficient type
inference. There are two general situations where type-inference
is applied in MATISSE: 1) during a function call; and 2) during
assignments. C-IR nodes representing function calls contain
information about the signature of the functions (i.e., their input
and output types), and are specialized according to the inputs of
the call. The output types are usually determined by the object that
creates the function call, and this means that each function can
define its own rules regarding the type-inference of its output
types. For assignments, the variables on the left-hand side are
usually bound to the type inferred for the right-hand side. Note that
when in conflict, types of variables defined in a LARA aspect
always override the inference mechanism. Besides the type, variables
can carry other information, such as values and the shapes of
matrices. This information is propagated and in many cases updated
and extended by information determined in other assignments.
Consider the code in Figure 3. In the first line, the function size
returns an array with the shape of the given variable. If the
variable H is of the type StaticMatrix, it always contains information
about its shape, and the values of h1 and h2 are known at compile
time. If H is a DynamicMatrix, the values of h1 or h2 might not be
known. The function size is supported by MATISSE, which
creates a FunctionCall node in C-IR, specialized to the type of H.
C-IR nodes always carry information about the types they return.
Thus, MATISSE assigns the types defined by the FunctionInstance
of size to h1 and h2 (in this case, both are of type int).
However, if the types of h1 and h2 are defined in a LARA aspect,
MATISSE uses those types. If the shape of H is known at compile
time, the values of h1 and h2 are also known.
In the second line in Figure 3, for operations such as + there is
a default rule that chooses the first maximal fit type between the
operands. The maximal fit is automatically determined using
information obtained from the Scalar interface (i.e., the minimum
and maximum possible values of the type). Although MATISSE
infers types for constants (e.g., 1 is inferred as an integer, 1.0 as a
double), for this rule the types of constants are not taken into
account when inferring the type of the output (unless all operands
are constants). In this case, the result of the addition will have the
same type as h1. The operator / has special inference rules: by
default, the output is assigned a real type, to avoid losing
precision. MATISSE thus supports, on the one hand, general
rules available to all functions, and on the other hand, custom
rules for a particular function. As previously mentioned, if a LARA
aspect defines the type of any of the variables (i.e., H, h1, h2 or
offset1), the inference mechanism for that particular variable is
augmented with that information. For instance, if the type of the
variable offset1 is set to float, the operations on the right-hand side
of the assignment will also consider float as output. This way it is
possible to address the limitations of a static type/shape inference
analysis, and the usual cases where the user needs to force and
evaluate data types not derived by type inference.
[h1,h2] = size(H);
offset1 = (h1+1)/2;
Figure 3. Code snippet from conv2.
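The maximal fit rule described above can be sketched in C as follows. This is one possible reading of the rule, not the MATISSE implementation: an operand type replaces the current choice only when its value range strictly contains it, so the first operand wins ties, and constants are assumed to have been filtered out by the caller. The scalar_type record and all identifiers are hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical descriptor holding the range information that the Scalar
 * contract is described as providing. */
typedef struct {
    const char *c_name;
    double min;
    double max;
    int is_real;
} scalar_type;

static const scalar_type TYPE_UINT8  = {"uint8_t", 0.0, 255.0, 0};
static const scalar_type TYPE_INT32  = {"int32_t", -2147483648.0, 2147483647.0, 0};
static const scalar_type TYPE_DOUBLE = {"double", -DBL_MAX, DBL_MAX, 1};

/* Returns non-zero when every value of b is representable in the range of a. */
static int covers(const scalar_type *a, const scalar_type *b) {
    return a->min <= b->min && a->max >= b->max;
}

/* First maximal fit among the non-constant operand types. */
static const scalar_type *max_fit(const scalar_type *const operands[],
                                  size_t count) {
    assert(operands != NULL && count > 0);
    const scalar_type *best = operands[0];
    for (size_t i = 1; i < count; ++i) {
        if (covers(operands[i], best) && !covers(best, operands[i])) {
            best = operands[i];
        }
    }
    return best;
}

/* Special rule for division: default to a real type to keep precision. */
static const scalar_type *division_type(void) {
    return &TYPE_DOUBLE;
}
```

Applied to Figure 3, (h1+1) has a single non-constant operand of type int, so the sum keeps that type, while the division by 2 yields a real type unless a LARA aspect fixes the type of offset1.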
4.2 C Code Generation Example
We now illustrate the application of the proposed approach to a
MATLAB function implementing an FIR (Finite Impulse Response)
filter (see Figure 4). This function takes as input two
arrays, vector_1d and coef, and outputs an array named output. In
the absence of information about the shape of the input arrays,
MATISSE generates C code that uses the DynamicMatrix type, which
represents a structure with dynamically allocated memory, for the
parameters as well as for the function's return value. The
definition of types and shapes for the function parameters in a
LARA aspect (see Figure 5) enables the use of the StaticMatrix
type.
The MATLAB code in Figure 4 uses the MATLAB built-in
function sum. Currently, MATISSE supports a general version of
sum, by using a description in MATLAB translated to C by
MATISSE. With information about the types of the arguments of
sum, we can apply transformations over the code. Figure 6 shows
a possible transformation, which is applied as follows. The
function sum is called with an expression as argument. By analysing
the MATLAB-IR corresponding to the expression, MATISSE
determines that it is composed only of element-wise operations
(i.e., .*). Furthermore, the operands in the expression return
one-dimensional arrays (coef is used directly and its shape is known,
and vector_1d is accessed linearly, returning a one-dimensional
array). The output of sum is therefore a scalar, and the call can be
replaced with an accumulator variable (sum_acc) and a for loop. If
there were no ranges, we could iterate the loop over the size of any
of the input matrices. As there is a range, we use it to control the
for loop: (i:-1:i-NTAPS+1). If there were any other ranges,
MATISSE would calculate the indexes outside the for loop and would
access them with an induction variable. The matrix coef is used as a
parameter, so the compiler uses an induction variable (matrix_i),
incremented after each loop iteration, to access the elements of
coef. This is an example of a transformation applied with the
current framework, at the MATLAB-IR level. In this case, the
output of the transformation is a modified MATLAB-IR, but we
can perform the same transformation to generate C-IR directly.
This transformation in particular will slow down the code if
executed in MATLAB, but can significantly help to generate efficient
C code (see Figure 7).
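The effect of this transformation on C code can be sketched by comparing two hand-written variants of the inner computation. The first mimics a direct translation, which materializes the element-wise product in a temporary array before summing it; the second is the fused accumulator loop. Both functions, the MAX_TAPS bound and the zero-based indexing are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_TAPS = 64 };  /* illustrative upper bound for the temporary */

/* Direct shape: build x(i:-1:i-ntaps+1) .* coef, then sum the temporary. */
static float tap_sum_with_temporary(const float *x, const float *coef,
                                    size_t i, size_t ntaps) {
    assert(ntaps <= MAX_TAPS && i + 1 >= ntaps);
    float tmp[MAX_TAPS];
    for (size_t k = 0; k < ntaps; ++k) {
        tmp[k] = x[i - k] * coef[k];
    }
    float acc = 0.0f;
    for (size_t k = 0; k < ntaps; ++k) {
        acc += tmp[k];
    }
    return acc;
}

/* Transformed shape: a single loop with an accumulator (sum_acc) and an
 * induction variable (matrix_i) for coef, with no temporary array. */
static float tap_sum_fused(const float *x, const float *coef,
                           size_t i, size_t ntaps) {
    assert(i + 1 >= ntaps);
    float sum_acc = 0.0f;
    size_t matrix_i = 0;
    for (size_t k = 0; k < ntaps; ++k) {
        size_t sum_i = i - k;  /* walks the range i:-1:i-ntaps+1 */
        sum_acc += x[sum_i] * coef[matrix_i];
        matrix_i += 1;
    }
    return sum_acc;
}
```

The fused variant performs the same multiplications and additions in the same order but avoids the temporary storage and the second pass over the data.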
function output=fir_1d(vector_1d, coef)
   NTAPS = numel(coef);
   N = numel(vector_1d);
   output = zeros(1, N);
   for i = NTAPS:1:N
      output(i) = sum(vector_1d(i:-1:i-NTAPS+1) .* coef);
   end
end
Figure 4. MATLAB fir code example.
aspectdef firSingle
   var typeDef = { // Type definition
      vector_1d    : "single[1][1024]",
      coefficients : "single[1][32]",
      output       : "single" };
   // Matrix sizes
   var matrixSizes = {output : "1, N" };
   // Define types
   call defineTypes("fir_1d", typeDef);
   // Define matrix sizes
   call initMatrixes("fir_1d", matrixSizes);
   // Inline all functions MATISSE supports
   call matisseInline("true");
   // Define the matrix implementation as static
   call matisseMatrixImpl("static")
end
Figure 5. LARA aspect that defines the types for the fir example.
function output=fir_1d(vector_1d, coef)
   NTAPS = numel(coef);
   N = numel(vector_1d); output = zeros(1, N);
   for i = NTAPS:1:N
      sum_acc = 0; matrix_i = 1;
      for sum_i = i:-1:i-NTAPS+1
         sum_acc = sum_acc + vector_1d(sum_i) .* coef(matrix_i);
         matrix_i = matrix_i + 1;
      end
      output(i) = sum_acc;
   end
end
Figure 6. MATLAB fir after transformation.
4.3 OpenCL Generation
We are developing an OpenCL generator which uses OpenACC-based
[12] pragmas in the MATLAB code to decide what/how to
parallelize. Sections to parallelize begin with the “parallel loop”
pragma. The “end” pragma indicates the end of the code section
the pragma applies to. Parallel loop sections can be parameterized
with copyin, copyout and/or reduce. Two types of reductions are
supported: sums and products.
We present the OpenCL engine to illustrate the overall flexibility
of the framework, and of the C-IR. In a few months’ work, a
single Master's student was able to develop an OpenCL engine
capable of generating the OpenCL code for sections in a relevant
subset of MATLAB annotated with pragmas, plus the necessary
wrapper classes that perform the communication between the C
and the OpenCL code (both codes are represented using the C-IR).
The OpenCL engine replaces pragma occurrences with a call
to a custom function. It then translates that function to OpenCL.
The OpenCL generation is still in a very early phase and the
currently generated code is not optimized.
float* fir(float vector_1d[1024], float coef[32],
           float output[1024])
{
   int NTAPS; int N; int i;
   float sum_acc; int matrix_i; int sum_i;
   NTAPS = 32; N = 1024;
   zeros_f1x1024(output);
   for(i = NTAPS; i<=N; i = i+1){
      sum_acc = 0.0f; matrix_i = 1;
      for(sum_i = i; sum_i>=(i-NTAPS)+1; sum_i--){
         sum_acc = sum_acc+(vector_1d[sum_i-1] *
                            coef[matrix_i-1]);
         matrix_i = matrix_i+1;
      }
      output[i-1] = sum_acc;
   }
   return output;
}
Figure 7. C code with static matrices for the fir function generated by MATISSE with information about shapes.
5. Experimental Results
We carried out a series of experiments to evaluate the impact of
the information introduced by aspects on the performance of the
generated C code, when executed on an embedded platform. We
also compared the execution time of the generated code against
the original MATLAB code, when using a desktop PC. Finally,
we show some preliminary results obtained by an early version of
the OpenCL generator.
5.1 Methodology
We use MATISSE to automatically derive C code corresponding
to kernels written in MATLAB. We then compare the execution
time after compiling the resulting code to two architectures: 1) A
desktop PC with a 2.93GHz Core 2 Duo processor and Windows
7 32-bit, 3GB of RAM and an nVidia Quadro NVS-290; 2) A
BeagleBoard-XM revB running Ubuntu 12.10 32bit, with a 1GHz
ARM Cortex-A8 and 512MB of RAM.
We use as benchmarks a set of kernels we consider relevant
for embedded systems. We include subband and grid_it, two
critical functions from the 3D Path Planning and the MPEG audio
encoder applications [10]. In addition, we include an application
to perform correlation using FFTs and with 3D matrices as input.
This application named cfd, uses forward and inverse 2D FFTs
provided by a MATLAB function able to perform N-dimensional
FFTs (identified as fft2d), and a dot product between 3D matrices
(identified as cpx). Table 1 contains all the benchmarks used.
For both architectures, the generated C code was compiled
with gcc-4.6, using flag –O2 and the OpenCL code was compiled
with the AMD APP SDK v2.9. Besides the code of the functions,
MATISSE also generates main functions for testing purposes,
when specifying a .M or .MAT file with the values of the input
arguments of the function to test. All benchmarks output correct
results when compared with MATLAB original output, up to an
error of 10E-6.
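A validation step of this kind can be sketched in C as a comparison between the generated code's output and the MATLAB reference values. The sketch assumes an absolute, element-wise error bound; the text does not say whether the reported bound is absolute or relative, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Largest absolute element-wise difference between two result arrays. */
static double max_abs_error(const double *reference, const double *actual,
                            size_t numel) {
    assert(reference != NULL && actual != NULL);
    double worst = 0.0;
    for (size_t i = 0; i < numel; ++i) {
        double diff = reference[i] - actual[i];
        double err = (diff < 0.0) ? -diff : diff;
        if (err > worst) {
            worst = err;
        }
    }
    return worst;
}

/* Returns non-zero when the outputs agree within the given tolerance. */
static int outputs_match(const double *reference, const double *actual,
                         size_t numel, double tolerance) {
    return max_abs_error(reference, actual, numel) <= tolerance;
}
```

A generated test main could load the reference values from the .MAT file, run the translated function and call outputs_match with the chosen tolerance.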
Table 1. Benchmark characteristics.
Benchmark    Input Sizes        MATLAB LoCs
cfd          256 × 256 × 3      50
conv         96 × 11            73
cpx          512 × 512 × 10     23
dilate       2048 × 2048, 2     17
fft2d        256 × 256          124
fir1d        1M × 32            14
grid_it      32 × 64 × 16       38
latnrm       32K × 8            29
subband      128 × 64ki         35
5.2 Results
We consider two options for generating C code: aspects
(minimal/optimized) and function inlining (enabled/disabled). Minimal
aspect refers to the minimum information we have to provide to
generate C code that outputs correct results (sometimes we have
to define the type of the output, or of some intermediate
variables), while optimized is an aspect that has been tailored to
provide specialized code.
Figure 8 shows the impact of each option on performance.
Function inlining provides the highest speedups, on average a
speedup of 1.6× for the tested codes (with speedups up to 2.2×
and 2.6×, for grid_it and conv, respectively). This was expected as
inlining removes the overhead of calling a function and enables
further optimizations. In the case of the inlining of get/set
functions, we are trading array-bound checking for performance.
Specialization achieved a more modest effect, on average a
speedup of 1.2×, with a maximum of 1.8× for subband. When
specializing with aspects, we are replacing certain types with
potentially less expensive types (doubles with floats, floats with
integers), and reducing the overhead of type casting. In
architectures more sensitive to data types a larger impact is expected
(e.g., when double precision is supported in software).
Figure 8. Speedups on the BeagleBoard of the C code when
considering three options (Optimized, Inlined, Optimized + Inlined).
Figure 9 presents the speedups of the C code generated by
MATISSE over the execution of the original MATLAB code when
using MATLAB 2012b, on the Core 2 Duo. Significant speedups
were achieved for most cases, ranging from 5× to 9×. For fft2d we
measured a slowdown (a speedup of 0.6×), which is related to
missed opportunities for optimizing the code of the function. We
also achieved very high speedups. For cpx, for example, the main
reason is the pre-allocation of matrices inserted by the LARA
aspect. If we compare to the execution time in MATLAB after the
code is transformed according to LARA specifications, the speedup
reduces from 105× to 5× (the geometric mean reduces from 19× to
12×). For latnrm, we consider that the MATLAB code does not use
appropriate idiomatic constructs, which slows down the execution in
MATLAB. In the case of fir1d and subband, we are already using
MATLAB built-in functions (e.g., sum), and the impact of
pre-allocation does not completely explain the speedups (fir1d
reduces to 63×; subband increases to 33×). The speedups come from
the transformation presented in Section 4.2. We did an experiment
where we implemented our own MATLAB version of sum, and used it
to generate C code instead of doing the transformation. In this case,
the subband speedup was reduced to 2× (MATISSE could not generate C
code using the custom sum for the other benchmarks where sum is
used, such as fir1d).
Figure 9. Speedups when comparing the execution of MATLAB
code with the C code generated by MATISSE.
Table 2 shows the speedups achieved by the code generated
with the OpenCL engine, when compared with the execution of
the original MATLAB code, running on MATLAB, and the C
version generated by MATISSE without OpenCL directives. From
the set of benchmarks, we chose those whose loops could be
parallelized and are currently supported by the OpenCL generator.
The OpenACC-based pragmas were manually added to the
MATLAB code. The generated OpenCL code ran on the nVidia
Quadro NVS-290 of the desktop PC.
We achieved a speedup of 1.37× on dilate over C-only code,
after increasing the size of the inputs (from a 512×512 image to
1024×1024). For cpx and subband, the OpenCL implementation
could not beat the pure C code currently generated by MATISSE.
These results were expected as our current OpenCL generator is
not able to take full advantage of the target GPUs.
Table 2. Speedup of C+OpenCL compared with execution in
MATLAB and execution of C-only MATISSE on Core 2 Duo.
Benchmark             MATLAB    C-only MATISSE
dilate 1024 × 1024    17.5      1.37
cpx                   18.8      0.19
subband               1.9       0.05
6. Related Work
Given the importance of MATLAB, there have been research efforts
to improve the execution of JIT MATLAB compilers. A
recent example is the compiler presented in [13], which performs
function specialization based on the runtime knowledge of the
types of the arguments of the functions. Given the widespread
use of MATLAB to develop embedded systems and the hardware
constraints of such systems, which preclude the use of a
runtime MATLAB environment, an important aspect is the
automatic translation of MATLAB programs into equivalent C code.
DeRose and Padua developed the FALCON environment [14]
that translates MATLAB to FORTRAN90 code. They leverage an
aggressive use of static and type inference for base types (doubles
and complex) as well as shape (or rank) of the matrices. Other
researchers have explored the reuse of storage for array variables
across a MATLAB code, thus reducing the memory footprint of the
corresponding C reference code [15]. Joisha et al. [16] focused on
type and shape inference techniques. Researchers have also relied
on a mix of type inference approaches and user-provided
information. For instance, [17][18] use annotations to specify data
types and shapes and simple type inference analysis, and target
VHDL code for hardware synthesis onto FPGAs. We specifically
note that the focus of our approach is mostly on embedded
implementations of the MATLAB programs. In this context, an
efficient translation to an implementation language (mainly C) is
needed. One of the possibilities is to consider a subset of
MATLAB allowing feasible and efficient static compilation.
Examples using such a subset are the MATLAB Coder [2] and the
Embedded Coder [3].
The popularity of the MATLAB language is also reflected in
the similar languages that have been proposed. Examples of those
languages are Scilab [19] and Octave [20]. A Scilab to C
translator [21], named Sci2C, has been developed. Sci2C focuses
entirely on embedded systems, and is completely dependent on
annotations embedded in the Scilab code to specify data sizes and
precisions. Our compiler differs from Sci2C in that it is able to
generate C code without polluting the original code. Furthermore,
Sci2C requires that the size of arrays is fixed and statically known,
while our compiler also produces C code when those sizes are
unknown. User-specified rules and strategies for code
transformations have been used to optimize Octave programs [22],
with loop vectorization and partial evaluation of types and values.
In this work we describe a mechanism for conveying information
about types and shape/rank that is similar in spirit to the notion
of Aspects [7]. Previous work has proposed aspect-oriented
extensions to MATLAB and an aspect-oriented code transformation
language for MATLAB [23]. Other authors have explored
aspect-oriented approaches for MATLAB [24], but do not use aspects to
specify complementary information that can be used by compilers
to produce more efficient implementations.
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
7. Conclusion
This paper presented the current status of MATISSE, a compiler
infrastructure for MATLAB. MATISSE relies on LARA aspects
for specifying data types, shapes, and code instrumentation and
specialization, and on the C-IR for type inference and C code
generation. We presented the general flow of the tool, and described the possible transformations that can be applied and the
optimizations performed by the compiler. The experiments reveal
promising performance results, achieving a geometric mean
speedup of 12× over execution in MATLAB when considering 9
benchmarks. Additionally, we described our first steps toward
OpenCL generation from MATLAB. Our OpenCL generator uses
OpenACC-based directives to decide how to parallelize the code
and which MATLAB code sections to map to the accelerator.
Ongoing work is focused on further optimizing the C generator
and on evaluating and optimizing the OpenCL generator.
Acknowledgments
This work was partially supported by Fundação para a Ciência e
a Tecnologia (FCT) under FEDER/ON2 and FCT project
NORTE-07-124-FEDER-000062.
References
[1] MATLAB – the Language of Technical Computing, http://www.mathworks.com/products/matlab
[2] MATLAB Coder: Generate C and C++ code from MATLAB code, © 2012 The MathWorks, Inc.
[3] Embedded Coder: Generate C and C++ code optimized for embedded systems, © 2014 The MathWorks, Inc.
[4] The OpenCL Specification, Version: 1.0, Doc. Rev.: 48, Khronos OpenCL Working Group, Editor: Aaftab Munshi, Last Rev. Date: 10/6/09.
[5] T. S. Czajkowski, et al., “From OpenCL to high-performance hardware on FPGAs,” 22nd Int’l Conf. on Field Progr. Logic and Applications (FPL’12), Oslo, Norway, Aug. 29-31, 2012, pp. 531-534.
[6] J. Bispo, et al., “The MATISSE MATLAB Compiler - A MATrix(MATLAB)-aware compiler InfraStructure for embedded computing SystEms,” in IEEE Int’l Conf. on Industrial Informatics (INDIN’13), Bochum, Germany, 29-31 July 2013, pp. 602-608.
[7] G. Kiczales, et al., “Aspect-Oriented Programming,” in Proc. European Conference on Object-Oriented Programming (ECOOP’97), Springer-Verlag, LNCS 1241, June 1997, pp. 220-242.
[8] J.D. Gradecki, and N. Lesiecki, Mastering AspectJ: Aspect-Oriented Programming in Java, John Wiley & Sons, Inc., NY, USA, 2003.
[9] J.M.P. Cardoso, et al., “LARA: An Aspect-Oriented Programming Language for Embedded Systems,” in Proc. Int. Conf. on Aspect-Oriented Software Development (AOSD’12), Potsdam, Germany, March 25-30, 2012, pp. 179-190.
[10] J.M.P. Cardoso, P. Diniz, J.G. Coutinho, and Z. Petrov (eds.), Compilation and Synthesis for Embedded Reconfigurable Systems, Springer, May 2013.
[11] A. Aho, J. Ullman, M. Lam, and R. Sethi, Compilers: Principles, Techniques and Tools, Addison Wesley, 2006.
[12] The OpenACC™ Application Program Interface, August 2013. Version: 2.0a, © 2011-2013 OpenACC-Standard.org.
[13] M. Chevalier-Boisvert, L. Hendren, and C. Verbrugge, “Optimizing MATLAB through Just-In-Time Specialization,” in Int. Conf. on Compiler Construction (CC’10), March 2010, pp. 46–65.
[14] L. De Rose, and D. Padua, “Techniques for the Translation of MATLAB programs into Fortran 90,” in ACM Trans. Program. Lang. Syst., 21, 2 (Mar. 1999), pp. 286–323.
[15] P. Joisha, and P. Banerjee, “Static array storage optimization in MATLAB,” in Proc. ACM Conf. on Prog. Language Design and Implementation (PLDI’03), June 9-11, 2003, San Diego, CA, USA, pp. 258-268.
[16] P. Joisha, and P. Banerjee, “An algebraic array shape inference system for MATLAB,” in ACM TOPLAS, 2006; 28(5), pp. 848–907.
[17] A. Nayak, M. Haldar, A. Choudhary, and P. Banerjee, “Parallelization of MATLAB Applications for a Multi-FPGA System,” in Proc. 9th IEEE Symp. on Field-Programmable Custom Computing Machines (FCCM’01), Rohnert Park, CA, USA, May, 2001, pp. 1-9.
[18] P. Banerjee, et al., “Automatic Conversion of Floating Point MATLAB Programs,” in Proc. 11th IEEE Symp. on Field-Programmable Custom Computing Machines (FCCM’03), Napa, CA, USA, 2003.
[19] Scilab, http://www.scilab.org/
[20] The Octave Home Page. http://www.gnu.org/software/octave/
[21] Scilab 2 C - Translate Scilab code into C code, http://forge.scilab.org/index.php/p/scilab2c/
[22] K. Olmos, and E. Visser, “Turning dynamic typing into static typing by program specialization in a compiler front-end for Octave,” in Proc. 3rd IEEE Int. Workshop on Source Code Analysis and Manipulation (SCAM’03), 26-27 Sept. 2003, pp. 141-150.
[23] J.M.P. Cardoso, et al., “A Domain-Specific Aspect Language for Transforming MATLAB Programs,” in Domain-Specific Aspect Language Workshop (DSAL’2010), part of AOSD’2010, March 15-19, 2010, Rennes & Saint Malo, France.
[24] T. Aslam, J. Doherty, A. Dubrau, and L. Hendren, “AspectMatlab: An Aspect-Oriented Scientific Programming Language,” in Proc. Aspect Oriented Software Development Conference (AOSD), March 2010, ACM, NY, USA, pp. 181-192.