Tutorial: Use CUDA and C++11 Code in MATLAB

As it turns out, incorporating CUDA code in MATLAB can be easily done! 🙂

MATLAB provides functionality for loading arbitrary dynamic libraries and invoking their functions, which makes it especially easy to call C/C++ code from a MATLAB program. This is done through so-called MEX functions.


MEX functions are created with the mex command in MATLAB. Essentially, mex takes a C/C++ source file as input, invokes the default C/C++ compiler installed on the operating system (GCC or CL), and creates a MEX binary (a .mexa64 file on 64-bit Linux) that can be used like any other MATLAB function.

The C/C++ file that is passed to mex must have the following included in it:

#include "mex.h" // The mex header containing the necessary interop definitions

/*
 * The "gateway" function which is the entry point for the MATLAB
 * function call (will be executed when the mex file is invoked in
 * MATLAB).
 */
void mexFunction(int nlhs, mxArray *plhs[],
                 int nrhs, const mxArray *prhs[])
{
    // Read the inputs from prhs, do the work, and write the outputs to plhs.
}


The arguments passed from MATLAB are accessible through the prhs parameter, an array of pointers to the right-hand-side (input) arguments. Any output that the gateway function generates is returned through the plhs parameter, an array of pointers to the left-hand-side (output) arguments. The number of arguments passed to the gateway function is stored in nrhs, and the number of outputs that the MATLAB code expects from it is stored in nlhs. From this point on, I refer to the file containing the above code as the mex gateway file, and to the mexFunction above as the gateway function.

Compilation and Building:

There is a short tutorial on the MathWorks website on how to use CUDA inside a mex function, but I find it lacking, as it mostly works as an ad-hoc solution. Complicated, custom-built CUDA code with lots of dependencies is very difficult to adapt to the method suggested by MathWorks. Furthermore, as of now (MATLAB R2014a), mex is incompatible with CUDA toolkits v5.5 and above, so the provided solution will not work at all with those versions.

Luckily, there’s an easy way around that. While searching the net, I came across this question on StackOverflow. The idea is to compile the existing CUDA code into a dynamic library (*.dll or *.so), add the compiled library as a dependency of the mex gateway file, and finally let the mex command handle the final compilation and linking. In other words, the resulting mex function simply invokes the gateway function, which in turn invokes an entry function in the dynamic library and passes it the information it requires. When that entry function has done its job, it passes its results back to the gateway function, which copies them into the plhs parameter and hands them back to MATLAB!

I assume you already know how to compile a CUDA project into a dynamic library. Note that you also need to create a header file for the entry points in the compiled library so that the header can be included in the mex gateway file.
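
As a rough illustration, such a header might look like the sketch below. The names myLib.h and myLibScale are hypothetical and only serve as an example. The body shown here is a plain CPU stand-in so that the sketch compiles without CUDA; in a real library the definition would sit in a .cu file and launch a kernel. Declaring the entry point extern "C" is one way to keep its symbol name unmangled, which tends to make linking from the gateway file simpler.

```cpp
// myLib.h (hypothetical name): declarations for the library's entry points.
#ifndef MYLIB_H
#define MYLIB_H

#include <cstddef>

// Hypothetical entry point: writes in[i] * factor to out[i] for i in [0, n).
// extern "C" keeps the exported symbol name unmangled.
extern "C" void myLibScale(const float* in, float* out,
                           std::size_t n, float factor);

#endif // MYLIB_H

// CPU stand-in for the definition (assumption: the real version would copy
// the data to the GPU, launch a kernel, and copy the result back).
extern "C" void myLibScale(const float* in, float* out,
                           std::size_t n, float factor)
{
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = in[i] * factor;
    }
}
```

The gateway file would then include this header and call myLibScale with the pointers it obtains from prhs and plhs.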

Let libmyLib.so be the name of the compiled library. Also let myGatewayFile.cu be the mex gateway file that has the above content. You then need to navigate to the directory containing these files in MATLAB and invoke the following command:

!nvcc -O0 -std=c++11 -c myGatewayFile.cu -Xcompiler -fPIC -I/usr/local/MATLAB/R2014a/extern/include -I/usr/local/MATLAB/R2014a/toolbox/distcomp/gpu/extern/include -L./ -lmyLib;
mex -g -largeArrayDims myGatewayFile.o -L/usr/local/cuda/lib64 -L/usr/local/MATLAB/R2014a/bin/glnxa64 -lcudart -lcufft -lmwgpu -L./ -lmyLib -lstdc++

NOTE: “!” in MATLAB means that whatever command comes afterwards is sent to the OS command line.

In the above code, we simply invoke the NVIDIA CUDA compiler driver (nvcc) and ask it to create an object file for the mex gateway file “myGatewayFile.cu”. Note that libmyLib.so is also specified as a dependency using the -l switch (that’s a lowercase L); because -c only compiles, the actual linking against it happens in the next step. Then, we ask mex to link the object file with all of its dependencies and create a mexa64 file. Any additional libraries that the gateway function requires can be specified here.

The result of this process is a mex function that can be invoked just like any other MATLAB function. Note that the above commands permit C++11 features and syntax in all of the files. In fact, I developed a CUDA library using CUDA C and C++11 and built a mex file off that, and everything is working perfectly! 🙂 Also, optimization flags (-O3 or -O0) can be passed here as well.

Other Tips about the MEX Gateway Function:

Here, I’ll provide some handy tips about some MEX API functions that are useful in the gateway function.

  • Print a message (aka printf):
    mexPrintf("This is message number %d", intNumber);


  • Throw general errors (e.g. when arguments are mismatched):
    mexErrMsgTxt("Input should be only a single matrix \n");


  • Get the type of an input argument:
    if (mxGetClassID(prhs[0]) != mxSINGLE_CLASS)
        mexErrMsgTxt("Input image must be of type SINGLE\n");


  • Get the size of each dimension of an input argument:
    const mwSize *dims = mxGetDimensions(prhs[0]);


  • Get the number of dimensions of an input argument:
    mwSize ndim = mxGetNumberOfDimensions(prhs[0]);


  • Get a scalar value from input parameters:
    int filterSize = (int)mxGetScalar(prhs[1]); // mxGetScalar returns a double


  • Get the data of an input argument (MATLAB passes the data in column-major format!):
    float* image = (float*)mxGetData(prhs[0]);


  • Create a 2D output array in MATLAB:
    plhs[0] = mxCreateNumericMatrix(1, length, mxSINGLE_CLASS, mxREAL); // Creates a 1 x length matrix of type float (single in MATLAB)


  • Create n-D output array in MATLAB:
    plhs[0] = mxCreateNumericArray(ndim, dims, mxSINGLE_CLASS, mxREAL);
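
Because the buffer returned by mxGetData is column-major, the index arithmetic is easy to get wrong when the library on the other side expects row-major data. Below is a minimal sketch in plain C++ (no MEX headers needed) of the index computation and a conversion. The helper names colMajorIndex and colMajorToRowMajor are made up for this example and are not part of the MEX API. For a 2D input, rows and cols would correspond to dims[0] and dims[1] as returned by mxGetDimensions.

```cpp
#include <cstddef>

// Linear offset of element (row, col) in a column-major matrix with `rows`
// rows: consecutive elements of one column are adjacent in memory.
inline std::size_t colMajorIndex(std::size_t row, std::size_t col,
                                 std::size_t rows)
{
    return col * rows + row;
}

// Copies a column-major rows x cols matrix into a row-major buffer of the
// same size (hypothetical helper; src and dst must not overlap).
void colMajorToRowMajor(const float* src, float* dst,
                        std::size_t rows, std::size_t cols)
{
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            dst[r * cols + c] = src[colMajorIndex(r, c, rows)];
        }
    }
}
```

For example, the MATLAB matrix [1 2 3; 4 5 6] arrives in memory as 1, 4, 2, 5, 3, 6, and the conversion above rearranges it to 1, 2, 3, 4, 5, 6.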


Finding all of this information took a considerable amount of time. I hope you enjoy the tips here and find them useful! 😉
