Over the last couple of months we have been part of some pretty amazing stories about dramatic speedups of customer M-code. For example, one customer went from 400 minutes of run time to 20 seconds, and further optimization then dropped the runtime from 20 seconds to 65 milliseconds! The first step alone is a 1200x speedup, and end to end the improvement over the original code is roughly 370,000x. A runtime difference like that can decide whether a problem gets solved or never gets attempted at all.
When performance is critical to your problem, software platforms like Jacket can help you leverage hardware like GPUs. Beyond that, vectorizing your MATLAB® code can and will make a huge difference in the runtime of your scripts, whether you are running in single-threaded, multi-threaded, CPU cluster, or GPU computing mode. Vectorization takes time and effort, but when performance really matters it is more than likely worth it.
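To make the idea concrete, here is a minimal sketch of what vectorizing a for-loop looks like. The data, the formula, and the variable names (x, y_loop, y_vec) are made up for illustration and do not come from any customer code; the point is only that the loop and the one-line array expression compute the same result.

```matlab
% Illustrative only: hypothetical data and variable names.
n = 1e6;
x = linspace(0, 2*pi, n);

% Loop version: one element is processed per iteration.
y_loop = zeros(1, n);            % preallocate so the array is not grown in the loop
for k = 1:n
    y_loop(k) = sin(x(k))^2 + 0.5 * x(k);
end

% Vectorized version: the same arithmetic applied to the whole array at once,
% using the element-wise operators .^ and .*
y_vec = sin(x).^2 + 0.5 .* x;

% Both versions should agree to within floating-point round-off.
max_diff = max(abs(y_loop - y_vec));
```

Wrapping each version in tic and toc is a simple way to compare the two on your own machine. How much faster the vectorized form runs depends on the MATLAB version, the hardware, and the size of the data, so measure rather than assume.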
MATLAB, parallel computing, and GPU computing all perform best on vectorized code. They all take advantage of the inherent parallelism of the M-language, which is extremely powerful when used wisely. There are numerous sources on the internet for learning about vectorization and finding vectorization examples:
MathWorks – Code Vectorization Guide
http://www.mathworks.com/support/tech-notes/1100/1109.shtml
MATLAB Tutorial at Cyclismo
http://www.cyclismo.org/tutorial/matlab/vector.html
Improving the Speed of MATLAB Calculations – Portland State University
http://web.cecs.pdx.edu/~gerry/MATLAB/programming/performance.html
matlab tips and tricks and …
http://www.ee.columbia.edu/~marios/matlab/matlab_tricks.html
In addition to these generally available resources for vectorizing M-code, AccelerEyes has begun (just begun) to build a library of vectorization examples that we are confident will help Jacket users dramatically improve their code. We will keep adding to this library, so bookmark the wiki page and visit often.
http://wiki.accelereyes.com/wiki/index.php/Code_Vectorization_Examples
AccelerEyes also plans to put together training materials and tutorials to help programmers learn the art of what we’re calling hardware-independent data-parallel programming. The overall idea is that in the past programmers had to write low-level code (assembly, CUDA, etc.) to get performance, whereas today the same gains can be made by writing at a higher level (vectorized code, for instance), with the added plus that coders aren’t tying their applications to a particular piece of hardware. Stay tuned; we will post when these materials become available.
Furthermore, we encourage you to participate in our User Forum and post the segments of your code where you are trying to improve performance – particularly for-loops. Our team, along with other Jacket users, may be able to offer vectorization help that improves not only your CPU performance but also what you get from the GPU with Jacket.
