MATLAB® parallel for loops (parfor) allow the body of a for loop to be executed across multiple workers simultaneously, subject to some significant restrictions. With Jacket MGL, Jacket can be used within parfor loops under the same restrictions. Note, however, that Jacket MGL does not currently support co-distributed arrays.
Problem size might be the single most important consideration in parallelization using the Parallel Computing Toolbox (PCT) and Jacket MGL. When data is used by a worker in the MATLAB pool, it must be copied from MATLAB to the worker and copied back when the computation is complete. Additionally, when GPU data is used, the worker must then copy it to the GPU and back. With small, simple problems, parallelization may not improve performance because of the overhead of this data movement. Unfortunately, there is no simple answer to the question of “What size problem justifies multi-GPU parallelization?” We have found that experimentation provides the best answer.
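One way to run such an experiment is sketched below. It is only an illustration: the problem sizes and the row-wise workload are placeholders, the arrays are ordinary CPU arrays rather than GPU data, and the variable names (sizes, tSerial, tParallel) are made up for this example. It times the same computation as a serial for loop and as a parfor loop for several sizes.

```matlab
% Hypothetical experiment: time a serial loop against a parfor loop
% for several problem sizes. Sizes and workload are placeholders.
sizes = [64 256 1024];
tSerial = zeros(size(sizes));
tParallel = zeros(size(sizes));

for k = 1:numel(sizes)
    n = sizes(k);
    A = rand(n);

    % Serial baseline: sum of squares of each row.
    r1 = zeros(n, 1);
    tic;
    for i = 1:n
        r1(i) = sum(A(i,:).^2);
    end
    tSerial(k) = toc;

    % Same work in a parfor loop; A is accessed one row per iteration.
    r2 = zeros(n, 1);
    tic;
    parfor i = 1:n
        r2(i) = sum(A(i,:).^2);
    end
    tParallel(k) = toc;

    % Both loops should produce the same values.
    assert(max(abs(r1 - r2)) < 1e-12);
    fprintf('n = %4d: serial %.4f s, parfor %.4f s\n', n, tSerial(k), tParallel(k));
end
```

Comparing the two timing columns as n grows gives a rough idea of where the data-movement overhead stops dominating for a given workload. It may be worth running the parfor loop once before timing so that any one-time startup cost is not counted.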
One way to reduce the overhead of parallelization is to minimize the amount of data being copied. This can be done through array slicing. When only one dimension of an array is referenced with a variable, the array can be “sliced” so that only the relevant data is copied. For example, a 128×128 array has 16,384 entries. If a worker references both dimensions with variables, the entire array must be copied. When only a single dimension is referenced with a variable, MATLAB needs to copy only the elements that the worker could possibly refer to. In this example, only 128 elements would need to be copied instead of 16,384.
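As a minimal sketch of the 128×128 case, assuming an ordinary CPU array and illustrative variable names (A, rowSums), the loop below references A with the loop variable in the row dimension only, so each iteration needs just one 128-element row:

```matlab
A = magic(128);            % 128 x 128 = 16,384 elements
rowSums = zeros(128, 1);

parfor i = 1:128
    % Only the first dimension of A is referenced with a variable (the
    % loop variable i), so A can be sliced: each iteration needs just
    % its own 128-element row rather than the whole array.
    rowSums(i) = sum(A(i,:));
end
```

Because every row of a magic square sums to the same constant, the result is easy to check by hand.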
for/gfor/parfor in parfor
It’s possible to use the various types of for loops embedded inside a parfor loop, but they cannot appear directly in the body of the loop. Instead, they must be inside a function called by the loop. Here is a simple example that does not work:
parfor i=1:8
  for j=1:8
    m(i,j) = i*j;
  end
end
The above example fails because of transparency. Other than the iterator variable of the parfor, MATLAB must be able to evaluate indices at the time the code is read, not at the time the code is executed. We can fix this example by changing it to:
function m = f(n, i)
  % Fill one row so that m(j) = i*j for j = 1..8
  m = n;
  gfor j=1:8
    m(j) = i*j;
  gend
end

m = gones(8);
parfor i=1:8
  m(i,:) = f(m(i,:), i);
end
parfor in a gfor
It is not possible to use a parfor loop or an spmd command inside a gfor loop.
Inside of a parallel loop, MATLAB attempts to classify each variable. The different classes are:
- Temporary variable: a variable created inside a single iteration whose lifespan is only that iteration
- Reduction variable: a variable which accumulates its value across all iterations, but whose value may be computed regardless of the order of execution of the iterations
- Broadcast variable: a variable declared outside of the loop and referenced in the loop, but whose value is not changed within the loop
- Sliced variable: a variable whose elements are operated on by independent iterations of the loop
- Loop variable: the iterator of the loop
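The classes above can be seen together in one small loop. This is an illustrative sketch with made-up variable names (scale, data, out, total, t), not code from Jacket or The MathWorks; the comments mark how each variable would be expected to be classified.

```matlab
scale = 2;                 % broadcast: defined before the loop, only read inside it
data = (1:10)';            % sliced (input): iteration i reads only data(i)
out = zeros(10, 1);        % sliced (output): iteration i writes only out(i)
total = 0;                 % reduction: accumulated across iterations

parfor i = 1:10            % i is the loop variable
    t = data(i) * scale;   % temporary: created and used within one iteration
    out(i) = t;
    total = total + t;     % addition does not depend on iteration order
end
```

After the loop, out holds the scaled values and total holds their sum, while t is not available outside the loop.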
If a variable cannot be classified into one of these categories, the loop will not be able to execute. Further restrictions on the ways variables may be classified and used are explained in the Advanced Parfor Topics on The MathWorks web site at:
Understanding Some of the Restrictions
Things that are not compatible with parallelization of for loops:
- Loops may not contain spmd statements.
- Loops may not contain global or persistent variable declarations.
- Loops may not contain break or return statements.
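As an illustration of working within the break restriction, one possible approach (an assumption for this sketch, not a recommendation from the documentation) is to let every iteration record its own result and then locate the stopping point after the loop. The variable names here are hypothetical.

```matlab
values = [3 8 -1 5 -7 2];
isNegative = false(size(values));

% A serial loop could "break" at the first negative value. In a parfor
% loop every iteration runs, so each one records its own result instead.
parfor i = 1:numel(values)
    isNegative(i) = values(i) < 0;
end

% Index of the first negative entry, found after the loop completes.
firstNegative = find(isNegative, 1);
```

This does more work than an early exit would, so it only makes sense when each iteration is cheap or when the parallel speedup outweighs the extra iterations.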
For more information about the MATLAB-imposed restrictions on parallel for loops, visit: