One of the questions people commonly ask us is:

When will Jacket support LAPACK features such as eigenvalue decomposition, matrix inverse, system solvers, etc.?

The reason this question is so popular is that people recognize that these kinds of problems are well-suited for the GPU and will end up giving great performance boosts for Jacket users. We are looking forward to delivering these functions in Jacket.

Jacket is currently built on top of CUDA. For reasons why we like CUDA, see our previous blog post about OpenCL. While NVIDIA is busy building CUDA from the ground up, we are busy building Jacket from the top (MATLAB) down. NVIDIA is working hard to develop and promote LAPACK libraries built directly on CUDA. So rather than reinvent the wheel, we prefer to work on other things while those lower-level libraries are being built.

Once those libraries exist in CUDA, due to Jacket’s modular design, it will be easy for us to “wrap” them into Jacket to deliver them to you within MATLAB.

By the way, in working with NVIDIA on this project and thanks to feedback from many of you on our forums, we listed the following as the highest priority functions to be included in the initial CUDA LAPACK library:

Tier 1 choices -

- getrf
  - Computes an LU factorization of a general M-by-N matrix A using partial pivoting with row interchanges (http://www.netlib.org/lapack/double/dgetrf.f)

- geqpf
  - Computes a QR factorization of a general M-by-N matrix A using column pivoting (http://www.netlib.org/lapack/double/dgeqpf.f)

- gesvd/gesdd
  - SVD (gesdd uses divide and conquer)
  - GESDD computes the singular value decomposition (SVD) of a real M-by-N matrix A = U * SIGMA * transpose(V)

LU gives direct-solve capability for many linear systems (A*x = b), and it also comes in handy in iterative methods based on operator splitting. QR is the basis for computing eigenvalues, condition numbers, and the SVD. SVD actually belongs in tier 2 but is so widely used that it deserves tier 1 attention.
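To make the "direct solve" point concrete, here is a minimal pure-Python sketch of the idea behind getrf: factor once with partial pivoting, then solve A*x = b with two triangular substitutions. The names `lu_factor` and `lu_solve` are hypothetical and chosen for illustration only; they are not Jacket, CUDA, or LAPACK functions. The sketch also assumes a square, nonsingular matrix, whereas getrf itself accepts a general M-by-N matrix.

```python
def lu_factor(a):
    """Return (lu, piv) such that P*A = L*U, using partial pivoting.

    L (unit lower triangular) and U (upper triangular) are packed into
    one matrix, which is how LAPACK's getrf stores its result.
    Assumes `a` is a square list of lists of floats.
    """
    n = len(a)
    lu = [row[:] for row in a]      # work on a copy
    piv = list(range(n))            # piv[i] = original row now in position i
    for k in range(n):
        # Partial pivoting: pick the row with the largest |entry| in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            piv[k], piv[p] = piv[p], piv[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]    # multiplier, stored in the L part
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, piv


def lu_solve(lu, piv, b):
    """Solve A*x = b given the packed factors from lu_factor."""
    n = len(lu)
    # Apply the row permutation to b, then forward-substitute with L.
    y = [b[piv[i]] for i in range(n)]
    for i in range(n):
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    # Back-substitute with U.
    x = y[:]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x
```

For example, `lu_solve(*lu_factor([[2.0, 1.0, 1.0], [4.0, -6.0, 0.0], [-2.0, 7.0, 2.0]]), [5.0, -2.0, 9.0])` returns approximately `[1.0, 1.0, 2.0]`. The factorization is the expensive step and can be reused for many right-hand sides, which is part of why LU is such a useful building block.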

Tier 2 choices -

- gels/gelsd
  - Computes the least squares solution to an over-determined system of linear equations
  - Computes the minimum norm solution of an under-determined system
  - http://www.netlib.org/lapack/double/dgels.f
  - http://www.netlib.org/lapack/double/dgelsd.f

- gglse
  - Solves the linear equality-constrained least squares (LSE) problem:
  - minimize || c - A*x ||_2 subject to B*x = d, where A is an M-by-N matrix and B is a P-by-N matrix
  - http://www.netlib.org/lapack/double/dgglse.f

- gesv/dsgesv
  - Solves A*x = b (dsgesv factors in single precision and uses iterative refinement to recover double-precision accuracy)
  - Computes the solution to a real system of linear equations A * X = B, where A is an N-by-N matrix and X and B are N-by-NRHS matrices
  - http://www.netlib.org/lapack/double/dgesv.f
  - http://www.netlib.org/lapack/double/dsgesv.f

- ggev
  - Computes the generalized eigenvalues, and left and/or right generalized eigenvectors for a pair of nonsymmetric matrices
  - http://www.netlib.org/lapack/double/dggev.f
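
As a small illustration of the over-determined least squares problem that gels addresses, the sketch below solves minimize || b - A*x ||_2 through a QR factorization. Note the substitution: LAPACK's gels uses Householder-based QR, while this sketch uses modified Gram-Schmidt purely because it is shorter to write. The function name `lstsq_qr` is hypothetical, and the code assumes A has more rows than columns and full column rank.

```python
import math


def lstsq_qr(a, b):
    """Least squares solution of A*x ~= b via a thin QR factorization.

    Uses modified Gram-Schmidt (not the Householder QR used by LAPACK).
    Assumes `a` is an M-by-N list of lists with M >= N and full column rank.
    """
    m, n = len(a), len(a[0])
    # Store the columns of A as vectors; they are overwritten with Q's columns.
    q = [[a[i][j] for i in range(m)] for j in range(n)]
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        r[j][j] = math.sqrt(sum(v * v for v in q[j]))
        if r[j][j] == 0.0:
            raise ValueError("columns are linearly dependent")
        q[j] = [v / r[j][j] for v in q[j]]
        # Remove the component along q[j] from every remaining column.
        for k in range(j + 1, n):
            r[j][k] = sum(u * v for u, v in zip(q[j], q[k]))
            q[k] = [v - r[j][k] * u for u, v in zip(q[j], q[k])]
    # Solve R*x = Q^T * b by back substitution.
    x = [sum(u * v for u, v in zip(q[j], b)) for j in range(n)]
    for i in reversed(range(n)):
        for k in range(i + 1, n):
            x[i] -= r[i][k] * x[k]
        x[i] /= r[i][i]
    return x
```

A typical use is fitting a line y = c0 + c1*t to data: with rows `[1, t]` for t = 0, 1, 2 and observations `[0, 1, 1]`, `lstsq_qr([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], [0.0, 1.0, 1.0])` returns approximately `[1/6, 0.5]`.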

Hopefully this helps clarify the situation. If you’d like to contribute your input on CUDA LAPACK, the NVIDIA forums are a good place to voice your thoughts.
