Streaming data to the GPU

by James Malcolm on February 1, 2010

in CUDA

We just got off the phone with a finance firm that wants to use Jacket to process market data as it streams in. Essentially, they want to avoid loading the data into MATLAB memory before sending it out to the GPU. We’ve had similar requests from defense companies streaming radar and other signals, from healthcare firms doing medical image processing, from companies doing video processing, and more.

Probably the best way is to use the Jacket SDK to push data directly out to the GPU. With the SDK, you can get access to both the host-side and device-side memory buffers within Jacket. With these pointers in hand, you can copy your data straight from disk or a socket and avoid an intermediate MATLAB representation.
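
For the socket case, one thing to watch is that a single read() on a socket or pipe may hand back fewer bytes than you asked for. Here is a minimal sketch of a loop that keeps reading until the destination buffer is full. The helper name read_full is hypothetical and not part of the Jacket SDK; it only assumes POSIX read(), and the destination would be whatever host-side pointer you obtained from the SDK.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not part of the Jacket SDK): read exactly `nbytes`
 * from descriptor `fd` into `dst`, retrying on short reads and EINTR.
 * Returns 0 on success, -1 on an I/O error or if the stream ends early. */
int read_full(int fd, void *dst, size_t nbytes)
{
    char *p = dst;

    while (nbytes > 0) {
        ssize_t n = read(fd, p, nbytes);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: just retry */
            return -1;      /* real I/O error */
        }
        if (n == 0)
            return -1;      /* end of stream before the buffer was full */
        p += n;
        nbytes -= (size_t)n;
    }
    return 0;
}
```

To fill a buffer of numel floats you would call read_full(fd, buffer, numel * sizeof(float)) and treat a nonzero return as a failed read.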

Below I’ve pasted the complete source code for a little function that pulls in several floating-point values from a file (in this case /dev/random) and stores that data directly into the internal Jacket host-side (CPU) buffer. This function returns a new Jacket GPU variable that can be used in subsequent computations. From inside MATLAB, our new function ‘my_read’ can be called like this:

>> A = my_read;  % pull in some random values
>> A + 1         % perform some computation

Keep in mind that the data it actually reads here is random garbage.

This approach could similarly be extended to read from sockets, pipes, memory-mapped hardware, and so on. You could also set it up to take parameters indicating where to start reading, how much to read, and the like.
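
As a rough sketch of that parameterized version, the file-reading part could be pulled out into a small helper that takes the path, the starting byte offset, and the number of values. The name read_floats_at and its signature are hypothetical, made up for illustration; it uses only standard C I/O and writes into whatever buffer you hand it, such as the Jacket host-side pointer.

```c
#include <stdio.h>

/* Hypothetical helper: read `numel` single-precision values from `path`,
 * starting `offset` bytes into the file, and store them in `dst`.
 * Returns 0 on success, -1 if the file cannot be opened, the seek fails,
 * or fewer than `numel` values are available. */
int read_floats_at(const char *path, long offset, size_t numel, float *dst)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    /* seek to the absolute offset, then read directly into the buffer */
    int ok = fseek(f, offset, SEEK_SET) == 0 &&
             fread(dst, sizeof(float), numel, f) == numel;

    fclose(f);  /* closed on both the success and failure paths */
    return ok ? 0 : -1;
}
```

Inside the SDK function, the hard-coded filename, offset, and numel would then come from the function’s input arguments instead, and dst would be the host-side buffer.
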
-James Malcolm

Download the source and Makefile.

#include <jacket.h>
#include <stdio.h>

err_t jktFunction(int nlhs, mxArray *plhs[], int nrhs, mxArray *prhs[])
{
    const char *filename = "/dev/random"; /* file to read from */
    long offset = 42;                     /* starting position to read, in bytes */
    size_t numel = 10;                    /* how many values to read */

    /* note: assume single-precision */
    mxArray *m = plhs[0] = jkt_new(1, numel, mxSINGLE_CLASS, mxREAL);

    /* get host-side pointer (before opening the file, so that an early
       return out of TRY cannot leak the FILE handle) */
    void *h_data;
    TRY(jkt_mem_host(&h_data, m));

    /* open the file and move to the absolute starting position */
    FILE *f = fopen(filename, "rb");
    if (f == NULL)
        return err("couldn't fopen");
    if (fseek(f, offset, SEEK_SET) != 0) {
        fclose(f);
        return err("couldn't fseek");
    }

    /* copy from disk directly into host-side buffer */
    size_t nread = fread(h_data, sizeof(float), numel, f);
    fclose(f);
    if (nread != numel)
        return err("problem reading");

    return errNone;
}

Here’s a Makefile to build it on Linux/Mac.

# 1) wherever you have CUDA
CUDADIR = /usr/local/cuda
# 2) wherever you have Matlab
MATLAB = /opt/matlab
# 3) wherever you have Jacket installed
JKT = /usr/local/jacket
# 4) uncomment if 64-bit
#  OS = 64

MEXT = $(shell $(MATLAB)/bin/mexext)
CUDAINC = -I$(CUDADIR)/include
CUDALIB = -L$(CUDADIR)/lib$(OS) -lcudart -Wl,-rpath,$(CUDADIR)/lib$(OS)
JKTLIB = -L$(JKT)/engine -ljacket$(OS)
JKTINC = -I$(JKT)/engine/include/
NVMEX = $(JKT)/sdk/common/nvmex -f $(JKT)/sdk/common/nvopts.sh

all: my_read.$(MEXT)

%.$(MEXT) : %.cu
	env MATLAB=$(MATLAB) $(NVMEX) $^ $(JKTLIB) $(JKTINC) \
	                             $(CUDAINC) $(CUDALIB) \
	                             -cxx -largeArrayDims
