<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>GPU MATLAB Computing</title>
	<atom:link href="http://blog.accelereyes.com/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.accelereyes.com/blog</link>
	<description>Fast MATLAB Code using GPUs with Jacket by AccelerEyes</description>
	<lastBuildDate>Tue, 07 Sep 2010 18:48:18 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Torben’s Corner &#8211; A GPU Computing Gem for Jacket Programmers!</title>
		<link>http://blog.accelereyes.com/blog/2010/09/07/torbens_corner_gem/</link>
		<comments>http://blog.accelereyes.com/blog/2010/09/07/torbens_corner_gem/#comments</comments>
		<pubDate>Tue, 07 Sep 2010 06:00:22 +0000</pubDate>
		<dc:creator>dgibson</dc:creator>
				<category><![CDATA[Benchmarks]]></category>
		<category><![CDATA[MATLAB]]></category>
		<category><![CDATA[benchmarks]]></category>
		<category><![CDATA[documentation]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[Jacket]]></category>
		<category><![CDATA[matlab]]></category>

		<guid isPermaLink="false">http://blog.accelereyes.com/blog/?p=732</guid>
		<description><![CDATA[In January, we introduced you to Torben’s Corner – a resource wiki created and maintained by Jacket programming guru Torben Larsen at Aalborg University in Denmark. Many Jacket programmers have gained valuable insights from Torben’s Corner, including GPU performance charts, coding guidelines, and special tricks. Since January, many wonderful additions have been made to Torben’s Corner. [...]]]></description>
			<content:encoded><![CDATA[<p>In January, we <a href="../2010/01/19/torbens_corner/">introduced</a> you to <a href="http://wiki.accelereyes.com/wiki/index.php/Torben%27s_Corner" target="_blank">Torben’s Corner</a> – a resource wiki created and maintained by Jacket programming guru Torben Larsen at Aalborg University in Denmark. Many Jacket programmers have gained valuable insights from Torben’s Corner, including GPU performance charts, coding guidelines, and special tricks.</p>
<p>Since January, many wonderful additions have been made to Torben’s Corner. We think you will find value not only in this new information but in the entire resource. Here is a quick summary of the most recent additions, with links to the information:</p>
<h3>Benchmarking Update</h3>
<p>Torben’s Corner maintains a long list of benchmarks specifically detailing speedups of Jacket relative to standard MATLAB. This became an enormous task due to the sheer number of functions supported by Jacket and MATLAB.</p>
<p>Therefore, a new benchmarking library has been designed. This library currently consists of 30 benchmarked functions, and more will be added over time. These benchmarks measure the speedup of Jacket versus CPU-based MATLAB for matrix and vector inputs in single and double precision. All results are gathered in three automatically generated outputs:</p>
<ol>
<li>a .mat file with all the measured data,</li>
<li>a .wki file, which is a wiki table ready to be inserted into a wiki page, and</li>
<li>a LaTeX output with everything needed to produce a beautiful LaTeX table at the press of a button.</li>
</ol>
<p>The library is freely available for anyone to use – with the standard requirements for open access software. You can read more and <a href="http://wiki.accelereyes.com/wiki/index.php/Torben%27s_Corner:_Jacket_Benchmark_Tables" target="_blank">download the benchmark</a>. The full set of benchmarks takes 1.5-2 hours to complete.</p>
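<p>The speedup figure these benchmarks report is simply the CPU time divided by the accelerated time. The following is a minimal sketch of that measurement in Python, not the benchmark library itself: the function names are hypothetical, and the "fast" variant is a closed-form stand-in for a Jacket run rather than GPU code.</p>

```python
import timeit


def mean_best_time(fn, number=10, repeats=5):
    # Best-of-N timing, averaged over `number` calls, in seconds per call.
    return min(timeit.repeat(fn, number=number, repeat=repeats)) / number


def speedup(baseline_seconds, accelerated_seconds):
    # Speedup is the baseline time divided by the accelerated time.
    return baseline_seconds / accelerated_seconds


def cpu_sum_of_squares(n=20000):
    # Hypothetical stand-in for the CPU-based MATLAB version.
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_sum_of_squares(n=20000):
    # Hypothetical stand-in for the Jacket version (closed form, not GPU code).
    return (n - 1) * n * (2 * n - 1) // 6


t_cpu = mean_best_time(cpu_sum_of_squares)
t_fast = mean_best_time(fast_sum_of_squares)
print("speedup: %.1fx" % speedup(t_cpu, t_fast))
```

<p>Taking the best of several repeats reduces the influence of background load on the reported ratio.</p>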
<h3>Installing on Ubuntu</h3>
<p>A <a href="http://wiki.accelereyes.com/wiki/index.php/Installation_of_Ubuntu/CUDA/MATLAB/Jacket" target="_blank">guide</a> on how to install Jacket on Ubuntu, including installation instructions for Ubuntu, GPU Drivers, MATLAB, Jacket, and Dropbox.</p>
<h3>Running on a plain MATLAB install</h3>
<p>A <a href="http://wiki.accelereyes.com/wiki/index.php/Making_Jacket_Code_Run_On_Plain_MATLAB_Installations" target="_blank">procedure and library of functions</a> enabling Jacket programmers to write GPU code which will run with a plain CPU-only MATLAB installation. This is very important when programmers want to distribute their toolboxes, etc. Without a GPU the code will of course run slower, but with this procedure it will still run.</p>
<h3>CPU settings that affect performance</h3>
<p>A <a href="http://wiki.accelereyes.com/wiki/index.php/Influence_of_The_Performance_Setting_Of_The_CPU" target="_blank">small post</a> on how the CPU performance settings affect the floating point performance of the CPU and Jacket.</p>
<h3>Jacket performance study</h3>
<p>A <a href="http://wiki.accelereyes.com/wiki/index.php/Jacket_Floating_Point_Performance_%28GFlops%29" target="_blank">detailed study</a> on the floating point performance of Jacket. This is ongoing work and new results are continually being added.</p>
<hr />
<p>Thanks to Torben and his team for these contributions. We hope all Jacket programmers can leverage this most recent work and the resource as a whole.</p>



]]></content:encoded>
			<wfw:commentRss>http://blog.accelereyes.com/blog/2010/09/07/torbens_corner_gem/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>GPU Giddy &#8211; Excitement Building for GTC</title>
		<link>http://blog.accelereyes.com/blog/2010/09/02/gpu_giddy/</link>
		<comments>http://blog.accelereyes.com/blog/2010/09/02/gpu_giddy/#comments</comments>
		<pubDate>Thu, 02 Sep 2010 07:05:06 +0000</pubDate>
		<dc:creator>melonakos</dc:creator>
				<category><![CDATA[CUDA]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[MATLAB]]></category>
		<category><![CDATA[Parallel computing]]></category>
		<category><![CDATA[cuda]]></category>
		<category><![CDATA[GTC]]></category>
		<category><![CDATA[Jacket]]></category>
		<category><![CDATA[matlab]]></category>
		<category><![CDATA[NVIDIA]]></category>

		<guid isPermaLink="false">http://blog.accelereyes.com/blog/?p=621</guid>
		<description><![CDATA[GTC is coming up… The GPU Technology Conference (GTC) starts later this month and is sure to generate a new level of excitement and energy around GPU computing.  The conference includes over 250 technology sessions presented by industry, government, and academic technology leaders.  AccelerEyes is pleased to be well represented at this year’s conference by [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><a href="http://blog.accelereyes.com/blog/wp-content/uploads/2010/09/GTC_vertical-right_376_small.jpg"><img class="alignright size-full wp-image-635" title="GTC_vertical-right_376_small" src="http://blog.accelereyes.com/blog/wp-content/uploads/2010/09/GTC_vertical-right_376_small.jpg" alt="" width="150" height="106" /></a>GTC is coming up…</p>
<p>The <a href="http://www.nvidia.com/object/gpu_technology_conference.html" target="_blank">GPU Technology Conference</a> (GTC) starts later this month and is sure to generate a new level of excitement and energy around GPU computing.  The conference includes over 250 technology sessions presented by industry, government, and academic technology leaders.  AccelerEyes is pleased to be well represented at this year’s conference by our technical leadership and a number of our customers.  If you plan to attend the conference be sure to include the sessions outlined below on your agenda.</p>
<p>In addition to being well represented, we are also flattered to see that others in the market have recognized that GPU Computing with MATLAB delivers clear productivity gains and that the performance improvements made possible by GPUs are a reality today. Most notably, The MathWorks will share its vision and capabilities for GPU Computing with MATLAB during the conference, which should increase the visibility and demand for the technology worldwide. We encourage everyone to attend the session to learn about their new offering.</p>
<p>AccelerEyes will be demoing Jacket at Table #56, and we hope that you will stop by to see the latest and greatest Jacket technology during the conference.</p>
<h2>Jacketized GTC Sessions</h2>
<h3><a href="https://nvidia.confreg.com/gputechconference/schedule/by-session/4c45de36165562e54d00005a"><strong>2132 &#8211; Accelerating Biologically Inspired Computer Vision Models</strong></a></h3>
<p>Join us for a discussion on applying commodity-server-based clusters and GPU-based clusters to simulating computer vision algorithms at a scale that approaches that of biological vision. We consider the limitations of each technology, survey approaches taken thus far, and suggest new hybrid models and programming frameworks to overcome current limitations and substantially improve performance.</p>
<table border="0" cellpadding="0">
<tbody>
<tr>
<td>Speaker:</td>
<td>Tom Dean, Google Inc.</td>
</tr>
<tr>
<td>Topic:</td>
<td>Computer Vision, Machine Learning &amp; Artificial Intelligence</td>
</tr>
<tr>
<td>Time:</td>
<td>Tuesday, September 21st, 11:00 &#8211; 11:50</td>
</tr>
</tbody>
</table>
<h3><a href="https://nvidia.confreg.com/gputechconference/schedule/by-session/4c5747bb1655628c7700003b"><strong>2268 &#8211; Think Data-Parallel! Building Data-Parallel Code with M</strong></a></h3>
<p>Discover and leverage parallelism inherent in pre-existing codes. Oftentimes, parallelism is hidden in seemingly serial programs. This is due to obfuscation via indexing or looping, wherein the parallelism is seemingly non-existent. Several real-world examples of seemingly serial code demonstrate simple, yet surprisingly effective rules for detecting potential parallelism.</p>
<p>For each example, learn how to express the code at a higher, more concise level in M by vectorizing computations. We give several canned techniques of vectorization for many common, and sometimes very difficult, use cases.</p>
<p>Learn how such vectorization concisely brings the parallelism of code to the forefront and transforms programs that might originally have been difficult to run on a SIMT device into programs well suited for execution on the GPU. GPU speedups will be shown utilizing Jacket.</p>
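<p>As a small illustration of parallelism hidden by indexing, consider adjacent differences. This is a sketch in plain Python standing in for M and is not taken from the talk; the function names are hypothetical. In M, the vectorized form would be the single expression <code>x(2:end) - x(1:end-1)</code>.</p>

```python
def diff_loop(x):
    # Index-based loop: it looks serial because of the i - 1 indexing,
    # yet no iteration reads a value written by another iteration.
    out = [0] * (len(x) - 1)
    for i in range(1, len(x)):
        out[i - 1] = x[i] - x[i - 1]
    return out


def diff_vectorized(x):
    # Same computation as one whole-array expression: a shifted copy minus
    # the original. Each output depends only on the inputs, so all outputs
    # could be computed at the same time.
    return [b - a for a, b in zip(x, x[1:])]


samples = [1, 4, 9, 16, 25]
print(diff_loop(samples))        # [3, 5, 7, 9]
print(diff_vectorized(samples))  # [3, 5, 7, 9]
```

<p>The second form states what is computed for every element at once, which is the shape a data-parallel device can execute directly.</p>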
<table border="0" cellpadding="0">
<tbody>
<tr>
<td>Speaker:</td>
<td>Gallagher Pryor, AccelerEyes</td>
</tr>
<tr>
<td>Topic:</td>
<td>General Interest</td>
</tr>
<tr>
<td>Time:</td>
<td>Tuesday, September 21st, 15:30 &#8211; 15:50</td>
</tr>
</tbody>
</table>
<h3><a href="https://nvidia.confreg.com/gputechconference/schedule/by-session/4c7be17b1655620eef0000cd"><strong>2300 &#8211; High-Performance Compressive Sensing using Jacket</strong></a></h3>
<p>This talk will present the ongoing work that I am doing in the L1-optimization group at Rice University. The purpose of the work is to combine compressive sensing (CS) for image and signal reconstruction with GPU computation, using NVIDIA’s GPUs to enhance CS technology.</p>
<p>This talk will cover basic concepts in compressive sensing and how easily the computations can be adapted to run on the GPU, in particular with Jacket (by AccelerEyes). We will then cover some of our numerical experiments that encompass the use of different flavors of algorithms.</p>
<table border="0" cellpadding="0">
<tbody>
<tr>
<td>Speaker:</td>
<td>Nabor Reyna</td>
</tr>
<tr>
<td>Topics:</td>
<td>Imaging, Tools &amp; Libraries</td>
</tr>
<tr>
<td>Time:</td>
<td>Wednesday, September 22nd, 10:30 &#8211; 10:50</td>
</tr>
</tbody>
</table>
<h3><a href="https://nvidia.confreg.com/gputechconference/schedule/by-session/4c4737ea1655627d9c000017"><strong>2201 &#8211; A Case Study of Accelerating Matlab Based Applications using GPUs</strong></a></h3>
<p>Learn how to accelerate Matlab-based applications using GPUs. We cover a popular neuro-imaging software package called SPM and show how to use CUDA and Jacket to speed up computationally intensive Matlab applications.</p>
<table border="0" cellpadding="0">
<tbody>
<tr>
<td>Speaker:</td>
<td>Aniruddha Dasgupta, Georgia Institute of Technology</td>
</tr>
<tr>
<td>Topic:</td>
<td>Medical Imaging &amp; Visualization</td>
</tr>
<tr>
<td>Time:</td>
<td>Wednesday, September 22nd, 16:00 &#8211; 16:50</td>
</tr>
</tbody>
</table>
<h3><a href="https://nvidia.confreg.com/gputechconference/schedule/by-session/4c5c5beb16556295f100002e"><strong>2271 &#8211; Compose CUDA Masterpieces! Write better, Leverage More</strong></a></h3>
<p>Not all CUDA code is created equal. Learn how to step up your CUDA game. Also, learn how to build large, multi-person CUDA projects for your organization.</p>
<p>In very clear descriptions, learn the difference between naïve GPU code, intermediate GPU code, and advanced GPU mastery. We show how careful construction of CUDA kernels can affect application performance.</p>
<p>We also discuss how Jacket tools greatly facilitate the development of CUDA-based projects.</p>
<p>Finally, we will debut the Jacket runtime’s new C/C++ library. With this library, the technical computing functions in Jacket’s MATLAB engine are made available in C/C++.</p>
<table border="0" cellpadding="0">
<tbody>
<tr>
<td>Speaker:</td>
<td>James Malcolm, AccelerEyes</td>
</tr>
<tr>
<td>Topic:</td>
<td>Tools &amp; Libraries</td>
</tr>
<tr>
<td>Time:</td>
<td>Thursday, September 23rd, 16:00 &#8211; 16:50</td>
</tr>
</tbody>
</table>
<h3><a href="https://nvidia.confreg.com/gputechconference/schedule/by-session/4c45cceb165562e551000040"><strong>2100 &#8211; Hybrid GPU/Multicore Solutions for Large Linear Algebra Problems</strong></a></h3>
<p>Large linear algebra problems may be solved using recursive block decomposition in which GPUs efficiently compute the sub-blocks and multicore CPUs put the sub-blocks back together within a large shared memory space. This talk will present benchmark results for such a hybrid approach, implemented in Matlab® and using Jacket® to access the GPU compute power.</p>
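<p>The abstract does not say which operations are decomposed, so the following is only a sketch of the general idea, assuming matrix multiplication as the problem. It is written in plain Python with hypothetical function names; the leaf product marks the work a GPU would do, and the reassembly marks the work a multicore CPU would do.</p>

```python
def leaf_multiply(a, b):
    # Dense product of two small blocks; in a hybrid scheme, the GPU's job.
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def quadrants(m):
    # Split a square matrix into four equal sub-blocks.
    h = len(m) // 2
    return ([row[:h] for row in m[:h]], [row[h:] for row in m[:h]],
            [row[:h] for row in m[h:]], [row[h:] for row in m[h:]])


def add(a, b):
    # Elementwise sum of two equally sized blocks.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def block_multiply(a, b, leaf=2):
    # Assumes square matrices whose size is `leaf` times a power of two.
    if len(a) == leaf:
        return leaf_multiply(a, b)
    a11, a12, a21, a22 = quadrants(a)
    b11, b12, b21, b22 = quadrants(b)
    c11 = add(block_multiply(a11, b11, leaf), block_multiply(a12, b21, leaf))
    c12 = add(block_multiply(a11, b12, leaf), block_multiply(a12, b22, leaf))
    c21 = add(block_multiply(a21, b11, leaf), block_multiply(a22, b21, leaf))
    c22 = add(block_multiply(a21, b12, leaf), block_multiply(a22, b22, leaf))
    # Reassembly of the sub-blocks; in a hybrid scheme, the CPU's job.
    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom


identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
m = [[i * 4 + j for j in range(4)] for i in range(4)]
print(block_multiply(m, identity) == m)  # True
```

<p>The leaf size controls how much work each sub-block carries; on real hardware it would be chosen so that a sub-block fits comfortably in GPU memory.</p>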
<table border="0" cellpadding="0">
<tbody>
<tr>
<td>Speaker:</td>
<td>Nolan Davis, SAIC</td>
</tr>
<tr>
<td>Topics:</td>
<td>High Performance Computing, Algorithms &amp; Numerical Techniques, Signal processing</td>
</tr>
<tr>
<td>Time:</td>
<td>Thursday, September 23rd, 16:00 &#8211; 16:50</td>
</tr>
</tbody>
</table>



]]></content:encoded>
			<wfw:commentRss>http://blog.accelereyes.com/blog/2010/09/02/gpu_giddy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tesla C2050 versus C1060 on Real MATLAB Applications</title>
		<link>http://blog.accelereyes.com/blog/2010/08/03/tesla_c2050_versus_c1060_matlab_jacket/</link>
		<comments>http://blog.accelereyes.com/blog/2010/08/03/tesla_c2050_versus_c1060_matlab_jacket/#comments</comments>
		<pubDate>Tue, 03 Aug 2010 14:45:46 +0000</pubDate>
		<dc:creator>melonakos</dc:creator>
				<category><![CDATA[Benchmarks]]></category>
		<category><![CDATA[GPU Comparison]]></category>
		<category><![CDATA[MATLAB]]></category>
		<category><![CDATA[Parallel computing]]></category>
		<category><![CDATA[c1060]]></category>
		<category><![CDATA[c2050]]></category>
		<category><![CDATA[cuda]]></category>
		<category><![CDATA[fermi]]></category>
		<category><![CDATA[GPGPU]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[Jacket]]></category>
		<category><![CDATA[matlab]]></category>
		<category><![CDATA[sgemm]]></category>
		<category><![CDATA[tesla]]></category>

		<guid isPermaLink="false">http://blog.accelereyes.com/blog/?p=562</guid>
		<description><![CDATA[Following our recent Jacket v1.4 Fermi architecture release, many of you requested data comparing the new NVIDIA Fermi-based Tesla C2050 versus the older Tesla C1060. Over the years, AccelerEyes has developed an extensive suite of benchmark MATLAB applications, which are included in every Jacket installation. Using this suite of tests, we compared performance of the [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>Following our recent Jacket v1.4 Fermi architecture release, many of you requested data comparing the new NVIDIA Fermi-based Tesla C2050 versus the older Tesla C1060.</p>
<p>Over the years, AccelerEyes has developed an extensive suite of benchmark MATLAB applications, which are included in every Jacket installation. Using this suite of tests, we compared performance of the C2050 vs C1060 and are pleased to report the results here. We hope this information will be useful to Jacket programmers.</p>
<p>All tests were run on the same standard workstation with Jacket 1.4. The only thing that changed was the actual GPU board. In every case the C2050 beat the C1060. Double-precision examples on the Fermi-based board outperformed the older board by at least 50% in every case and by more than 2x in many cases.</p>
<p><a href="http://blog.accelereyes.com/blog/wp-content/uploads/2010/08/2050v1060-700.jpg"><img class="aligncenter size-full wp-image-570" title="2050v1060-700" src="http://blog.accelereyes.com/blog/wp-content/uploads/2010/08/2050v1060-700.jpg" alt="" width="700" height="507" /></a></p>
<p style="text-align: center;"><em>Note: </em><em>ECC was enabled on the Fermi boards</em></p>
<p>In addition to the standard Jacket examples, matrix multiplication with SGeMM and DGeMM was performed and plotted in the following charts. This matrix multiply implementation was developed in-house at AccelerEyes and outperforms both CUBLAS and Magma considerably; see <a href="http://wiki.accelereyes.com/wiki/index.php/Mtimes_benchmarks">MTIMES benchmarks</a>. Special thanks to <a href="http://wiki.accelereyes.com/wiki/index.php/Torben%27s_Corner" target="_blank">Torben Larsen</a> for the benchmarking results.</p>
<p style="text-align: center;"><a href="http://blog.accelereyes.com/blog/wp-content/uploads/2010/08/plot_AccelerEyes_01.jpg"><img class="aligncenter size-full wp-image-580" title="Single Precision Floating Point" src="http://blog.accelereyes.com/blog/wp-content/uploads/2010/08/plot_AccelerEyes_01.jpg" alt="2050 vs 1060 floating point performance" width="800" height="600" /></a></p>
<p style="text-align: left;"><a href="http://blog.accelereyes.com/blog/wp-content/uploads/2010/08/plot_DGeMM_AccelerEyes_01.jpg"><img class="aligncenter size-full wp-image-584" title="Double Precision Floating Point" src="http://blog.accelereyes.com/blog/wp-content/uploads/2010/08/plot_DGeMM_AccelerEyes_01.jpg" alt="2050 vs 1060 floating point performance" width="800" height="600" /></a><br />
As we generate or receive more comparison data we will communicate results.</p>

]]></content:encoded>
			<wfw:commentRss>http://blog.accelereyes.com/blog/2010/08/03/tesla_c2050_versus_c1060_matlab_jacket/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Jacket for MATLAB now available for NVIDIA Fermi!</title>
		<link>http://blog.accelereyes.com/blog/2010/07/14/jacket-matlab-nvidia-fermi/</link>
		<comments>http://blog.accelereyes.com/blog/2010/07/14/jacket-matlab-nvidia-fermi/#comments</comments>
		<pubDate>Thu, 15 Jul 2010 01:49:27 +0000</pubDate>
		<dc:creator>malcolm</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[MATLAB]]></category>
		<category><![CDATA[fermi]]></category>
		<category><![CDATA[geforce]]></category>
		<category><![CDATA[GPGPU]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[Jacket]]></category>
		<category><![CDATA[matlab]]></category>
		<category><![CDATA[tesla]]></category>

		<guid isPermaLink="false">http://blog.accelereyes.com/blog/?p=524</guid>
		<description><![CDATA[We are pleased to announce Jacket 1.4, with support for the latest NVIDIA graphics processing units based on the Fermi architecture (Tesla 20-series and GeForce GTX 4xx-series). NVIDIA&#8217;s release of the Fermi architecture brings with it 448 computational cores, increased IEEE-754 floating-point arithmetic precision, error-correcting memory for reliable computation, and enhanced memory caching mechanisms. Highlights [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>We are pleased to announce Jacket 1.4, with support for the latest NVIDIA graphics processing units based on the Fermi architecture (Tesla 20-series and GeForce GTX 4xx-series). NVIDIA&#8217;s release of the Fermi architecture brings with it 448 computational cores, increased IEEE-754 floating-point arithmetic precision, error-correcting memory for reliable computation, and enhanced memory caching mechanisms.</p>
<p>Highlights for Jacket 1.4 are as follows:</p>
<ul>
<li> Added support for the NVIDIA Fermi architecture (GTX400 and Tesla C2000 series)
<ul>
<li>Jacket DLA support for Fermi</li>
</ul>
</li>
<li>Dramatically improved the performance of Jacket&#8217;s JIT (Just-In-Time) compilation technology
<ul>
<li>Operations involving random scalar constants do not incur a recompile</li>
<li>Removed dependencies on MINGW and NVCC</li>
</ul>
</li>
<li>Logical indexing now supported for SUBSREF and SUBSASGN, e.g. B = A(A &gt; x)</li>
<li>MTIMES supports mixed types, no longer uses CUBLAS, and achieves better performance than CUBLAS</li>
<li>SUM, MIN, MAX, ANY, ALL now supported over any number of columns, rows, or dimensions</li>
<li>MIN, MAX indexed output now supported for complex single and complex double inputs</li>
<li>SUM, MIN, MAX over columns is greatly accelerated; vectors accelerated too</li>
<li>FIND performance improvements</li>
<li>CONVN, BLKDIAG, DOT performance improvements</li>
<li>CUMSUM now supported for matrices also</li>
<li>SORT, CONVN now supported in double-precision</li>
<li>HESS(A) and [P,H] = HESS(A) now supported (see Jacket DLA)</li>
<li>LEGENDRE now supported</li>
<li>Expanded GFOR support for:
<ul>
<li>MLDIVIDE, INV, HESS, MTIMES</li>
<li>FFT, FFT2, FFTN and inverses IFFT, IFFT2, IFFTN</li>
</ul>
</li>
<li>PCG now supported: a system solver that uses the Preconditioned Conjugate Gradient method for dense matrices</li>
<li><a title="Image Processing Library" href="http://wiki.accelereyes.com/wiki/index.php/Image_Processing_Library">Image Processing Library</a> now available, providing direct access to the NVIDIA Performance Primitives (NPP) and enabling new image processing functionality such as <a title="NPPIERODE 8U C1R" href="http://wiki.accelereyes.com/wiki/index.php/NPPIERODE_8U_C1R">ERODE</a> and <a title="NPPIDILATE 8U C1R" href="http://wiki.accelereyes.com/wiki/index.php/NPPIDILATE_8U_C1R">DILATE</a>.</li>
</ul>
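<p>As a rough illustration of the logical indexing and reduction items in the list above, here is a minimal sketch in plain MATLAB. It runs on the CPU as written; the test matrix, the threshold, and the variable names are illustrative assumptions, and applying the same expressions to Jacket GPU arrays is assumed rather than shown.</p>
<pre lang="matlab">A = magic(4);             % 4x4 test matrix (illustrative)
x = 10;                   % threshold (illustrative)
B = A(A &gt; x);             % SUBSREF with a logical mask: elements greater than x
A(A &gt; x) = 0;             % SUBSASGN with a logical mask: zero those elements in place
colSums = sum(A, 1);      % reduce along dimension 1: one sum per column
rowMax  = max(A, [], 2);  % reduce along dimension 2: one maximum per row</pre>
<p>The mask <code>A &gt; x</code> is itself an array of logicals, so the same expression selects elements on the right-hand side and addresses them on the left-hand side.</p>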
<p>See <a href="http://wiki.accelereyes.com/wiki/index.php/Release_Notes">http://wiki.accelereyes.com/wiki/index.php/Release_Notes</a> for the full release notes.</p>


<div class="shr-bookmarks shr-bookmarks-expand shr-bookmarks-center shr-bookmarks-bg-knowledge">
<ul class="socials">
		<li class="shr-twitter">
			<a href="http://twitter.com/home?status=Jacket+for+MATLAB+now+available+for+NVIDIA+Fermi%21+-+http://b2l.me/ahr43y&amp;source=shareaholic" rel="nofollow" class="external" title="Tweet This!">Tweet This!</a>
		</li>
		<li class="shr-facebook">
			<a href="http://www.facebook.com/share.php?v=4&amp;src=bm&amp;u=http://blog.accelereyes.com/blog/2010/07/14/jacket-matlab-nvidia-fermi/&amp;t=Jacket+for+MATLAB+now+available+for+NVIDIA+Fermi%21" rel="nofollow" class="external" title="Share this on Facebook">Share this on Facebook</a>
		</li>
		<li class="shr-hackernews">
			<a href="http://news.ycombinator.com/submitlink?u=http://blog.accelereyes.com/blog/2010/07/14/jacket-matlab-nvidia-fermi/&amp;t=Jacket+for+MATLAB+now+available+for+NVIDIA+Fermi%21" rel="nofollow" class="external" title="Submit this to Hacker News">Submit this to Hacker News</a>
		</li>
		<li class="shr-digg">
			<a href="http://digg.com/submit?phase=2&amp;url=http://blog.accelereyes.com/blog/2010/07/14/jacket-matlab-nvidia-fermi/&amp;title=Jacket+for+MATLAB+now+available+for+NVIDIA+Fermi%21" rel="nofollow" class="external" title="Digg this!">Digg this!</a>
		</li>
		<li class="shr-reddit">
			<a href="http://reddit.com/submit?url=http://blog.accelereyes.com/blog/2010/07/14/jacket-matlab-nvidia-fermi/&amp;title=Jacket+for+MATLAB+now+available+for+NVIDIA+Fermi%21" rel="nofollow" class="external" title="Share this on Reddit">Share this on Reddit</a>
		</li>
		<li class="shr-slashdot">
			<a href="http://slashdot.org/bookmark.pl?url=http://blog.accelereyes.com/blog/2010/07/14/jacket-matlab-nvidia-fermi/&amp;title=Jacket+for+MATLAB+now+available+for+NVIDIA+Fermi%21" rel="nofollow" class="external" title="Submit this to SlashDot">Submit this to SlashDot</a>
		</li>
		<li class="shr-googlebuzz">
			<a href="http://www.google.com/buzz/post?url=http://blog.accelereyes.com/blog/2010/07/14/jacket-matlab-nvidia-fermi/&amp;imageurl=" rel="nofollow" class="external" title="Post on Google Buzz">Post on Google Buzz</a>
		</li>
		<li class="shr-delicious">
			<a href="http://delicious.com/post?url=http://blog.accelereyes.com/blog/2010/07/14/jacket-matlab-nvidia-fermi/&amp;title=Jacket+for+MATLAB+now+available+for+NVIDIA+Fermi%21" rel="nofollow" class="external" title="Share this on del.icio.us">Share this on del.icio.us</a>
		</li>
		<li class="shr-stumbleupon">
			<a href="http://www.stumbleupon.com/submit?url=http://blog.accelereyes.com/blog/2010/07/14/jacket-matlab-nvidia-fermi/&amp;title=Jacket+for+MATLAB+now+available+for+NVIDIA+Fermi%21" rel="nofollow" class="external" title="Stumble upon something good? Share it on StumbleUpon">Stumble upon something good? Share it on StumbleUpon</a>
		</li>
		<li class="shr-linkedin">
			<a href="http://www.linkedin.com/shareArticle?mini=true&amp;url=http://blog.accelereyes.com/blog/2010/07/14/jacket-matlab-nvidia-fermi/&amp;title=Jacket+for+MATLAB+now+available+for+NVIDIA+Fermi%21&amp;summary=We%20are%20pleased%20to%20announce%20Jacket%201.4%2C%20with%20support%20for%20the%20latest%20NVIDIA%20graphics%20processing%20units%20based%20on%20the%20Fermi%20architecture%20%28Tesla%2020-series%20and%20GeForce%20GTX%204xx-series%29.%20NVIDIA%27s%20release%20of%20the%20Fermi%20architecture%20brings%20with%20it%20448%20computational%20cores%2C%20increased%20IEEE-754%20floating-point%20arith&amp;source=GPU MATLAB Computing" rel="nofollow" class="external" title="Share this on LinkedIn">Share this on LinkedIn</a>
		</li>
		<li class="shr-technorati">
			<a href="http://technorati.com/faves?add=http://blog.accelereyes.com/blog/2010/07/14/jacket-matlab-nvidia-fermi/" rel="nofollow" class="external" title="Share this on Technorati">Share this on Technorati</a>
		</li>
</ul>
<div style="clear:both;"></div>
</div>

]]></content:encoded>
			<wfw:commentRss>http://blog.accelereyes.com/blog/2010/07/14/jacket-matlab-nvidia-fermi/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>SGEMM, MTIMES &amp; CUBLAS performance on the GPU</title>
		<link>http://blog.accelereyes.com/blog/2010/06/24/sgemm-mtimes/</link>
		<comments>http://blog.accelereyes.com/blog/2010/06/24/sgemm-mtimes/#comments</comments>
		<pubDate>Thu, 24 Jun 2010 15:17:00 +0000</pubDate>
		<dc:creator>Chris McClanahan</dc:creator>
				<category><![CDATA[Benchmarks]]></category>
		<category><![CDATA[CUDA]]></category>
		<category><![CDATA[MATLAB]]></category>
		<category><![CDATA[Parallel computing]]></category>
		<category><![CDATA[cublas]]></category>
		<category><![CDATA[GPGPU]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[Jacket]]></category>
		<category><![CDATA[matlab]]></category>
		<category><![CDATA[mtimes]]></category>
		<category><![CDATA[sgemm]]></category>

		<guid isPermaLink="false">http://blog.accelereyes.com/blog/?p=474</guid>
		<description><![CDATA[AccelerEyes is focused not only on providing the easiest-to-use GPU programming platform for CUDA-capable GPUs by leveraging the MATLAB® language; our engineering organization is also always looking for ways to improve performance in all areas of the Jacket platform. A case in point is some recent work on matrix multiplication, specifically [...]]]></description>
			<content:encoded><![CDATA[<p></p><p><strong>AccelerEyes</strong> is focused not only on providing the easiest-to-use GPU programming platform for CUDA-capable GPUs by leveraging the MATLAB® language; our engineering organization is also always looking for ways to improve performance in all areas of the Jacket platform.  A case in point is some recent work on <em>matrix multiplication</em>, specifically SGEMM (single-precision general matrix multiply), or <a title="MTIMES" href="http://wiki.accelereyes.com/wiki/index.php/MTIMES">MTIMES</a>.  The Jacket 1.3 release relied on CUBLAS for matrix multiplication. Given the importance of matrix multiplication to so many of our customers, we decided to find out whether we could improve the performance of this function.</p>
<p><strong><em>Update: The new MTIMES routine in Jacket 1.4 has improved significantly since these benchmarks of the Release Candidate were taken. Have a look at the <a rel="nofollow" href="http://wiki.accelereyes.com/wiki/index.php/MTIMES_Benchmarks_(Jacket_1.4)">MTIMES Benchmarks</a> wiki page for up-to-date performance results, including Fermi (Tesla C2050) benchmarks!</em></strong></p>
<p>The following chart illustrates the improvements we were able to make with our <em>custom </em>GEMM implementation in the 1.4 Release Candidate. SGEMM was timed on square matrices of various sizes, comparing the <em>Jacket 1.3 Release</em> with the<em> Jacket 1.4 Release Candidate</em>.</p>
<p><a href="http://blog.accelereyes.com/blog/wp-content/uploads/2010/06/13_14_mtimes_single_A_both_edit-e1277469888509.jpeg"><img class="alignnone size-full wp-image-514" title="1.3 1.4 mtimes single" src="http://blog.accelereyes.com/blog/wp-content/uploads/2010/06/13_14_mtimes_single_A_both_edit-e1277469888509.jpeg" alt="" width="683" height="580" /></a></p>
<p>The graph above shows <strong>C=A*B</strong> timings for <em>every NxN matrix from 10&#215;10 to 4000&#215;4000</em>. Each data point is the average of 2 calls to MTIMES for that matrix size.  Notice that Jacket 1.3 follows exactly 2 curves: the <em>lower</em> curve for certain multiples of 16 and 32, and the <em>upper</em> curve for everything else. The lower blue curve is attained by not using textures, as the data aligns and coalesces nicely in memory, and by using other optimizations for certain data sizes.  By switching to texture memory only, <em>Jacket 1.4</em> sacrifices a little performance in certain cases, but along with other optimizations, it brings <strong>faster, consistent run-times</strong>, gives <strong>enhanced <a title="GFOR" href="http://wiki.accelereyes.com/wiki/index.php/GFOR">GFOR</a></strong><strong> performance</strong> and enables <strong>native mixed matrix types</strong> support. In addition to texture memory, careful shared memory and register usage brings significant performance enhancements as well.</p>
<div id="_mcePaste"><a href="http://blog.accelereyes.com/blog/wp-content/uploads/2010/06/CPU_14_mtimes_single_A2_all.png"><img class="alignnone size-full wp-image-512" title="1.3 CPU 1.4 mtimes single" src="http://blog.accelereyes.com/blog/wp-content/uploads/2010/06/CPU_14_mtimes_single_A2_all.png" alt="" width="687" height="571" /></a></div>
<p>The above graph overlays a MATLAB R2009b 32-bit (CPU) benchmark of MTIMES.  <a href="http://blog.accelereyes.com/blog/wp-content/uploads/2010/06/13_14_mtimes_single_AG_all-e1277416726380.png"><img class="alignnone size-full wp-image-499" title="1-3 vs 1-4 vs CPU GFLOPS" src="http://blog.accelereyes.com/blog/wp-content/uploads/2010/06/13_14_mtimes_single_AG_all-e1277416726380.png" alt="" width="714" height="620" /></a></p>
<p>The graph above represents <strong>C=A*B</strong> GFLOPS for <em>every NxN matrix from 10&#215;10 to 4000&#215;4000</em>.</p>
<p>The<strong> GFLOPS formula</strong> used:  <a href="http://blog.accelereyes.com/blog/wp-content/uploads/2010/06/CodeCogsEqn.gif"><img class="aligncenter size-full wp-image-484" title="MTIMES GFLOPS formula" src="http://blog.accelereyes.com/blog/wp-content/uploads/2010/06/CodeCogsEqn.gif" alt="SGEMM GFLOPS formula" width="156" height="42" /></a></p>
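<p>The formula appears only as an image. As a minimal sketch of how such a figure is commonly computed, the plain MATLAB snippet below assumes the conventional count of 2*N^3 floating-point operations for multiplying two NxN matrices. That count is an assumption and is not stated in the text; the matrix size and variable names are illustrative, and the code runs on the CPU as written.</p>
<pre lang="matlab">N = 256;                       % matrix dimension (illustrative)
A = rand(N, 'single');         % single precision, as in SGEMM
B = rand(N, 'single');
reps = 2;                      % average over 2 calls per matrix size
tic;
for k = 1:reps
    C = A * B;                 % MTIMES
end
t = toc / reps;                % average seconds per multiply
gflops = 2 * N^3 / (t * 1e9);  % assumed flop count: 2*N^3
fprintf('N = %d: %.2f GFLOPS\n', N, gflops);</pre>
<p>When timing GPU code, the device generally has to finish its work before the timer is read; how that is done in Jacket is not covered in this sketch.</p>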
<p>Miscellaneous information regarding the Jacket installation and hardware used for testing:</p>
<pre lang="matlab">&gt;&gt; ginfo
AccelerEyes Jacket v1.4.0rc2 (build 4971)
CUDA driver: 195.36.15, CUDA toolkit 3.0
Memory: 0 CPU-used, 23 GPU-used, 4034 GPU-free (in MB)
License Type: Designated Computer
License Features: jacket sdk mgl4 dla
Multi-GPU: Licensed for 4 GPUs
Detected CUDA-capable GPUs:
GPU0 Tesla C1060, 1265 MHz, 4095 MB VRAM, Compute 1.3 (single,double) (in use)
GPU1 GeForce 8400 GS, 896 MHz, 511 MB VRAM, Compute 1.1 (single)</pre>
<pre lang="bash">$ cat /proc/cpuinfo
vendor_id      : GenuineIntel
model name    : Pentium(R) Dual-Core  CPU  E5200  @ 2.50GHz
cpu MHz        : 2499.934
cache size     : 2048 KB
cpu cores      : 2
…</pre>
<p><strong>Update</strong><br />
Below is an example of the SGEMM performance on a Tesla C2050 (Fermi) card<br />
-  Jacket 1.3 vs. Jacket 1.4 Final Release.<br />
<a href="http://wiki.accelereyes.com/wiki/images/8/8a/Jacket14_sgemm.jpg"><img src="http://wiki.accelereyes.com/wiki/images/8/8a/Jacket14_sgemm.jpg" alt="" width="700" height="580" /></a></p>
<p><a href="http://wiki.accelereyes.com/wiki/images/8/8a/Jacket14_sgemm.jpg"></a> <strong>References</strong><br />
* Volkov, V., and Demmel, J. W., <a title="Benchmarking GPUs to tune dense linear algebra" href="http://mc.stanford.edu/cgi-bin/images/6/65/SC08_Volkov_GPU.pdf">Benchmarking GPUs to tune dense linear algebra</a>, SC08.<br />
* <a title="SGEMM code" href="http://forums.nvidia.com/index.php?showtopic=47689">SGEMM code</a> from Vasily Volkov.<br />
* <a title="SGEMM code" href="http://forums.nvidia.com/index.php?showtopic=159033">SGEMM code</a> from Lung-Sheng Chien.<br />
* <em><strong>See the </strong></em><a title="MTIMES benchmarks" href="http://wiki.accelereyes.com/wiki/index.php/MTIMES_Benchmarks_(Jacket_1.4)"><em><strong>MTIMES benchmarks</strong></em></a><em><strong> wiki page for more info!</strong></em></p>


<div class="shr-bookmarks shr-bookmarks-expand shr-bookmarks-center shr-bookmarks-bg-knowledge">
<ul class="socials">
		<li class="shr-twitter">
			<a href="http://twitter.com/home?status=SGEMM%2C+MTIMES+%26+CUBLAS+performance+on+the+GPU+-+http://b2l.me/ahr43z&amp;source=shareaholic" rel="nofollow" class="external" title="Tweet This!">Tweet This!</a>
		</li>
		<li class="shr-facebook">
			<a href="http://www.facebook.com/share.php?v=4&amp;src=bm&amp;u=http://blog.accelereyes.com/blog/2010/06/24/sgemm-mtimes/&amp;t=SGEMM%2C+MTIMES+%26+CUBLAS+performance+on+the+GPU" rel="nofollow" class="external" title="Share this on Facebook">Share this on Facebook</a>
		</li>
		<li class="shr-hackernews">
			<a href="http://news.ycombinator.com/submitlink?u=http://blog.accelereyes.com/blog/2010/06/24/sgemm-mtimes/&amp;t=SGEMM%2C+MTIMES+%26+CUBLAS+performance+on+the+GPU" rel="nofollow" class="external" title="Submit this to Hacker News">Submit this to Hacker News</a>
		</li>
		<li class="shr-digg">
			<a href="http://digg.com/submit?phase=2&amp;url=http://blog.accelereyes.com/blog/2010/06/24/sgemm-mtimes/&amp;title=SGEMM%2C+MTIMES+%26+CUBLAS+performance+on+the+GPU" rel="nofollow" class="external" title="Digg this!">Digg this!</a>
		</li>
		<li class="shr-reddit">
			<a href="http://reddit.com/submit?url=http://blog.accelereyes.com/blog/2010/06/24/sgemm-mtimes/&amp;title=SGEMM%2C+MTIMES+%26+CUBLAS+performance+on+the+GPU" rel="nofollow" class="external" title="Share this on Reddit">Share this on Reddit</a>
		</li>
		<li class="shr-slashdot">
			<a href="http://slashdot.org/bookmark.pl?url=http://blog.accelereyes.com/blog/2010/06/24/sgemm-mtimes/&amp;title=SGEMM%2C+MTIMES+%26+CUBLAS+performance+on+the+GPU" rel="nofollow" class="external" title="Submit this to SlashDot">Submit this to SlashDot</a>
		</li>
		<li class="shr-googlebuzz">
			<a href="http://www.google.com/buzz/post?url=http://blog.accelereyes.com/blog/2010/06/24/sgemm-mtimes/&amp;imageurl=" rel="nofollow" class="external" title="Post on Google Buzz">Post on Google Buzz</a>
		</li>
		<li class="shr-delicious">
			<a href="http://delicious.com/post?url=http://blog.accelereyes.com/blog/2010/06/24/sgemm-mtimes/&amp;title=SGEMM%2C+MTIMES+%26+CUBLAS+performance+on+the+GPU" rel="nofollow" class="external" title="Share this on del.icio.us">Share this on del.icio.us</a>
		</li>
		<li class="shr-stumbleupon">
			<a href="http://www.stumbleupon.com/submit?url=http://blog.accelereyes.com/blog/2010/06/24/sgemm-mtimes/&amp;title=SGEMM%2C+MTIMES+%26+CUBLAS+performance+on+the+GPU" rel="nofollow" class="external" title="Stumble upon something good? Share it on StumbleUpon">Stumble upon something good? Share it on StumbleUpon</a>
		</li>
		<li class="shr-linkedin">
			<a href="http://www.linkedin.com/shareArticle?mini=true&amp;url=http://blog.accelereyes.com/blog/2010/06/24/sgemm-mtimes/&amp;title=SGEMM%2C+MTIMES+%26+CUBLAS+performance+on+the+GPU&amp;summary=AccelerEyes%20is%20focused%20on%20not%20only%20providing%20the%20most%20easy%20to%20use%20GPU%20programming%20platform%20for%20CUDA%20capable%20GPUs%20by%20leveraging%20the%20MATLAB%C2%AE%20language%2C%20our%20engineering%20organization%20is%20always%20looking%20for%20ways%20to%20improve%20the%20performance%20of%20all%20areas%20in%20the%20Jacket%20platform.%20%20A%20case%20in%20point%20is%20some%20recen&amp;source=GPU MATLAB Computing" rel="nofollow" class="external" title="Share this on LinkedIn">Share this on LinkedIn</a>
		</li>
		<li class="shr-technorati">
			<a href="http://technorati.com/faves?add=http://blog.accelereyes.com/blog/2010/06/24/sgemm-mtimes/" rel="nofollow" class="external" title="Share this on Technorati">Share this on Technorati</a>
		</li>
</ul>
<div style="clear:both;"></div>
</div>

]]></content:encoded>
			<wfw:commentRss>http://blog.accelereyes.com/blog/2010/06/24/sgemm-mtimes/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>GCOMPILE &amp; GPROFILE: A Sneak Peek</title>
		<link>http://blog.accelereyes.com/blog/2010/06/18/profile-compile-matlab-gpu/</link>
		<comments>http://blog.accelereyes.com/blog/2010/06/18/profile-compile-matlab-gpu/#comments</comments>
		<pubDate>Fri, 18 Jun 2010 22:26:03 +0000</pubDate>
		<dc:creator>pulotfi</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[Benchmarks]]></category>
		<category><![CDATA[MATLAB]]></category>
		<category><![CDATA[Parallel computing]]></category>
		<category><![CDATA[fermi]]></category>
		<category><![CDATA[GPGPU]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[Jacket]]></category>
		<category><![CDATA[kmeans]]></category>
		<category><![CDATA[matlab]]></category>
		<category><![CDATA[tesla]]></category>

		<guid isPermaLink="false">http://blog.accelereyes.com/blog/?p=366</guid>
		<description><![CDATA[The research and engineering teams at AccelerEyes have prepared some exciting new additions for Jacket. These additions will enable you to get even more leverage out of NVIDIA GPUs for computing in MATLAB.  Over the past few years we&#8217;ve had the pleasure of working alongside scientists and engineers using Jacket, and have learned a [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>The research and engineering teams at AccelerEyes have prepared some exciting new additions for Jacket. These additions will enable you to get even more leverage out of NVIDIA GPUs for computing in MATLAB.  Over the past few years we&#8217;ve had the pleasure of working alongside scientists and engineers using Jacket, and have learned a great deal about how people want to use MATLAB and GPUs. These new additions make significant progress towards enabling programmers to profile GPU computations and exert greater control over compilation and kernel execution inside Jacket.</p>
<p>In this post, we introduce GPROFILE and GCOMPILE for Jacket in anticipation of their upcoming release.  A brief overview of these two new functions follows.</p>
<p><strong>GPROFILE: One-of-a-kind Profiling and Tuning Tool for GPUs and MATLAB</strong><br />
Historically, Jacket&#8217;s optimizations have occurred behind-the-scenes. Feedback from programmers on the <a href="http://forums.accelereyes.com/" target="_blank">Jacket Forums</a> led us to dig in and develop a tool to enable users to monitor their scripts and see reports of runtime details <strong>that are unique to GPROFILE</strong>:</p>
<ul>
<li>timings between CPU and GPU for each GPU function,</li>
<li>differences between the results of CPU and GPU functions (for easy numerical comparison and debugging),</li>
<li>unique timings per input size of CPU/GPU functions.</li>
</ul>
<p>This diagnostic output will be available in the immediate future as:</p>
<ul>
<li>HTML reports viewable in a browser,</li>
<li>color-coded reports on the MATLAB command line.</li>
</ul>
<p>These reports immediately identify the areas of code that are taking advantage of the GPU, how much benefit the GPU is giving, and also identify areas of code that are slow and need attention.  Coupled with the <a href="http://wiki.accelereyes.com/wiki/index.php/Tips" target="_blank">Jacket Tips</a> and <a href="http://wiki.accelereyes.com/wiki/index.php/Code_Vectorization_Examples" target="_blank">Code Vectorization</a> resources, GPROFILE reports add a new level of high-performance tuning for Jacket codes.</p>
<p><strong>GPROFILE Console Report</strong></p>
<p>Below is an example GPROFILE session.  To profile Jacket code, the profiler is first enabled with the GPROFILE ON command (the GUI version, gprofview, will be presented in a later blog post).  Then, the code to be profiled is executed &#8211; here we are running a simple <a href="http://blog.accelereyes.com/blog/wp-content/uploads/2010/06/clustering-m.jpg" target="_blank">k-means clustering script</a>. GPROFILE REPORT (the base reporting mechanism on the console) gives a simple overview of performance on a per-command basis.</p>
<div id="attachment_441" class="wp-caption alignleft" style="width: 625px">
	<a href="http://blog.accelereyes.com/blog/wp-content/uploads/2010/06/screenshot2.png"><img class="size-full wp-image-443" title="screenshot" src="http://blog.accelereyes.com/blog/wp-content/uploads/2010/06/screenshot2.png" alt="" width="625" height="571" /></a>
	<p class="wp-caption-text">GPROFILE Console Report on a high end CPU (i950) versus a mid range GPU (280M)</p>
</div>
<p>Here, we see that, on average, calls to the functions SUBSREF, BSXFUN, TIMES, PERMUTE, EQ, and FIND on the GPU outperformed the CPU versions by varying factors, as indicated by their green coloring.  Red lines indicate commands that, on average, performed worse on the GPU than on the CPU; the likely causes are minuscule timings (as with SUBSASGN at the end of the list), odd memory arrangements (as with MIN), or small data sizes (as with SUBSASGN on line 17 &#8211; the data size report is not shown for SUBSASGN).  In this case, GPROFILE is particularly useful because it identifies MIN as something that could be done differently to get the maximum work out of the GPU. In the future, GPROFILE will suggest how to improve performance in such situations.</p>
<p>Drilling down deeper by adding keywords (command names, file names, and line numbers) to GPROFILE REPORT makes information available on a per-line basis and also on a <strong>per use case</strong> basis, as shown on the last line of the screenshot.  Here, we see that TIMES was executed on two matrices of size 23 million by 5.  This information is very important for code development because the speed of data-parallel programs, and the choice of algorithm, is very sensitive to data size, arrangement, and shape.  Being able to drill down to CPU vs GPU timings on a per-data-size and per-use-case basis is another <strong>unique feature</strong> of GPROFILE that allows programmers to optimize algorithms for each of their use cases or otherwise better leverage their GPUs by maximizing the flow of data through their code.</p>
<p>The final <strong>unique feature</strong> of GPROFILE is the identification of lazily evaluated functions or functions that are compiled directly to PTX by Jacket at run-time as shown in the last column of the GPROFILE REPORT output.  By maximizing the use of these functions and chaining these together as much as possible, memory latency is minimized in the GPU code that Jacket runs.</p>
<h2><strong>GCOMPILE</strong></h2>
<p>GCOMPILE is a new feature that allows MATLAB developers to pre-compile sections of performance critical code.  After isolating a critical section, developers format the code and pass it through GCOMPILE for analysis.  The compiled function can then be applied with less overhead.</p>
<p>Let&#8217;s jump into an example.  At present, one of the key features missing from GFOR is support for if-statements.  With GCOMPILE, Jacket is able to handle such statements.  For example, suppose you wanted to threshold an image and needed to use an if-statement.  You need to use something like <a href="http://www.mathworks.com/matlabcentral/fileexchange/23194">verbatim.m</a> to push your critical function into a string for the compiler.</p>
<pre lang="matlab">code = verbatim;
%{
  function out = main(in, threshold)
    if in &gt; threshold
      out = threshold / 5;
    else
      out = sin(in);
    end
  end
%}
% compile and get function handle for later use
fn = gcompile(code);
thresh = 42;
n = size(volume, 3);  % volume: an existing rows-by-cols-by-n stack of images
for i = 1:n
  img = volume(:,:,i);
  out(:,:,i) = fn(img, thresh);  % apply the function to each image slice
end</pre>
<p>The code in the string is essentially applied element-wise.  In the <code>main()</code> function, the variable <code>in</code> holds a single scalar element pulled from <code>img</code>, while <code>threshold</code> is the same MATLAB scalar, 42, on every call.</p>
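<p>To make the element-wise semantics concrete, here is a minimal sketch in plain MATLAB, with no GCOMPILE involved, of what <code>main()</code> would produce for a whole image at once under that reading. The small test image and the variable names <code>mask</code> and <code>out</code> are illustrative assumptions.</p>
<pre lang="matlab">img = [10 50; 42 100];     % small test image (illustrative values)
thresh = 42;
out = sin(img);            % else-branch, computed for every element first
mask = img &gt; thresh;       % elements that would take the if-branch
out(mask) = thresh / 5;    % if-branch: overwrite where in &gt; threshold</pre>
<p>The mask form expresses the same per-element decision without an if-statement, which is the usual workaround when a branch cannot be used directly.</p>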
<p>Another example kernel might look like this:</p>
<pre lang="matlab">code = verbatim;
%{
function D = code(A, B, C)
D = A * C + cos(B);  % all element-wise
end
%}</pre>
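<p>Because the kernel body sees one scalar per input, the <code>*</code> in it acts on scalars. A plain MATLAB equivalent for whole arrays (a sketch under that reading, with illustrative inputs) would need the element-wise operator <code>.*</code> instead:</p>
<pre lang="matlab">A = [1 2; 3 4];        % illustrative inputs of equal size
B = zeros(2);
C = [5 6; 7 8];
D = A .* C + cos(B);   % element-wise product, not the matrix product A * C</pre>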
<p>Jacket v1.4 release candidates are rolling out with new on-the-fly compilation technology; GCOMPILE is the first of many extensions building on that technology.  Please contact <a href="mailto:support@accelereyes.com">support</a> if you would like to be included in the pre-release testing.  GCOMPILE is scheduled for inclusion in the first release candidate after v1.4.  The initial version only supports a subset of the M language.</p>
<p>Here are some other planned features, but be sure to tell us what you want in the comments section or on our <a href="http://forums.accelereyes.com">forums</a>.</p>
<ul>
<li>Reading functions from disk</li>
<li>Nested functions</li>
<li>Reductions (e.g. sum/min/max/..)</li>
<li>For-loops, subscripting, etc.</li>
</ul>
<p>~Puyan Lotfi, ~Brett Lucey (gcompile)<br />
~Gallagher Pryor, ~Joe Uhl (gprofile)</p>


<div class="shr-bookmarks shr-bookmarks-expand shr-bookmarks-center shr-bookmarks-bg-knowledge">
<ul class="socials">
		<li class="shr-twitter">
			<a href="http://twitter.com/home?status=GCOMPILE+%26+GPROFILE%3A+A+Sneak+Peek+-+http://b2l.me/ahr432&amp;source=shareaholic" rel="nofollow" class="external" title="Tweet This!">Tweet This!</a>
		</li>
		<li class="shr-facebook">
			<a href="http://www.facebook.com/share.php?v=4&amp;src=bm&amp;u=http://blog.accelereyes.com/blog/2010/06/18/profile-compile-matlab-gpu/&amp;t=GCOMPILE+%26+GPROFILE%3A+A+Sneak+Peek" rel="nofollow" class="external" title="Share this on Facebook">Share this on Facebook</a>
		</li>
		<li class="shr-hackernews">
			<a href="http://news.ycombinator.com/submitlink?u=http://blog.accelereyes.com/blog/2010/06/18/profile-compile-matlab-gpu/&amp;t=GCOMPILE+%26+GPROFILE%3A+A+Sneak+Peek" rel="nofollow" class="external" title="Submit this to Hacker News">Submit this to Hacker News</a>
		</li>
		<li class="shr-digg">
			<a href="http://digg.com/submit?phase=2&amp;url=http://blog.accelereyes.com/blog/2010/06/18/profile-compile-matlab-gpu/&amp;title=GCOMPILE+%26+GPROFILE%3A+A+Sneak+Peek" rel="nofollow" class="external" title="Digg this!">Digg this!</a>
		</li>
		<li class="shr-reddit">
			<a href="http://reddit.com/submit?url=http://blog.accelereyes.com/blog/2010/06/18/profile-compile-matlab-gpu/&amp;title=GCOMPILE+%26+GPROFILE%3A+A+Sneak+Peek" rel="nofollow" class="external" title="Share this on Reddit">Share this on Reddit</a>
		</li>
		<li class="shr-slashdot">
			<a href="http://slashdot.org/bookmark.pl?url=http://blog.accelereyes.com/blog/2010/06/18/profile-compile-matlab-gpu/&amp;title=GCOMPILE+%26+GPROFILE%3A+A+Sneak+Peek" rel="nofollow" class="external" title="Submit this to SlashDot">Submit this to SlashDot</a>
		</li>
		<li class="shr-googlebuzz">
			<a href="http://www.google.com/buzz/post?url=http://blog.accelereyes.com/blog/2010/06/18/profile-compile-matlab-gpu/&amp;imageurl=" rel="nofollow" class="external" title="Post on Google Buzz">Post on Google Buzz</a>
		</li>
		<li class="shr-delicious">
			<a href="http://delicious.com/post?url=http://blog.accelereyes.com/blog/2010/06/18/profile-compile-matlab-gpu/&amp;title=GCOMPILE+%26+GPROFILE%3A+A+Sneak+Peek" rel="nofollow" class="external" title="Share this on del.icio.us">Share this on del.icio.us</a>
		</li>
		<li class="shr-stumbleupon">
			<a href="http://www.stumbleupon.com/submit?url=http://blog.accelereyes.com/blog/2010/06/18/profile-compile-matlab-gpu/&amp;title=GCOMPILE+%26+GPROFILE%3A+A+Sneak+Peek" rel="nofollow" class="external" title="Stumble upon something good? Share it on StumbleUpon">Stumble upon something good? Share it on StumbleUpon</a>
		</li>
		<li class="shr-linkedin">
			<a href="http://www.linkedin.com/shareArticle?mini=true&amp;url=http://blog.accelereyes.com/blog/2010/06/18/profile-compile-matlab-gpu/&amp;title=GCOMPILE+%26+GPROFILE%3A+A+Sneak+Peek&amp;summary=The%20research%20and%20engineering%20teams%20at%20AccelerEyes%20have%20prepared%20some%20exciting%20new%20additions%20for%20Jacket.%20These%20additions%20will%20enable%20you%20to%20get%20even%20more%20leverage%20out%20of%20NVIDIA%20GPUs%20for%20computing%20in%20MATLAB.%C2%A0%20Over%20the%20past%20few%20years%20we%27ve%20had%20the%20pleasure%20of%20working%20along%20side%20scientists%20and%20engineer&amp;source=GPU MATLAB Computing" rel="nofollow" class="external" title="Share this on LinkedIn">Share this on LinkedIn</a>
		</li>
		<li class="shr-technorati">
			<a href="http://technorati.com/faves?add=http://blog.accelereyes.com/blog/2010/06/18/profile-compile-matlab-gpu/" rel="nofollow" class="external" title="Share this on Technorati">Share this on Technorati</a>
		</li>
</ul>
<div style="clear:both;"></div>
</div>

]]></content:encoded>
			<wfw:commentRss>http://blog.accelereyes.com/blog/2010/06/18/profile-compile-matlab-gpu/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Jacket accelerating life science and defense applications</title>
		<link>http://blog.accelereyes.com/blog/2010/05/28/jacket-bioscience-defense-gpgpu/</link>
		<comments>http://blog.accelereyes.com/blog/2010/05/28/jacket-bioscience-defense-gpgpu/#comments</comments>
		<pubDate>Fri, 28 May 2010 19:34:48 +0000</pubDate>
		<dc:creator>dgibson</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[MATLAB]]></category>
		<category><![CDATA[Parallel computing]]></category>
		<category><![CDATA[Success Stories]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[matlab]]></category>

		<guid isPermaLink="false">http://blog.accelereyes.com/blog/?p=356</guid>
		<description><![CDATA[With IBM’s decision this week to integrate Tesla technology into it’s high performance computing line, there should be no doubt that GP-GPU computing is more than a fad, organizations solving technical problems are able to do them more productively and efficiently than ever before with GPUs.  AccelerEyes’ customers are experiencing this first hand with the [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>With <a href="http://blogs.nvidia.com/ntersect/2010/05/ibm-embraces-gpus-for-high-performance-computing.html" target="_blank">IBM’s decision</a> this week to integrate Tesla technology into its high performance computing line, there should be no doubt that GP-GPU computing is more than a fad: organizations solving technical problems can now solve them more productively and efficiently than ever before with GPUs.  AccelerEyes’ customers are experiencing this firsthand with the Jacket product family: they can quickly and easily implement new or existing algorithms for GPUs and get their technical work done with substantial speed improvements.</p>
<p>Case in point: this week, AccelerEyes released two case studies from customers who have used Jacket to move their applications to GPU computing with compelling results.</p>
<blockquote><p>System Planning Corporation has implemented two different radar processing applications for the GPU with Jacket, which gave them the insight they needed to understand how GPUs can help meet their customers’ requirements.  Learn more about these applications:</p>
<p><a href="http://www.accelereyes.com/resources/radarprocessing">http://www.accelereyes.com/resources/radarprocessing</a></p>
<p><a href="http://www.accelereyes.com/resources/radarformation">http://www.accelereyes.com/resources/radarformation</a></p></blockquote>
<blockquote><p>The <a href="http://spectraldiagnosis.com/LabSpecDiag.html" target="_blank">Laboratory for Spectral Diagnosis</a> at Northeastern University, under the leadership of Professor Max Diem, focuses its research on the spectral diagnosis of disease and has leveraged Jacket and GPUs to accelerate that research while also reducing the time it takes to identify cancer and other illnesses.  Learn more at <a href="http://www.accelereyes.com/resources/spectroscopy">http://www.accelereyes.com/resources/spectroscopy</a></p></blockquote>
<p>Jacket and NVIDIA GPUs continue to address and accelerate a growing list of science, engineering and analytical applications.  Additional case studies of applications using Jacket can be found at <a href="http://www.accelereyes.com/successstories">http://www.accelereyes.com/successstories</a></p>



]]></content:encoded>
			<wfw:commentRss>http://blog.accelereyes.com/blog/2010/05/28/jacket-bioscience-defense-gpgpu/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Rapid Application Development platform for GPGPUs &#8211; Jacket with MATLAB®</title>
		<link>http://blog.accelereyes.com/blog/2010/05/23/rapid-application-development-for-gpgpus/</link>
		<comments>http://blog.accelereyes.com/blog/2010/05/23/rapid-application-development-for-gpgpus/#comments</comments>
		<pubDate>Sun, 23 May 2010 21:37:46 +0000</pubDate>
		<dc:creator>dgibson</dc:creator>
				<category><![CDATA[CUDA]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[MATLAB]]></category>
		<category><![CDATA[cuda]]></category>
		<category><![CDATA[event]]></category>
		<category><![CDATA[matlab]]></category>

		<guid isPermaLink="false">http://blog.accelereyes.com/blog/?p=350</guid>
		<description><![CDATA[If you’re a MATLAB user and want to run your applications on NVIDIA GPUs for better performance but don&#8217;t want to write C, C++, or CUDA code, attend this seminar to learn more about Jacket for MATLAB &#8211; http://www.accelereyes.com/resources/junewebinar]]></description>
			<content:encoded><![CDATA[<p></p><p><span style="color: #000000;">If you’re a MATLAB user and want to run your applications on NVIDIA GPUs for better performance but don&#8217;t want to write C, C++, or CUDA code, attend this seminar to learn more about Jacket for MATLAB &#8211; <a rel="nofollow" href="http://www.accelereyes.com/resources/junewebinar" target="_blank">http://www.accelereyes.com/resources/junewebinar</a></span></p>



]]></content:encoded>
			<wfw:commentRss>http://blog.accelereyes.com/blog/2010/05/23/rapid-application-development-for-gpgpus/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NVIDIA Fermi with CUDA and OpenCL</title>
		<link>http://blog.accelereyes.com/blog/2010/05/10/nvidia-fermi-cuda-and-opencl/</link>
		<comments>http://blog.accelereyes.com/blog/2010/05/10/nvidia-fermi-cuda-and-opencl/#comments</comments>
		<pubDate>Mon, 10 May 2010 19:34:39 +0000</pubDate>
		<dc:creator>pavan</dc:creator>
				<category><![CDATA[Benchmarks]]></category>
		<category><![CDATA[CUDA]]></category>
		<category><![CDATA[OpenCL]]></category>
		<category><![CDATA[Parallel computing]]></category>
		<category><![CDATA[benchmarks]]></category>
		<category><![CDATA[cuda]]></category>

		<guid isPermaLink="false">http://blog.accelereyes.com/blog/?p=327</guid>
		<description><![CDATA[In December of 2008, we published a blog post answering questions from customers and prospects about the use of OpenCL for Jacket.  If you have not yet read that post, which gives some insight into our progress, you can find it here &#8211; http://blog.accelereyes.com/blog/2008/12/30/opencl/. Some things have changed since that original post.  For example, NVIDIA [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>In December of 2008, we published a blog post answering questions from customers and prospects about the use of OpenCL for Jacket.  If you have not yet read that post, which gives some insight into our progress, you can find it here &#8211; <a href="../2008/12/30/opencl/">http://blog.accelereyes.com/blog/2008/12/30/opencl/</a>.</p>
<p>Some things have changed since that original post.  For example, NVIDIA now provides an OpenCL driver, toolkit, programming guide, and SDK examples.  Given the new tools available and the new Fermi hardware, we ran some tests on the <a href="http://www.nvidia.com/object/product_tesla_C2050_C2070_us.html" target="_blank">Tesla C2050</a> to compare OpenCL performance to CUDA performance.  The Tesla C2050 is an amazing beast of a card, providing up to 515 gigaflops of double precision arithmetic (at peak).</p>
<p>Before we present the benchmarks, we should comment on the programmability of OpenCL versus CUDA.  OpenCL is notably more difficult to program and debug than CUDA, since OpenCL documentation, tools, and scientific computation libraries are still very limited.  Given these limitations, we considered only a few matrix and vector operations for this benchmark.  All the vector operations are modified versions of the SDK examples provided by NVIDIA.  All tests used single precision numbers.</p>
<p>Here are the results:</p>
<p><a href="http://blog.accelereyes.com/blog/wp-content/uploads/2010/05/MatrixMultiply.png"><img class="alignnone size-large wp-image-331" title="MatrixMultiply" src="http://blog.accelereyes.com/blog/wp-content/uploads/2010/05/MatrixMultiply-683x1024.png" alt="MatrixMultipleFermi" width="546" height="819" /></a></p>
<p><a href="http://blog.accelereyes.com/blog/wp-content/uploads/2010/05/DotProduct.jpg"><img class="alignnone size-large wp-image-333" title="DotProduct" src="http://blog.accelereyes.com/blog/wp-content/uploads/2010/05/DotProduct-660x1024.jpg" alt="" width="528" height="819" /></a></p>
<p><a href="http://blog.accelereyes.com/blog/wp-content/uploads/2010/05/Transpose.jpg"><img class="alignnone size-large wp-image-335" title="Transpose" src="http://blog.accelereyes.com/blog/wp-content/uploads/2010/05/Transpose-683x1024.jpg" alt="" width="546" height="819" /></a></p>
<p><a href="http://blog.accelereyes.com/blog/wp-content/uploads/2010/05/VectorAdd.jpg"><img class="alignnone size-large wp-image-336" title="VectorAdd" src="http://blog.accelereyes.com/blog/wp-content/uploads/2010/05/VectorAdd-660x1024.jpg" alt="VectorAddFermiCUDAOpenCL" width="528" height="819" /></a></p>
<p><a href="http://blog.accelereyes.com/blog/wp-content/uploads/2010/05/reductions.jpg"><img class="alignnone size-large wp-image-337" title="reductions" src="http://blog.accelereyes.com/blog/wp-content/uploads/2010/05/reductions-1024x739.jpg" alt="reductionsfermicudaopencl" width="737" height="532" /></a></p>
<p>The results indicate that OpenCL carries an overhead at smaller data sizes, which seems to disappear at larger data sizes. We do not yet know whether the overhead comes from the time taken to launch a kernel in OpenCL or from something else within the API.</p>
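<p>One way to picture why a per-call overhead matters only at small sizes is a simple cost model: total time equals a fixed cost plus a term proportional to the data size. The MATLAB sketch below uses made-up constants purely for illustration. The two overheads and the throughput figure are assumptions, not values measured in the benchmarks above, and the labels "A" and "B" are placeholders rather than CUDA or OpenCL results.</p>
<pre><code>% Illustrative model only: all constants are assumptions, not measurements.
overhead_a = 10e-6;    % assumed fixed cost per call for API "A", in seconds
overhead_b = 50e-6;    % assumed fixed cost per call for API "B", in seconds
throughput = 100e9;    % assumed sustained rate in bytes per second (same for both)

n     = 2.^(10:2:26);                  % vector lengths to model
bytes = 4 * n;                         % 4 bytes per single precision element
t_a   = overhead_a + bytes / throughput;
t_b   = overhead_b + bytes / throughput;
ratio = t_b ./ t_a;                    % how much slower "B" is at each size

% Print one row per size: element count, then the slowdown ratio.
fprintf('%10d  %6.2f\n', [n; ratio]);
</code></pre>
<p>With these assumed numbers the ratio starts near 5 for the smallest vectors and approaches 1 for the largest, which is the same qualitative shape as a fixed overhead that is amortized over more work.</p>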
<p>This is our report on the current status of OpenCL relative to CUDA on the new NVIDIA hardware.  We continue to watch the progress of OpenCL, ATI, and other GPU computing initiatives.  Our focus is to deliver the best GPU computing platform on the planet for engineers, scientists, analysts, and students, and we guarantee that <a href="http://www.accelereyes.com/products/jacket">Jacket</a> customers will always have the very best in GPU hardware choices for their applications.  As the GPU landscape continues to evolve, your Jacket code will simply get faster without you having to do anything new.  So, we invite you to sit back, relax, and enjoy watching your Jacket code scale with each new release!</p>
<p><strong>Useful links:</strong></p>
<p>1)  <a href="http://www.anandtech.com/show/2977/nvidia-s-geforce-gtx-480-and-gtx-470-6-months-late-was-it-worth-the-wait-/6">http://www.anandtech.com/show/2977/nvidia-s-geforce-gtx-480-and-gtx-470-6-months-late-was-it-worth-the-wait-/6</a></p>
<p>2)  <a href="http://www.pcgameshardware.com/aid,743498/Geforce-GTX-480-and-GTX-470-reviewed-Fermi-performance-benchmarks/Reviews/">http://www.pcgameshardware.com/aid,743498/Geforce-GTX-480-and-GTX-470-reviewed-Fermi-performance-benchmarks/Reviews/</a></p>
<p>3) <a href="http://www.appleinsider.com/articles/08/12/10/nvidia_pioneering_opencl_support_on_top_of_cuda.html">http://www.appleinsider.com/articles/08/12/10/nvidia_pioneering_opencl_support_on_top_of_cuda.html</a></p>
<p>4)  <a href="http://www.sisoftware.co.uk/?d=qa&amp;f=gpu_opencl&amp;a=AMD">http://www.sisoftware.co.uk/?d=qa&amp;f=gpu_opencl&amp;a=AMD</a></p>
<p>5)  <a href="http://unigine.blogspot.com/2010/02/cuda-vs-opencl-vs-directcompute.html">http://unigine.blogspot.com/2010/02/cuda-vs-opencl-vs-directcompute.html</a></p>



]]></content:encoded>
			<wfw:commentRss>http://blog.accelereyes.com/blog/2010/05/10/nvidia-fermi-cuda-and-opencl/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Vectorizing MATLAB Code for GPU Computing</title>
		<link>http://blog.accelereyes.com/blog/2010/05/05/vectorizing-matlab-gpu-computing/</link>
		<comments>http://blog.accelereyes.com/blog/2010/05/05/vectorizing-matlab-gpu-computing/#comments</comments>
		<pubDate>Wed, 05 May 2010 16:04:37 +0000</pubDate>
		<dc:creator>gallagher.pryor</dc:creator>
				<category><![CDATA[MATLAB]]></category>
		<category><![CDATA[Parallel computing]]></category>
		<category><![CDATA[GPGPU]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[vectorization]]></category>

		<guid isPermaLink="false">http://blog.accelereyes.com/blog/?p=310</guid>
		<description><![CDATA[Over the last couple of months we have participated in some pretty amazing stories about dramatic speedups for customer M-code. For example, one customer went from 400 minutes of run time to 20 seconds and then further optimization dropped their runtime from 20 seconds to 65 milliseconds! This signifies a greater than 1000x performance [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>Over the last couple of months we have participated in some pretty amazing stories about dramatic speedups for customer M-code. For example, one customer went from 400 minutes of run time to 20 seconds and then further optimization dropped their runtime from 20 seconds to 65 milliseconds! This signifies a greater than <strong>1000x performance improvement</strong> from the original code: 400 minutes is 24,000 seconds, so the first step alone is a 1,200x speedup, and the final 65-millisecond runtime is roughly 370,000 times faster than the original. Certainly, a runtime difference like that can make the difference between solving a problem and not even attempting it.</p>
<p>When performance is critical to your problem, software platforms like Jacket can help you leverage hardware like GPUs. In addition, vectorizing your MATLAB code can make a huge difference in the runtime of your scripts, whether they run in single-threaded, multi-threaded, CPU cluster, or GPU computing mode.  Vectorization may take time and effort, but when performance is really important it is more than likely worth it.</p>
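<p>As a rough illustration of what vectorization means in practice, the sketch below evaluates the same polynomial twice in plain MATLAB: once with a for-loop and once as a single array expression. The variable names and the polynomial are made up for this example and are not taken from any customer code.</p>
<pre><code>% Hypothetical example: evaluate 3*x^2 + 2*x + 1 for every element of x.
n = 1e6;
x = linspace(0, 1, n);

% Loop version: one scalar operation per iteration.
y_loop = zeros(1, n);              % preallocate the output
for k = 1:n
    y_loop(k) = 3 * x(k)^2 + 2 * x(k) + 1;
end

% Vectorized version: the same arithmetic applied to the whole array at once.
y_vec = 3 * x.^2 + 2 * x + 1;

% Largest difference between the two results (should be at rounding level).
disp(max(abs(y_loop - y_vec)));
</code></pre>
<p>The vectorized line expresses the whole computation as one array operation, which is the form that multi-threaded CPUs and GPUs can parallelize. How much faster it runs depends on your hardware and MATLAB version.</p>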
<p>MATLAB, parallel computing, and GPU computing all perform best on vectorized code. They all take advantage of the inherent parallelism of the M-language, which is extremely powerful when used wisely. There are numerous sources of information available on the internet for learning about vectorization and finding vectorization examples:</p>
<blockquote><p><strong>MathWorks &#8211; Code Vectorization Guide</strong><br />
<a href="http://www.mathworks.com/support/tech-notes/1100/1109.shtml" target="_blank">http://www.mathworks.com/support/tech-notes/1100/1109.shtml</a><br />
<strong>MATLAB Tutorial at Cyclismo</strong><br />
<a href="http://www.cyclismo.org/tutorial/matlab/vector.html" target="_blank">http://www.cyclismo.org/tutorial/matlab/vector.html</a><br />
<strong>Improving the Speed of MATLAB Calculations &#8211; Portland State University</strong><br />
<a href="http://web.cecs.pdx.edu/~gerry/MATLAB/programming/performance.html" target="_blank">http://web.cecs.pdx.edu/~gerry/MATLAB/programming/performance.html</a><br />
<strong><em><span style="color: #189302;">m</span></em>atlab <span style="color: #189302;"><em> t</em></span>ips and <span style="color: #189302;"><em>t</em></span>ricks  and <em><span style="color: #189302;">&#8230;</span></em></strong><br />
<a href="http://www.ee.columbia.edu/~marios/matlab/matlab_tricks.html" target="_blank">http://www.ee.columbia.edu/~marios/matlab/matlab_tricks.html</a></p></blockquote>
<p>In addition to these generally available resources for vectorizing M-code, AccelerEyes has begun (just begun) to build a library of vectorization examples that we are confident will help Jacket users dramatically improve their code.  We will continue to build on this library of examples, so bookmark the wiki page and visit often.</p>
<p><a href="http://wiki.accelereyes.com/wiki/index.php/Code_Vectorization_Examples">http://wiki.accelereyes.com/wiki/index.php/Code_Vectorization_Examples</a></p>
<p>AccelerEyes also plans to put together training materials and tutorials to help programmers learn the art of what we&#8217;re calling <strong>hardware independent data-parallel programming</strong>. The overall idea is that whereas programmers once had to write low-level code (assembly, CUDA, etc.) to realize performance, today the same gains can be made by writing at a higher level (vectorized code, for instance), with the added plus that coders aren&#8217;t tying their applications down to a particular piece of hardware.  Stay tuned: we will post when these materials become available.</p>
<p>Furthermore, we encourage you to participate in our <a href="http://forums.accelereyes.com/forums/">User Forum</a> and post segments of your code where you are trying to improve performance, <strong>particularly for-loops</strong>.  Our team and other Jacket users may be able to help you vectorize them, improving not only your CPU performance but also what you get from the GPU with Jacket.</p>



]]></content:encoded>
			<wfw:commentRss>http://blog.accelereyes.com/blog/2010/05/05/vectorizing-matlab-gpu-computing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
