
Posts

Showing posts from 2015

OpenCL - Part 2

CL_MEM_COPY_HOST_PTR: Copies memory from the host to the device. In my case the copy was taking about 15 ms. Using a zero-copy buffer instead saves that time, and creating the buffers this way also makes a big difference in terms of performance.

CL_MEM_USE_HOST_PTR: In my case this is the preferred choice, as the buffer uses the memory referenced by the host pointer as its storage. From what I have read, this should also be the better option when targeting a GPU, as the memory will be allocated in pinned memory.

As a result, we get the following values (compared to the previous post):

1. Creating buffers: from 6 ms to 9 µs
2. Writing buffers: from 15 ms to 6 ms
3. Reading results: 6 ms (enqueueReadBuffer). This is still an issue.

That's already impressive. In the end, the total varies from 10 to 14 ms (compared to the previous 32 ms). Doubling the amount of data keeps the same ratio between both versions, so the parallel version still has no advantage. A further
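To sanity-check the numbers above, here is a minimal C++ sketch of the arithmetic. It only adds up the three stages quoted in the post and compares the old and new totals. The names `StageTimes`, `transfer_total_ms`, `speedup` and `kAfter` are made up for illustration, and kernel execution time is left out because the post does not report it.

```cpp
#include <cassert>

// Per-stage timings in milliseconds (hypothetical helper type).
struct StageTimes {
    double create_ms;  // creating buffers
    double write_ms;   // writing buffers
    double read_ms;    // reading results (enqueueReadBuffer)
};

// Sum of the three measured stages. Kernel execution time is not
// reported in the post, so it is deliberately not included here.
double transfer_total_ms(const StageTimes& t) {
    assert(t.create_ms >= 0 && t.write_ms >= 0 && t.read_ms >= 0);
    return t.create_ms + t.write_ms + t.read_ms;
}

// Ratio of the old end-to-end time to the new one.
double speedup(double before_ms, double after_ms) {
    assert(after_ms > 0);
    return before_ms / after_ms;
}

// Figures quoted in the post: 9 microseconds, 6 ms and 6 ms.
const StageTimes kAfter{0.009, 6.0, 6.0};
```

With these figures the three stages add up to about 12 ms, which sits inside the observed 10 to 14 ms range, and the read-back alone accounts for roughly half of it. Against the previous 32 ms, `speedup(32.0, 14.0)` and `speedup(32.0, 10.0)` give an improvement of roughly 2.3x to 3.2x.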

OpenCL - Part 1

As the title suggests, I've lately been interested in parallel computing, its possible applications, and how you can exploit the processing potential of a computer. There are a few options out there in terms of programming languages and libraries that let programmers develop their tools and applications using a parallel programming model. As I have a computer equipped with an AMD video card, my natural choice was OpenCL. After struggling with the video card drivers, and giving up on ignoring Microsoft's invitation to upgrade my OS to Windows 10 (which, by the way, is better than I expected), I was able to install the AMD SDK and run the hello world program. You know, if Hello World compiles and displays (somehow) a hello world message on your screen, you are blessed. Now, what can I do with so much power? Well, the first thing that came to my mind was to develop an algorithm that can run in parallel, is simple to code, and can take advantage of this programming model.