This response may be too late, but it's worth noting anyway.

GPU Ocelot (of which I am one of the core contributors) can be compiled without CUDA device drivers (libcuda.so) installed if you wish to use the Emulator or LLVM backends. I've demonstrated the emulator on systems without NVIDIA GPUs.

The emulator attempts to faithfully implement the PTX 1.4 and PTX 2.1 specifications, which may include features that older GPUs do not support. The LLVM translator strives for correct and efficient translation from PTX to x86 that will hopefully make CUDA an effective way of programming multicore CPUs as well as GPUs. -deviceemu has been a deprecated feature of CUDA for quite some time, but the LLVM translator has always been faster.

Additionally, several correctness checkers are built into the emulator to verify that memory accesses are aligned, that accesses to shared memory are properly synchronized, and that global memory dereferencing accesses allocated regions of memory. We have also implemented a command-line interactive debugger, inspired largely by gdb, to single-step through CUDA kernels, set breakpoints and watchpoints, etc. These tools were specifically developed to expedite the debugging of CUDA programs; you may find them useful.

We've started a Windows branch (as well as a Mac OS X port), but the engineering burden is already large enough to stress our research pursuits. If anyone has any time and interest, they may wish to help us provide support for Windows!

For those who are seeking the answer in 2016 (and even 2017):

- It might be possible to use gpuocelot if you satisfy its list of dependencies.
- nvcc used to have a -deviceemu option back in CUDA Toolkit 3.0.

I downloaded CUDA Toolkit 3.0, installed it, and tried to run a simple program:

    __global__ void helloWorld() {
        printf("Hello world! I am %d (Warp %d) from %d.\n",
               threadIdx.x, threadIdx.x / warpSize, blockIdx.x);
    }

Note that in CUDA Toolkit 3.0, nvcc was in /usr/local/cuda/bin/. It turned out that I had difficulties compiling it:

    NOTE: device emulation mode is deprecated in this release
    /usr/include/i386-linux-gnu/bits/byteswap.h(47): error: identifier "__builtin_bswap32" is undefined
    /usr/include/i386-linux-gnu/bits/byteswap.h(111): error: identifier "__builtin_bswap64" is undefined
    /home/user/Downloads/helloworld.cu(12): error: identifier "cudaDeviceSynchronize" is undefined
    3 errors detected in the compilation of "/tmp/tmpxft_000011c2_00000000-4_".

I've found on the Internet that if I used gcc-4.2, or something similarly ancient, instead of gcc-4.9.2, the errors might disappear.

The answer by Stringer has a link to a very old gpuocelot project website, so at first I thought that the project was abandoned in 2012 or so. Actually, it was abandoned a few years later. I tried to install gpuocelot following the guide, but I had several errors during installation and gave up again. gpuocelot is no longer supported and depends on a set of very specific versions of libraries and software. You might try to follow this tutorial from July 2015, but I don't guarantee it'll work.

The MCUDA translation framework is a Linux-based tool designed to effectively compile the CUDA programming model to a CPU architecture.

It is an emulator to use on Windows 7 and 8. It doesn't seem to be developed anymore (the last commit is dated Jul 4, 2013).

As dashesy pointed out in the comments, CU2CL seems to be an interesting project. It seems to be able to translate CUDA code to OpenCL code, so if your GPU is capable of running OpenCL code then the CU2CL project might be of interest to you. Here's the link to the project's website: