I'm learning CUDA and decided to use Python with ctypes to call the CUDA
functions, but I'm running into some memory issues. I've boiled it down to
the simplest scenario: I use ctypes to call a CUDA function that allocates
memory on the device and then frees it. This works fine on its own, but if
I then call np.dot on a completely unrelated array declared in Python, I
get a segmentation fault. Note that this only happens if the numpy array is
sufficiently large. If I change the cuda mallocs to plain C mallocs, all
the problems go away, but that's not really helpful. Any ideas what's going
on here?
CUDA CODE (debug.cu):

#include <stdio.h>
#include <cuda_runtime.h>

extern "C" {
void all_together(size_t N)
{
    void *d;
    size_t size = N * sizeof(float);
    cudaError_t err;

    err = cudaMalloc(&d, size);
    if (err != cudaSuccess) printf("cuda malloc error: %d\n", err);
    err = cudaFree(d);
    if (err != cudaSuccess) printf("cuda free error: %d\n", err);
}
}
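For reference, the "plain C malloc" control mentioned above is essentially the
same function with the cudaMalloc/cudaFree pair swapped for host malloc/free.
A rough sketch (the name all_together_host is mine, not part of the repro; I
return a status code instead of void just to make the result checkable):

```c
#include <stdio.h>
#include <stdlib.h>

/* Control version of all_together: same call shape, but allocating on the
   host with malloc/free instead of on the device with cudaMalloc/cudaFree.
   With this variant in place, the np.dot segfault does not occur.
   Returns 0 on success, 1 on allocation failure. */
int all_together_host(size_t N)
{
    float *d = malloc(N * sizeof(float));
    if (d == NULL) {
        printf("host malloc error\n");
        return 1;
    }
    free(d);
    return 0;
}
```

Since this version never touches the CUDA runtime, whatever state cudaMalloc
sets up in the process is the obvious suspect.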
PYTHON CODE (master.py):

import numpy as np
import ctypes
from ctypes import *

dll = ctypes.CDLL('./cuda_lib.so', mode=ctypes.RTLD_GLOBAL)

def build_all_together_f(dll):
    func = dll.all_together
    func.argtypes = [c_size_t]
    func.restype = None
    return func

__pycu_all_together = build_all_together_f(dll)

if __name__ == '__main__':
    N = 5001  # if this is less, the error doesn't show up
    a = np.random.randn(N).astype('float32')
    __pycu_all_together(N)
    # toggle this line on/off to get the error
    #np.dot(a, a)
    print 'end of python'
COMPILE: nvcc -Xcompiler -fPIC -shared -o cuda_lib.so debug.cu
RUN: python master.py
--
View this message in context: http://numpy-discussion.10968.n7.nabble.com/numpy-dot-causes-segfault-after-...
Sent from the Numpy-discussion mailing list archive at Nabble.com.