numpy.dot causes segfault after ctypes call to cudaMalloc
I'm learning CUDA and decided to use Python with ctypes to call all the CUDA functions, but I'm running into some memory issues. I've boiled it down to the simplest scenario.

I use ctypes to call a CUDA function which allocates memory on the device and then frees it. This works fine, but if I then call np.dot on a completely unrelated array declared in Python, I get a segmentation fault. Note that this only happens if the numpy array is sufficiently large. If I change the CUDA mallocs to plain C mallocs, all the problems go away, but that's not really helpful. Any ideas what's going on here?

CUDA CODE (debug.cu):

    #include <stdio.h>
    #include <stdlib.h>

    extern "C" {

    void all_together(size_t N)
    {
        void *d;
        int size = N * sizeof(float);
        int err;

        err = cudaMalloc(&d, size);
        if (err != 0) printf("cuda malloc error: %d\n", err);

        err = cudaFree(d);
        if (err != 0) printf("cuda free error: %d\n", err);
    }

    }

PYTHON CODE (master.py):

    import numpy as np
    import ctypes
    from ctypes import *

    dll = ctypes.CDLL('./cuda_lib.so', mode=ctypes.RTLD_GLOBAL)

    def build_all_together_f(dll):
        func = dll.all_together
        func.argtypes = [c_size_t]
        return func

    __pycu_all_together = build_all_together_f(dll)

    if __name__ == '__main__':
        N = 5001  # if this is less, the error doesn't show up

        a = np.random.randn(N).astype('float32')
        da = __pycu_all_together(N)

        # toggle this line on/off to get error
        #np.dot(a, a)

        print 'end of python'

COMPILE:

    nvcc -Xcompiler -fPIC -shared -o cuda_lib.so debug.cu

RUN:

    python master.py

--
View this message in context: http://numpy-discussion.10968.n7.nabble.com/numpy-dot-causes-segfault-after-...
Sent from the Numpy-discussion mailing list archive at Nabble.com.
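Two things may be worth ruling out while debugging a crash like this one. Neither is a confirmed fix for the segfault described above. First, the wrapper in master.py sets argtypes but leaves restype at the ctypes default of c_int, even though all_together returns void. Second, the standard-library faulthandler module can print the Python traceback when the process receives SIGSEGV, which helps confirm that the crash really happens inside np.dot. The sketch below assumes Python 3 (the original script uses the Python 2 print statement), and bind_all_together is a hypothetical helper name that does not appear in the post.

```python
import ctypes
import faulthandler
import os


def bind_all_together(dll):
    """Declare the C signature `void all_together(size_t)` on a loaded library.

    `bind_all_together` is a hypothetical name used only for this sketch.
    """
    func = dll.all_together
    func.argtypes = [ctypes.c_size_t]
    # ctypes assumes an int return value unless told otherwise; the C function
    # returns void, so state that explicitly.
    func.restype = None
    return func


if __name__ == "__main__":
    # Dump the Python traceback to stderr if the process receives SIGSEGV.
    faulthandler.enable()

    lib_path = "./cuda_lib.so"  # library name taken from the post
    if os.path.exists(lib_path):
        import numpy as np  # only needed for the reproduction itself

        dll = ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
        all_together = bind_all_together(dll)

        N = 5001
        a = np.random.randn(N).astype("float32")
        all_together(N)
        np.dot(a, a)
        print("end of python")
    else:
        print("cuda_lib.so not found; build it with the COMPILE step first")
```

If the traceback still points at np.dot with the signature fully declared, the return-type mismatch can be set aside and attention can move to how the CUDA runtime and the BLAS library behind numpy interact in the same process.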
participants (1)
- ebuchman