
I'm learning CUDA and decided to use Python with ctypes to call all the CUDA functions, but I'm running into some memory issues. I've boiled it down to the simplest scenario: I use ctypes to call a CUDA function that allocates memory on the device and then frees it. That works fine, but if I then call np.dot on a completely unrelated array created in Python, I get a segmentation fault. Note that this only happens if the numpy array is sufficiently large. If I change the cudaMalloc/cudaFree calls to plain C malloc/free (see the sketch at the end of this message), all the problems go away, but that's not really helpful. Any ideas what's going on here?

CUDA CODE (debug.cu):

#include <stdio.h>
#include <stdlib.h>

extern "C" {
/* Allocate N floats on the device, then immediately free them. */
void all_together(size_t N)
{
    void *d;
    int size = N * sizeof(float);
    int err;

    err = cudaMalloc(&d, size);
    if (err != 0) printf("cuda malloc error: %d\n", err);

    err = cudaFree(d);
    if (err != 0) printf("cuda free error: %d\n", err);
}
}

PYTHON CODE (master.py):

import numpy as np
import ctypes
from ctypes import *

dll = ctypes.CDLL('./cuda_lib.so', mode=ctypes.RTLD_GLOBAL)

def build_all_together_f(dll):
    func = dll.all_together
    func.argtypes = [c_size_t]
    return func

__pycu_all_together = build_all_together_f(dll)

if __name__ == '__main__':
    N = 5001  # if this is less, the error doesn't show up

    a = np.random.randn(N).astype('float32')

    da = __pycu_all_together(N)

    # toggle this line on/off to get error
    #np.dot(a, a)

    print 'end of python'

COMPILE:

nvcc -Xcompiler -fPIC -shared -o cuda_lib.so debug.cu

RUN:

python master.py
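
For comparison, the malloc-only variant I mentioned looks roughly like this (a sketch of the substitution, not necessarily character-for-character what I compiled); with this version np.dot works fine even for large arrays:

MALLOC-ONLY VARIANT (no segfault):

#include <stdio.h>
#include <stdlib.h>

extern "C" {
/* Same entry point, but with plain host malloc/free
   instead of cudaMalloc/cudaFree. */
void all_together(size_t N)
{
    void *d = malloc(N * sizeof(float));
    if (d == NULL) printf("malloc error\n");
    free(d);
}
}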
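
In case it helps with debugging, this is the kind of error check I could swap in to print the CUDA error string instead of the raw integer code. cudaError_t, cudaSuccess and cudaGetErrorString are the standard CUDA runtime API; the function name below is just illustrative and not what master.py loads:

ERROR-STRING VARIANT (sketch):

#include <stdio.h>
#include <cuda_runtime.h>

extern "C" {
/* Same allocate-and-free pattern, but reporting errors
   via cudaGetErrorString instead of the bare error code. */
void all_together_verbose(size_t N)
{
    void *d;
    cudaError_t err;

    err = cudaMalloc(&d, N * sizeof(float));
    if (err != cudaSuccess)
        printf("cuda malloc error: %s\n", cudaGetErrorString(err));

    err = cudaFree(d);
    if (err != cudaSuccess)
        printf("cuda free error: %s\n", cudaGetErrorString(err));
}
}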