Dear all,

My basic problem is that I would like to compute distances between vectors with missing values. You can find more detail in my question on SO (http://stackoverflow.com/questions/24781461/compute-the-pairwise-distance-in...). Since it seems this is not directly possible with scipy at the moment, I started to Cythonize my function. Currently, the below function is not much faster than my pure Python implementation, so I thought I'd ask the experts here.

*Note that even though I'm computing the euclidean distance, I'd like to make use of different distance metrics.*

So my current attempt at Cythonizing is:

    import numpy
    cimport numpy
    cimport cython
    from numpy.linalg import norm

    numpy.import_array()

    @cython.boundscheck(False)
    @cython.wraparound(False)
    def masked_euclidean(numpy.ndarray[numpy.double_t, ndim=2] data):
        cdef Py_ssize_t m = data.shape[0]
        cdef Py_ssize_t i = 0
        cdef Py_ssize_t j = 0
        cdef Py_ssize_t k = 0
        cdef numpy.ndarray[numpy.double_t] dm = numpy.zeros(m * (m - 1) // 2, dtype=numpy.double)
        cdef numpy.ndarray[numpy.uint8_t, ndim=2, cast=True] mask = numpy.isfinite(data) # boolean
        for i in range(m - 1):
            for j in range(i + 1, m):
                curr = numpy.logical_and(mask[i], mask[j])
                u = data[i][curr]
                v = data[j][curr]
                dm[k] = norm(u - v)
                k += 1
        return dm

Maybe the lack of speed-up is due to the Python function 'norm'?

So my question is, how to improve the Cython implementation? Or is there a completely different way of approaching this problem?

Thanks in advance,
Moritz
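For reference, the computation described above can be sketched in plain Python with NumPy as follows. This is only a minimal sketch of the same idea, not the pure Python implementation mentioned in the post (which is not shown); the function name `masked_euclidean_py` is hypothetical. It returns the condensed distance vector in the same order as the Cython version, so it could serve as a baseline for checking results and timings.

```python
import numpy as np


def masked_euclidean_py(data):
    """Pairwise Euclidean distances between rows, ignoring non-finite entries.

    Hypothetical helper for illustration: for each pair of rows, only the
    columns that are finite in both rows contribute to the distance.
    """
    data = np.asarray(data, dtype=float)
    m = data.shape[0]
    mask = np.isfinite(data)              # True where a value is present
    dm = np.zeros(m * (m - 1) // 2)       # condensed distance vector
    k = 0
    for i in range(m - 1):
        for j in range(i + 1, m):
            curr = mask[i] & mask[j]      # columns valid in both rows
            diff = data[i, curr] - data[j, curr]
            dm[k] = np.sqrt(np.dot(diff, diff))
            k += 1
    return dm


if __name__ == "__main__":
    example = [[0.0, 0.0, np.nan],
               [3.0, 4.0, 1.0],
               [np.nan, 0.0, 1.0]]
    print(masked_euclidean_py(example))
```

Note that a pair of rows with no columns in common gets a distance of 0 here, which may or may not be the desired convention.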