[Numpy-discussion] Re: Re: Re: Re: fast numpy.fromfile skipping data chunks

Andrea Cimatoribus Andrea.Cimatoribus at nioz.nl
Wed Mar 13 11:13:50 EDT 2013

Ok, this seems to be working (well, as soon as I get the right offset and things like that, but that's a different story).
The problem is that it does not go any faster than my initial function compiled with cython, and it is still a lot slower than fromfile. Is there a reason why, even with compiled code, reading from a file skipping some records should be slower than reading the whole file?
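For reference, a minimal sketch of the two access patterns being compared — one bulk read followed by slicing in memory, versus seeking past the unwanted records. The record layout here (1000 records of 8 float64 values, keeping every 4th record) is hypothetical, not the poster's actual file format:

```python
import numpy as np
import tempfile, os

# Hypothetical layout: 1000 records of 8 float64 values each;
# suppose we only want every 4th record.
n_rec, rec_len, step = 1000, 8, 4
path = os.path.join(tempfile.mkdtemp(), "data.bin")
np.arange(n_rec * rec_len, dtype=np.float64).tofile(path)

# Approach 1: read the whole file, then slice in memory.
whole = np.fromfile(path, dtype=np.float64).reshape(n_rec, rec_len)
wanted_a = whole[::step]

# Approach 2: seek past the records we do not need,
# issuing one small read per wanted record.
rec_bytes = rec_len * 8  # float64 is 8 bytes
wanted_b = np.empty((n_rec // step, rec_len), dtype=np.float64)
with open(path, "rb") as f:
    for i in range(n_rec // step):
        f.seek(i * step * rec_bytes)
        wanted_b[i] = np.fromfile(f, dtype=np.float64, count=rec_len)

assert np.array_equal(wanted_a, wanted_b)
```

The seeking version issues many small reads, and the OS typically fetches whole pages (and often read-ahead windows) from disk regardless, which is one plausible reason skipping records rarely beats a single sequential read unless the skipped stretches are large.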

From: numpy-discussion-bounces at scipy.org [numpy-discussion-bounces at scipy.org] on behalf of Nathaniel Smith [njs at pobox.com]
Sent: Wednesday, 13 March 2013, 15:53
To: Discussion of Numerical Python
Subject: Re: [Numpy-discussion] R: R: R: fast numpy.fromfile skipping data chunks

On Wed, Mar 13, 2013 at 2:46 PM, Andrea Cimatoribus
<Andrea.Cimatoribus at nioz.nl> wrote:
>>Indeed, but that offset "it should be a multiple of the byte-size of dtype" as the help says.
> My mistake, sorry, even if the help says so, it seems that this is not the case in the actual code. Still, the problem with the size of the available data (which is not necessarily a multiple of dtype byte-size) remains.

Worst case you can always work around such issues with an extra layer
of view manipulation:

# create a raw view onto the contents of the file
file_bytes = np.memmap(path, dtype=np.uint8, ...)
# cut out any arbitrary number of bytes from the beginning and end
data_bytes = file_bytes[...some slice expression...]
# switch to viewing the bytes as the proper data type
data = data_bytes.view(dtype=np.uint32)
# proceed as before
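To make the sketch above concrete, here is a self-contained version with the placeholders filled in using an invented example file — a 3-byte header followed by four uint32 values (the header size and contents are assumptions purely for illustration):

```python
import numpy as np
import tempfile, os

# Hypothetical file: a 3-byte header followed by four uint32 values.
path = os.path.join(tempfile.mkdtemp(), "example.bin")
with open(path, "wb") as f:
    f.write(b"\x00\x01\x02")                        # 3-byte header
    f.write(np.arange(4, dtype=np.uint32).tobytes())  # payload

# create a raw byte-level view onto the contents of the file
file_bytes = np.memmap(path, dtype=np.uint8, mode="r")
# cut off the header so the remaining length is a multiple of 4 bytes
data_bytes = file_bytes[3:]
# switch to viewing the bytes as the proper data type
data = data_bytes.view(dtype=np.uint32)
```

Because the slicing happens at the uint8 level, the "offset must be a multiple of the dtype byte-size" restriction never comes into play, and any trailing bytes that would leave a partial element can be trimmed the same way.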
