numpy.load and gzip file handles
Hi everyone. I'd like to log the state of my program as it progresses. Using the numpy.save / numpy.load functions on the same filehandle repeatedly works very well for this -- but ends up making a file which very quickly grows to gigabytes. The data compresses well, though, so I thought I'd use Python's built-in gzip module underneath. This works great for saving -- but when it comes time to play back, there's an issue:
>>> import numpy
>>> import gzip
>>> f=open("test.gz")
>>> g=gzip.GzipFile(None,"rb",9,f)
>>> g
>>> numpy.load(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.5/site-packages/numpy/lib/io.py", line 133, in load
    fid.seek(-N,1) # back-up
TypeError: seek() takes exactly 2 arguments (3 given)
Turns out you can't rewind gzip file handles in Python. Oops. The offending code is the part that distinguishes between npy and npz files.

Could something be added to just trust me that it's an npy? Or better yet, is there something I'm doing wrong or overlooking?

Thanks!

--
Matthew Miller
mattdm@mattdm.org
http://mattdm.org/
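For readers following along, the saving side of the scheme described above might look roughly like this. It is only a sketch under assumptions: it targets a current Python 3 and NumPy rather than the Python 2.5 setup in the traceback, and the function name log_states and its arguments are made up for illustration.

```python
import gzip

import numpy as np


def log_states(path, states):
    """Write each array in `states` back-to-back into one gzip stream.

    `log_states`, `path` and `states` are hypothetical names for this sketch.
    """
    with gzip.open(path, "wb") as g:
        for state in states:
            # numpy.save accepts any object with a write() method, so each
            # call appends one complete .npy record to the compressed stream.
            np.save(g, state)
```

Each record carries its own header, so the file is just a sequence of npy blobs with no index; reading it back means consuming the records in order.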
2009/2/2 Matthew Miller:
> I'd like to log the state of my program as it progresses. Using the numpy.save / numpy.load functions on the same filehandle repeatedly works very well for this -- but ends up making a file which very quickly grows to gigabytes. The data compresses well, though, so I thought I'd use Python's built-in gzip module underneath. This works great for saving -- but when it comes time to play back, there's an issue:
>
> >>> import numpy
> >>> import gzip
> >>> f=open("test.gz")
> >>> g=gzip.GzipFile(None,"rb",9,f)
> >>> g
> >>> numpy.load(g)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib64/python2.5/site-packages/numpy/lib/io.py", line 133, in load
>     fid.seek(-N,1) # back-up
> TypeError: seek() takes exactly 2 arguments (3 given)
The GzipFile in Python 2.5 does not support the second ("whence") argument to seek(). One solution may be to use this wrapper from the EffBot to "back-port" that functionality:

http://effbot.org/librarybook/gzip-example-2.py

Regards
Stéfan
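To make the "whence" idea concrete, here is a minimal sketch of an adapter that accepts the second seek() argument and translates it into an absolute offset. This is not the EffBot code linked above; the class name WhenceSeek is hypothetical, and the sketch assumes the wrapped object provides tell() and a seek() that takes an absolute offset.

```python
import io


class WhenceSeek:
    """Hypothetical adapter: accepts seek(offset, whence) and forwards an
    absolute offset to the wrapped file object."""

    def __init__(self, fileobj):
        self._f = fileobj

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            target = offset
        elif whence == io.SEEK_CUR:
            # Relative seek: turn it into an absolute position via tell().
            target = self._f.tell() + offset
        else:
            # Seeking from the end would need the uncompressed length.
            raise ValueError("only absolute and relative seeks are supported")
        self._f.seek(target)
        return target

    def __getattr__(self, name):
        # Everything else (read, tell, close, ...) goes to the wrapped object.
        return getattr(self._f, name)
```

Note that a backward target on a gzip stream generally means the gzip module has to start decompressing again from the beginning, so this kind of wrapper can be slow on large logs even where it works.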
On Mon, Feb 02, 2009 at 08:01:54AM +0200, Stéfan van der Walt wrote:
> The GzipFile in Python 2.5 does not support the 2nd ("whence") argument. The solution may be to use this wrapper from the EffBot: http://effbot.org/librarybook/gzip-example-2.py In order to "back-port" that functionality.
Unless I'm misunderstanding, even with the wrapper one can't actually seek backwards, which is what the numpy code wants to do. In the meantime, I'm just using numpy.lib.format.read_array() directly.

--
Matthew Miller
mattdm@mattdm.org
http://mattdm.org/
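A playback loop built on numpy.lib.format.read_array() could look something like the sketch below. It assumes a current Python 3, where GzipFile offers peek(), and the generator name iter_states is made up for illustration. Since read_array() only reads forward, no seek() call is involved.

```python
import gzip

from numpy.lib import format as npy_format


def iter_states(path):
    """Yield arrays one by one from a gzip file of concatenated .npy records.

    `iter_states` is a hypothetical helper name for this sketch.
    """
    with gzip.open(path, "rb") as g:
        # peek() returns an empty bytes object once the stream is exhausted,
        # which lets us stop cleanly without seeking.
        while g.peek(1):
            # read_array() parses the magic string and header itself, then
            # reads exactly one array's worth of data.
            yield npy_format.read_array(g)
```

Usage would be along the lines of `for state in iter_states("test.gz"): ...`. Because this skips the npy/npz detection in numpy.load(), it only makes sense when the file is known to contain plain npy records.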