[Numpy-discussion] bad CRC errors when using np.savez, only sometimes though!

Isaac Gerg isaac.gerg at gergltd.com
Fri May 14 16:46:19 EDT 2021


Is it zlib or zipfile?
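
One way to narrow this down, as a minimal sketch rather than a fix: np.savez writes a standard zip archive through the standard-library zipfile module, so the archive can be checked with zipfile alone, independent of np.load. The helper name first_bad_member below is hypothetical, not part of numpy or zipfile; it only wraps ZipFile.testzip().

```python
import zipfile


def first_bad_member(path):
    """Return the name of the first archive member that fails its
    CRC or header check, or None if every member reads back cleanly.

    `first_bad_member` is a hypothetical helper name for illustration.
    """
    with zipfile.ZipFile(path) as zf:
        # testzip() reads every member in full and compares the data
        # against the CRC-32 stored in the archive.
        return zf.testzip()
```

Calling this on the .npz right after np.savez returns, and again before loading it hours later, might help tell whether the file is already bad when it is written or is damaged afterwards.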

On Fri, May 14, 2021 at 11:38 AM Benjamin Root <ben.v.root at gmail.com> wrote:

> Isaac,
>
> What I mean is that your bug might be similar to the savemat() bug that
> was fixed in scipy in 2019. They are completely different functions, but
> both need to interact correctly with zlib in order to work.
>
> On Fri, May 14, 2021 at 10:22 AM Isaac Gerg <isaac.gerg at gergltd.com>
> wrote:
>
>> Hi Ben,  I am not sure.  However, in looking at the dates, it looks like
>> that was fixed in scipy as of 2019.
>>
>> Would you recommend using the scipy save interface as opposed to the
>> numpy one?
>>
>> On Fri, May 14, 2021 at 10:16 AM Benjamin Root <ben.v.root at gmail.com>
>> wrote:
>>
>>> Perhaps it is a bug similar to this one?
>>> https://github.com/scipy/scipy/issues/6999
>>>
>>> Basically, it turned out that the CRC was getting computed on an
>>> unflushed buffer, or something like that.
>>>
>>> On Fri, May 14, 2021 at 10:05 AM Isaac Gerg <isaac.gerg at gergltd.com>
>>> wrote:
>>>
>>>> I am using NumPy 1.19.5 on Windows 10 with Python 3.8.6
>>>> (tags/v3.8.6:db45529, Sep 23 2020, 15:52:53) [MSC v.1927 64 bit (AMD64)].
>>>>
>>>> I have two Python processes running (i.e. no threads) which do
>>>> independent processing jobs and do NOT write to the same directories.  Each
>>>> process runs for 5-10 hours and then writes out a ~900MB npz file
>>>> containing 4 arrays.
>>>>
>>>> When I go back to read in the npz files, I sporadically get bad
>>>> CRC errors, which are related to npz using zipfile.  I cannot figure out why
>>>> this is happening.  Looking through online forums, other folks have had CRC
>>>> problems, but they seem to be isolated to using zipfile directly, not
>>>> numpy.  I have found a few mentions, though, of zipfile causing headaches if
>>>> the same file pointer is used across calls when one uses the file handle
>>>> interface to zipfile as opposed to passing in a filename.
>>>>
>>>> I have verified with 7zip that the files do in fact have a CRC error, so
>>>> it's not an artifact of zipfile.  I have also used the file handle
>>>> interface to np.load and still get the error.
>>>>
>>>> Aside from writing my own numpy storage file container, I am stumped as
>>>> to how to fix this, or reproduce this in a consistent manner.
>>>> Any suggestions would be greatly appreciated!
>>>>
>>>> Thank you,
>>>> Isaac
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at python.org
>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>>>