[Numpy-discussion] checksum on numpy float array

Robert Kern robert.kern at gmail.com
Thu Dec 4 18:52:42 EST 2008


On Thu, Dec 4, 2008 at 17:43, Brennan Williams
<brennan.williams at visualreservoir.com> wrote:
> josef.pktd at gmail.com wrote:
>> On Thu, Dec 4, 2008 at 6:17 PM, Brennan Williams
>> <brennan.williams at visualreservoir.com> wrote:
>>
>>> My app reads in one or more float arrays from a binary file.
>>>
>>> Sometimes, due to network timeouts etc., the array is not read correctly.
>>>
>>> What would be the best way of checking the validity of the data?
>>>
>>> Would some sort of checksum approach be a good idea?
>>> Would that work with an array of floating point values?
>>> Or are checksums more for int, byte, string type data?
>>>
>>>
>>
>> If you want to verify the file itself, Python provides several
>> more or less secure checksums. In my experience zlib.crc32 is
>> pretty fast on moderate file sizes. CRC32 is common inside archive
>> files and on binary newsgroups. If you have large files transported
>> over the network, e.g. GB size, I would work with par2 repair files,
>> which verify and repair at the same time.
>>
>>
> The file has multiple arrays stored in it.
>
> So I want to have some sort of validity check on just the array that I'm
> reading.

So compute the checksum on the bytes of each individual array. A
byte-level checksum works for any dtype, floats included, so don't
bother implementing new type-specific checksums.
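
A minimal sketch of a per-array checksum, assuming zlib.crc32 (the
checksum josef mentioned) is good enough for catching bad reads. The
helper name array_crc32 is made up for illustration, and tobytes() is
the current NumPy spelling of what older releases call tostring().

```python
import zlib

import numpy as np


def array_crc32(arr):
    """Return the CRC32 of an array's raw bytes (hypothetical helper)."""
    # Make sure we hash the elements in C order, even for strided views.
    data = np.ascontiguousarray(arr)
    # Mask so the result is an unsigned 32-bit value on any Python version.
    return zlib.crc32(data.tobytes()) & 0xFFFFFFFF


# Writer side: compute the checksum and store it next to the array.
a = np.linspace(0.0, 1.0, 1000)
stored = array_crc32(a)

# Reader side: recompute on the array as read and compare.
b = a.copy()
assert array_crc32(b) == stored

# Simulate a bad read by flipping one bit in the raw bytes.
b.view(np.uint8)[3] ^= 0x01
assert array_crc32(b) != stored
```

The checksum covers the raw bytes only, so the writer and reader need
to agree on dtype and byte order. It says nothing about shape either,
so that should be stored and checked separately.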

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


