[Chicago] sha1 of equal arrays different?
Oren Livne
livne at uchicago.edu
Tue Oct 9 14:34:02 CEST 2012
With buffer() I get TypeError: single-segment buffer object expected
Here's an isolated test case. Does it fail on your machine too? (Try
rerunning it a few times.)

    import unittest, numpy as np
    from numpy.ma.testutils import assert_equal
    from numpy import random
    from hashlib import sha1

    class TestHashable(unittest.TestCase):
        #---------------------------------------------
        # Test Methods
        #---------------------------------------------
        def test_hashes_of_equal_objects_from_slice_are_equal(self):
            '''Test reproducibility of hashing boolean rows of a matrix
            obtained from slicing.'''
            #h = np.mod(np.arange(0, 100).reshape(100, 3), 2)
            h = random.randint(0, 2, (4, 100, 2))
            h = h[:, :, 1].transpose()
            h = h[:, np.arange(0, 3)]
            h = (h == 1)
            print h

            # Index of the first row after row 0 that equals row 0.
            def get_index(h):
                for j in range(1, h.shape[0]):
                    if np.all(h[j] == h[0]):
                        return j
                return None

            j = get_index(h)
            assert_equal(h[0], h[j],
                         'Test objects should be equal, if not, rerun test')

            def hash_array(x):
                return sha1(buffer(x.view(np.uint8))).hexdigest()
            #def hash_array(x): x.tostring()
            assert_equal(hash_array(h[0]), hash_array(h[j]),
                         'Hash should be reproducible')

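
One plausible cause, stated here as an assumption rather than something confirmed in this thread: the rows h[0] and h[j] may be non-contiguous (strided) views, so hashing their raw buffer would read bytes that do not belong to the row. The "single-segment buffer object expected" error from buffer() is consistent with that. A minimal sketch of a layout-independent hash follows; the Fortran-ordered matrix is a made-up stand-in for the sliced array in the test, and this hash_array is a hypothetical replacement, not the version above.

```python
import numpy as np
from hashlib import sha1

def hash_array(x):
    """Hash the logical contents of x, independent of its memory layout."""
    # ascontiguousarray copies only when x is not already C-contiguous.
    contiguous = np.ascontiguousarray(x)
    return sha1(contiguous.view(np.uint8)).hexdigest()

# Stand-in data (an assumption): a Fortran-ordered boolean matrix whose
# rows 0 and 1 are equal. Its rows are strided, non-contiguous views.
h = np.asfortranarray(np.array([[True, False, True],
                                [True, False, True],
                                [False, True, False]]))

digest_0 = hash_array(h[0])
digest_1 = hash_array(h[1])
digest_2 = hash_array(h[2])
```

The trade-off is the one raised in the original question: a non-contiguous row costs one small copy, while an already contiguous row is hashed without any extra allocation.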
Thanks,
Oren
On 10/8/2012 10:44 PM, Kyle Cronan wrote:
> I don't think I'm seeing that with my numpy here but maybe try passing
> the view to buffer() before hashing?
>
> -Kyle
>
> On Mon, Oct 8, 2012 at 10:07 PM, Oren Livne <livne at uchicago.edu> wrote:
>> Dear All,
>>
>> I have a boolean numpy matrix whose rows 1 and 2 are equal. When I hash them
>> using astype(uint8), I get equal digests, but different digests if
>> view(uint8) is used. Why is that, or how can I interrogate the internals of
>> the arrays to see why this happens? I'd rather use view(), because it's faster
>> and requires no additional allocation. In IPython on top of Python 2.7.3.
>>
>> h[1]
>> [ True False False False True True False True True True]
>> h[2]
>> [ True False False False True True False True True True]
>> sha1(h[1].astype(uint8)).hexdigest(), sha1(h[2].astype(uint8)).hexdigest()
>> ('bc4ec9b78ad53ce78ae24643d8ff302a9ae03bd0',
>> 'bc4ec9b78ad53ce78ae24643d8ff302a9ae03bd0')
>> sha1(h[1].view(uint8)).hexdigest(), sha1(h[2].view(uint8)).hexdigest()
>> ('887ed6519d5b521fe5181f9329b98c65897a6a47',
>> 'd8cb41ec47d6e01c25986bbd605d39e777e5b4f7')
>>
>> Thank you so much,
>> Oren
>>
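
On the question of how to interrogate an array's internals: the flags and strides attributes of an ndarray report how its elements are laid out in memory. The sketch below is only illustrative; describe_layout is a hypothetical helper and the zero matrix is made-up data, not the matrix from the session above.

```python
import numpy as np

def describe_layout(x):
    """Collect the layout facts that matter when hashing a raw buffer."""
    return {
        'c_contiguous': bool(x.flags['C_CONTIGUOUS']),
        'strides': x.strides,                # bytes to step per axis
        'owns_data': bool(x.flags['OWNDATA']),
    }

a = np.zeros((4, 3), dtype=bool)   # C-ordered, one byte per element

row = describe_layout(a[0])        # row of a: consecutive bytes
strided = describe_layout(a.T[0])  # row of the transpose: 3 bytes apart
```

If two rows compare equal but c_contiguous is False for them, their raw buffers are not a faithful image of their contents, and a digest computed from those buffers may differ even though the rows match.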
