[Numpy-discussion] type 'numpy.int64' unhashable

Sebastian Haase seb.haase at gmail.com
Fri Oct 30 17:08:38 EDT 2009


On Fri, Oct 30, 2009 at 5:44 PM, Robert Kern <robert.kern at gmail.com> wrote:
> On Fri, Oct 30, 2009 at 08:11, James Bergstra <bergstrj at iro.umontreal.ca> wrote:
>> On Fri, Oct 30, 2009 at 7:23 AM, Gael Varoquaux
>> <gael.varoquaux at normalesup.org> wrote:
>>> On Fri, Oct 30, 2009 at 08:21:16PM +0900, David Cournapeau wrote:
>>>> On Fri, Oct 30, 2009 at 8:04 PM, Sebastian Haase <seb.haase at gmail.com> wrote:
>>>
>>>> > I understand where this error comes from, however what I was trying to
>>>> > do seems so "intuitive" that I would like to ask for suggestions:
>>>> > "What should I do if the "number" 2636 becomes unhashable?"
>>>
>>>> In your example, it is the array that is unhashable; the numbers
>>>> themselves should be hashable. Arrays are mutable, so I don't think you
>>>> can easily make them hashable. You could transform everything into
>>>> a tuple of tuples of... if you need to use set, though.
>>>
>>> Use md5's of their .data attribute. This works quite well (you might want
>>> to hash a pickled string of the dtype in addition).
>>>
>>> Gaël
>>
>> Careful... if your data is not contiguous in memory then you could be
>> adding lots of random noise to your hash key by doing this.  This
>> could cause equal ndarrays to hash to different values -- not good.
>> Make sure memory is contiguous before hashing the .data.  flatten()
>> does this, I think, as do copy(), array(), and many others.
>
> .data doesn't work for non-contiguous arrays anyways. :-)
>
> But all of this is irrelevant to the OP. First, I cannot replicate his problem.
>
> In [12]: chainsA = np.arange(10, dtype=np.int64)
>
> In [13]: set(chainsA)
> Out[13]: set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>
>
> Second, he seems to be interested in scalar objects, not arrays. The
> scalar objects should all be hashable and comparable out of the box and
> ready to be used in sets and as dict keys. We will need a complete,
> self-contained example that demonstrates the problem to get any
> further with this.
>
> Third, even if he wanted to use arrays as set elements, he couldn't
> because such objects not only need to have __hash__ defined, they also
> need __eq__ to return a bool. We return boolean arrays that cannot be
> used as a truth value.
>
> Fourth, even if arrays could be compared, you couldn't replace their
> __hash__ method or tell set to use a different function in place of
> the __hash__ method.
>
> Fifth, even if you could tell set to use a different hash function,
> you wouldn't use cryptographic hashes. You would just
> hash(buffer(arr)) for contiguous arrays and hash(arr.tostring()) for
> the rest.
>
> --
> Robert Kern
>
Thanks to everyone for replying. Nice detective work, Robert - indeed
it seems to work with "real" ndarrays -- I have to do some more
homework to get my problem into a shape so that I could demonstrate it
in a "small, self contained form".
Thanks again,

Sebastian
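
A minimal sketch of the points made in the quoted reply, assuming Python 3
and a current NumPy rather than the 2009 versions used in the thread. There,
buffer() no longer exists and tostring() has been superseded by tobytes(),
so tobytes() is swapped in here. The helper name array_key is hypothetical,
and it assumes non-object dtypes (for object arrays the bytes would be
pointers, not values).

```python
import numpy as np

# NumPy scalars are hashable, so a set built from an int64 array works.
chains = np.arange(10, dtype=np.int64)
as_set = set(chains)  # iterating the array yields np.int64 scalars

# The array itself is not hashable.
try:
    hash(chains)
    array_hash_error = ""
except TypeError as exc:
    array_hash_error = str(exc)

# Comparing arrays yields a boolean array, which has no single truth value.
a = np.arange(12).reshape(3, 4)
b = a.T.copy().T  # same values as a, but not C-contiguous in memory
try:
    bool(a == b)
    ambiguous_truth = False
except ValueError:
    ambiguous_truth = True


def array_key(arr):
    """Hypothetical helper: a hashable stand-in for an array's contents.

    tobytes() copies the elements out in C order, so the memory layout
    of the input does not affect the result.
    """
    arr = np.asarray(arr)
    return (arr.dtype.str, arr.shape, arr.tobytes())


# Equal arrays with different memory layouts collapse to one set element.
seen = {array_key(a), array_key(b)}
```

The key includes the dtype and shape because the raw bytes alone do not
distinguish, say, a (3, 4) array from a (4, 3) array holding the same
elements. The keys are snapshots: mutating an array afterwards does not
update a key that was already stored.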


