[Numpy-discussion] A bug in numpy.random.shuffle?
Charles R Harris
charlesr.harris at gmail.com
Thu Sep 5 16:06:44 EDT 2013
On Thu, Sep 5, 2013 at 1:45 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
>
> On Thu, Sep 5, 2013 at 1:34 PM, Bradley M. Froehle
> <brad.froehle at gmail.com> wrote:
>
>> I put this test case through `git bisect run` and here's what came
>> back. I haven't confirmed this manually yet, but the blamed commit
>> does seem reasonable:
>>
>> b26c675e2a91e1042f8f8d634763942c87fbbb6e is the first bad commit
>> commit b26c675e2a91e1042f8f8d634763942c87fbbb6e
>> Author: Nathaniel J. Smith <njs at pobox.com>
>> Date: Thu Jul 12 13:20:20 2012 +0100
>>
>> [FIX] Make np.random.shuffle less brain-dead
>>
>> The logic in np.random.shuffle was... not very sensible. Fixes trac
>> ticket #2074.
>>
>> This patch also exposes a completely unrelated issue in
>> numpy.testing. Filed as Github issue #347 and marked as knownfail for
>> now.
>>
>> :040000 040000 6f3cf0c85a64664db6a71bd59909903f18b51639
>> 0b6c8571dd3c9de8f023389f6bd963e42b12cc26 M numpy
>> bisect run success
>>
>> On Thu, Sep 5, 2013 at 11:58 AM, Charles R Harris
>> <charlesr.harris at gmail.com> wrote:
>> >
>> >
>> >
>> > On Thu, Sep 5, 2013 at 12:50 PM, Fernando Perez <fperez.net at gmail.com>
>> > wrote:
>> >>
>> >> On Thu, Sep 5, 2013 at 11:43 AM, Charles R Harris
>> >> <charlesr.harris at gmail.com> wrote:
>> >>
>> >>
>> >> > Oh, nice one ;) Should be fixable if you want to submit a patch.
>> >>
>> >> Strategy? One option is to do, for structured arrays, a shuffle of the
>> >> indices and then an in-place
>> >>
>> >> arr = arr[shuffled_indices]
>> >>
>> >> But there may be a cleaner/faster way to do it.
>> >>
>> >> I'm happy to submit a patch, but I'm not familiar enough with the
>> >> internals to know what the best approach should be.
>> >>
>> >
>> > Better open an issue. It looks like a bug in the indexing code.
>> >
>>
>
> Also fails for string arrays.
>
> In [6]: x = np.zeros(5, dtype=[('n', 'S1'), ('s', 'S1')])
>
> In [7]: x['s'] = [c for c in 'abcde']
>
> In [8]: x
> Out[8]:
> array([('', 'a'), ('', 'b'), ('', 'c'), ('', 'd'), ('', 'e')],
> dtype=[('n', 'S1'), ('s', 'S1')])
>
> In [9]: x[0], x[1] = x[1], x[0]
>
> In [10]: x
> Out[10]:
> array([('', 'b'), ('', 'b'), ('', 'c'), ('', 'd'), ('', 'e')],
> dtype=[('n', 'S1'), ('s', 'S1')])
>
>
This behavior is not new; it is also present in 1.6.x:

In [1]: x = np.zeros(5, dtype=[('n', 'S1'), ('s', 'S1')])

In [2]: x['s'] = [c for c in 'abcde']

In [3]: x
Out[3]:
array([('', 'a'), ('', 'b'), ('', 'c'), ('', 'd'), ('', 'e')],
      dtype=[('n', '|S1'), ('s', '|S1')])

In [4]: x[0], x[1] = x[1], x[0]

In [5]: x
Out[5]:
array([('', 'b'), ('', 'b'), ('', 'c'), ('', 'd'), ('', 'e')],
      dtype=[('n', '|S1'), ('s', '|S1')])

In [6]: np.__version__
Out[6]: '1.6.3.dev-3f58621'

So we need to decide whether this is a bug or not. I think the scalars
returned by indexing should be copies of the data.
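
In the meantime, here is a minimal sketch of possible workarounds. It
assumes the failure comes from x[0] and x[1] referring to the array's own
memory, so the tuple swap overwrites one element before the other has been
read. The helper name shuffle_by_index is hypothetical, not a NumPy
function; it only illustrates the index-permutation idea Fernando
suggested earlier in the thread.

```python
import numpy as np

# Same structured array as in the sessions above.
x = np.zeros(5, dtype=[('n', 'S1'), ('s', 'S1')])
x['s'] = [c for c in 'abcde']

# Workaround 1: copy one element explicitly before swapping, so the
# temporary no longer shares memory with the array.
tmp = x[0].copy()
x[0] = x[1]
x[1] = tmp

# Workaround 2: indexing with a list of integers produces a new array,
# so the right-hand side is fully materialized before the assignment.
x[[2, 3]] = x[[3, 2]]


def shuffle_by_index(arr, rng=np.random):
    """Shuffle arr in place along its first axis (hypothetical helper).

    Draws a permutation of the indices and writes the permuted copy
    back into the original buffer, avoiding element-by-element swaps.
    """
    perm = rng.permutation(len(arr))
    arr[:] = arr[perm]


y = np.zeros(5, dtype=[('n', 'S1'), ('s', 'S1')])
y['s'] = [c for c in 'abcde']
shuffle_by_index(y, np.random.RandomState(0))
```

Both swaps leave every record present exactly once, and the shuffle only
reorders the records. The cost is one temporary copy of the data, which
may or may not be acceptable inside np.random.shuffle itself.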
Chuck