[Numpy-discussion] numpy.pad -- problem?

Andras Deak deak.andris at gmail.com
Sun Apr 29 17:38:23 EDT 2018


PS. My exact numbers differ from yours (possibly a multithreading
effect?), but in my case `ypad[:-2].mean()` agrees with the last 2
elements of `ypad`, and I'm sure the same holds for yours.

On Sun, Apr 29, 2018 at 11:36 PM, Andras Deak <deak.andris at gmail.com> wrote:
>> mean(y):  -1.3778013372117948e-16
>> ypad:
>>  [-1.37780134e-16 -1.37780134e-16 -1.37780134e-16  0.00000000e+00
>>   3.09016994e+00  5.87785252e+00  8.09016994e+00  9.51056516e+00
>>   1.00000000e+01  9.51056516e+00  8.09016994e+00  5.87785252e+00
>>   3.09016994e+00  1.22464680e-15 -3.09016994e+00 -5.87785252e+00
>>  -8.09016994e+00 -9.51056516e+00 -1.00000000e+01 -9.51056516e+00
>>  -8.09016994e+00 -5.87785252e+00 -3.09016994e+00 -2.44929360e-15
>>  -7.40148683e-17 -7.40148683e-17]
>>
>> The left pad is correct, but the right pad is different and not the mean of
>> y  --- why?
>
> This is how np.pad computes mean padding:
> https://github.com/numpy/numpy/blob/01541f2822d0d4b37b96f6b42e35963b132f1947/numpy/lib/arraypad.py#L1396-L1400
> elif mode == 'mean':
>     for axis, ((pad_before, pad_after), (chunk_before, chunk_after)) \
>             in enumerate(zip(pad_width, kwargs['stat_length'])):
>         newmat = _prepend_mean(newmat, pad_before, chunk_before, axis)
>         newmat = _append_mean(newmat, pad_after, chunk_after, axis)
>
> That is, first the mean is prepended, then appended, and in the latter
> step the updated (front-padded) array is used for computing the mean
> again. Note that in exact arithmetic this is fine, since adding n
> copies of `mean` to an array with mean `mean` preserves the mean. But
> with doubles you can get errors on the order of the machine epsilon,
> which is what happens here:
>
> In [16]: ypad[3:-2].mean()
> Out[16]: -1.1663302849022412e-16
>
> In [17]: ypad[:-2].mean()
> Out[17]: -3.700743415417188e-17
>
> So the prepended values are `y.mean()`, but the appended values are
> `ypad[:-2].mean()`, which includes the near-zero padding values. I
> don't think this error should be a problem in practice, but I agree
> it's surprising.
>
> András
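
For anyone who wants to see the effect without going through np.pad
itself, here is a minimal sketch of the prepend-then-append order
described above. It makes several assumptions: `two_step_mean_pad` is a
hypothetical helper, not NumPy's `_prepend_mean`/`_append_mean`; it only
handles 1-D input and uses the whole array for the statistic; and the
input `y = 10*sin(x)` on 21 points with pad widths (3, 2) is inferred
from the printed `ypad`, not taken from the original post. The exact
trailing digits will likely vary between machines. The `Fraction` part
checks the exact-arithmetic claim on a small made-up example.

```python
from fractions import Fraction

import numpy as np


def two_step_mean_pad(arr, pad_before, pad_after):
    """Hypothetical 1-D helper: prepend the mean, then append the mean
    of the already front-padded array (not NumPy's actual code)."""
    arr = np.asarray(arr, dtype=float)
    # Step 1: the front pad uses the mean of the original data.
    front = np.concatenate([np.full(pad_before, arr.mean()), arr])
    # Step 2: the back pad uses the mean of the front-padded array.
    return np.concatenate([front, np.full(pad_after, front.mean())])


# Assumed input, reconstructed from the ypad output quoted above.
x = np.linspace(0, 2 * np.pi, 21)
y = 10 * np.sin(x)
ypad = two_step_mean_pad(y, 3, 2)

print("y.mean()         :", y.mean())
print("left pad         :", ypad[:3])
print("ypad[:-2].mean() :", ypad[:-2].mean())
print("right pad        :", ypad[-2:])

# Exact arithmetic: prepending copies of the mean leaves the mean unchanged.
vals = [Fraction(1), Fraction(2), Fraction(4)]
exact_mean = sum(vals) / len(vals)
front_exact = [exact_mean] * 3 + vals
exact_front_mean = sum(front_exact) / len(front_exact)
print("exact means equal:", exact_front_mean == exact_mean)
```

Both pads should come out within rounding error of zero. Any difference
between them comes only from recomputing the mean in floating point over
the longer, front-padded array.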

