Is it feasible/desirable to provide a doctest runner that ignores whitespace? That would allow downstream projects to fix their doctests on 1.14+ with a one-line change, without breaking tests on 1.13.
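
To make the idea concrete, here is a minimal sketch of such a runner built on the standard library's doctest module. The names IgnoreWhitespaceChecker and run_doctest_text are hypothetical, not anything NumPy ships. The existing NORMALIZE_WHITESPACE flag would not be enough on its own: it treats runs of whitespace as equal, but 'array([ 0.' versus 'array([0.' is one space versus none.

```python
import doctest
import re


class IgnoreWhitespaceChecker(doctest.OutputChecker):
    """Hypothetical checker: accept output that differs only in whitespace."""

    _whitespace = re.compile(r"\s+")

    def check_output(self, want, got, optionflags):
        # Try the normal comparison first, so exact matches and the
        # standard option flags (ELLIPSIS etc.) keep working.
        if super().check_output(want, got, optionflags):
            return True
        # Otherwise compare with every whitespace character removed.
        strip = lambda s: self._whitespace.sub("", s)
        return strip(want) == strip(got)


def run_doctest_text(text, name="example"):
    """Run the doctests found in `text` and return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(text, {}, name, None, 0)
    runner = doctest.DocTestRunner(checker=IgnoreWhitespaceChecker(),
                                   verbose=False)
    runner.run(test)
    return runner.summarize(verbose=False)


# The expected output uses the old spacing, the actual output the new one.
example = '''
>>> print("array([0., 1., 2., 3.])")
array([ 0.,  1.,  2.,  3.])
'''
print(run_doctest_text(example))
```

The trade-off is that stripping all whitespace also hides real differences such as "a b" versus "ab", so this is probably only appropriate for array reprs.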

On Fri, Jun 30, 2017 at 11:11 AM, Allan Haldane <allanhaldane@gmail.com> wrote:
On 06/30/2017 03:55 AM, Juan Nunez-Iglesias wrote:
To reiterate my point on a previous thread, I don't think this should happen until NumPy 2.0. This *will* break a massive number of doctests, and what's worse, it will do so in a way that makes it difficult to support doctesting for both 1.13 and 1.14. I don't see a big enough benefit to these changes to justify breaking everyone's tests before an API-breaking version bump.

I am still on the fence about exactly how annoying this change would be, and it is good to hear whether this affects you and how badly.

Yes, someone would have to spend an hour removing a hundred spaces in doctests, and the 1.13 to 1.14 period is trickier (but virtualenv helps). But none of your end users are going to have their scripts break: there are no new warnings or exceptions.

A follow-up question is: to what degree can we compromise? Would it be acceptable to skip the big change #1, but keep the other 3 changes? I expect they affect far fewer doctests. Or, for instance, I could scale back #1 so it only affects size-1 (or perhaps, only size-0) arrays. What amount of change would be OK, and how is changing a small number of doctests different from changing more?

Also, let me clarify the motivations for the changes. As Marten noted, change #2 is what motivated all the other changes. Currently 0d arrays print in their own special way which was making it very hard to implement fixes to voidtype str/repr, and the datetime and other 0d reprs are weird. The fix is to make 0d arrays print using the same code-path as higher-d ndarrays, but then we ended up with reprs like "array( 1.)" because of the space for the sign position. So I removed the space from the sign position for all float arrays. But as I noted I probably could remove it for only size-1 or 0d arrays and still fix my problem, even though I think it might be pretty hacky to implement in the numpy code.

Allan





On 30 Jun 2017, 6:42 AM +1000, Marten van Kerkwijk <m.h.vankerkwijk@gmail.com>, wrote:
To add to Allan's message: point (2), the printing of 0-d arrays, is
the one that is the most important in the sense that it rectifies a
really strange situation, where the printing cannot be logically
controlled by the same mechanism that controls >=1-d arrays (see PR).

While point 3 can also be considered a bug fix, 1 & 4 are at some
level matters of taste; my own reason for supporting their
implementation now is that the 0-d array change already forces me (or,
specifically, astropy) to rewrite quite a few doctests, and I'd rather
have everything in one go -- in this respect, it is a pity that this
is separate from the earlier change in printing for structured arrays
(which was also much for the better, but broke a lot of doctests).

-- Marten



On Thu, Jun 29, 2017 at 3:38 PM, Allan Haldane <allanhaldane@gmail.com> wrote:
Hello all,

There are various updates to array printing in preparation for numpy
1.14. See https://github.com/numpy/numpy/pull/9139/

Some are quite likely to break other projects' doc-tests which expect a
particular str or repr of arrays, so I'd like to warn the list in case
anyone has opinions.

The current proposed changes, from most to least painful by my
reckoning, are:

1.
For float arrays, an extra space previously used for the sign position
will now be omitted in many cases. E.g., `repr(arange(4.))` will now
return 'array([0., 1., 2., 3.])' instead of 'array([ 0.,  1.,  2.,  3.])'.

2.
The printing of 0d arrays is overhauled. This is a bit finicky to
describe, so please see the release note in the PR. As an example of the
effect of this, `repr(np.array(0.))` now returns 'array(0.)'
instead of 'array(0.0)'. Also the repr of 0d datetime arrays is now like
"array('2005-04-04', dtype='datetime64[D]')" instead of
"array(datetime.date(2005, 4, 4), dtype='datetime64[D]')".

3.
User-defined dtypes which did not properly implement their `repr` (and
`str`) should do so now. Otherwise printing now falls back to
`object.__repr__`, which will return something ugly like
`<mytype object at 0x7f37f1b4e918>`. (Previously you could depend on
only implementing the `item` method and the repr of that would be
printed. But no longer, because this risks infinite recursion.)

4.
Bool arrays of size 1 with a 'True' value will now omit a space, so that
`repr(array([True]))` is now 'array([True])' instead of
'array([ True])'.
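
As a rough illustration of which changes are whitespace-only, the old and new reprs quoted in items 1, 2 and 4 can be compared with all whitespace removed. This is only a sketch; whitespace_only is a hypothetical helper and the strings are copied from the examples above rather than generated by NumPy.

```python
# (old repr, new repr) pairs quoted from the list of changes above.
changes = {
    "1: float sign space": ("array([ 0.,  1.,  2.,  3.])",
                            "array([0., 1., 2., 3.])"),
    "2: 0d float": ("array(0.0)", "array(0.)"),
    "2: 0d datetime": (
        "array(datetime.date(2005, 4, 4), dtype='datetime64[D]')",
        "array('2005-04-04', dtype='datetime64[D]')",
    ),
    "4: size-1 bool": ("array([ True])", "array([True])"),
}


def whitespace_only(old, new):
    """Hypothetical helper: True if the strings differ only in whitespace."""
    return "".join(old.split()) == "".join(new.split())


# Map each change to whether a whitespace-insensitive doctest comparison
# would accept the new repr in place of the old one.
summary = {name: whitespace_only(old, new)
           for name, (old, new) in changes.items()}

for name, ok in summary.items():
    print(name, "->", "whitespace only" if ok else "content changed")
```

By this check, changes 1 and 4 differ only in whitespace, while the 0d changes in item 2 alter the text itself and would still need doctest edits.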

Allan
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion