TL;DR: NumPy scalars representation is e.g. `34.3` instead of `float32(34.3)`. So the representation is missing the type information. What are your thoughts on changing that?
Hi all,
I am thinking about the next steps for NEP 50 (The NEP wants to fix the NumPy promotion rules, especially with respect to scalars):
https://numpy.org/neps/nep-0050-scalar-promotion.html
In relation to that, there was one point that Stéfan brought up previously.
The NumPy scalars (representation) currently print as numbers:
    >>> np.float32(34.3)
    34.3
    >>> np.uint8(5)
    5
That can already be confusing now. However, it gets more problematic if NEP 50 is introduced since the behavior between a Python `34.3` and `np.float32(34.3)` would differ more than it does now (please refer to the NEP).
The change would be that we should print as:
    float64(34.3)
(or similar?)
This email is mainly to ask for any feedback or concern on such a change. I suspect we may have to write a very brief NEP about it.
If there is little concern, maybe we could move forward with such a change promptly. Otherwise it could be moved forward together with NEP 50 and take effect in a "major" release [1].
Cheers,
Sebastian
[1] Note that for me, even a major release would hopefully not affect the majority of users or be very disruptive.
Hello Sebastian, I rarely use NumPy scalars directly, but the repr change could have impact in assorted downstream projects' documentation. For clarity, this idea would not alter how NumPy arrays print, would it - since they already include the type information?
    >>> np.array([34.3, 10.1, -0.5], np.float32)
    array([34.3, 10.1, -0.5], dtype=float32)
    >>> np.array([5, 10, 0], np.uint8)
    array([ 5, 10,  0], dtype=uint8)
Thanks,
Peter
On Thu, 2022-09-08 at 10:53 +0100, Peter Cock wrote:
> Hello Sebastian,
> I rarely use NumPy scalars directly, but the repr change could have impact in assorted downstream projects' documentation.
> For clarity, this idea would not alter how NumPy arrays print, would it - since they already include the type information?
Yes. Array representation is not confusing in the same way.
You are right of course. Documentation would be affected quite heavily and would require a lot of docs to be fixed up unfortunately. My hope would be that there is little impact besides documentation, but I am not certain.
- Sebastian
On Thu, 8 Sept 2022, 19:42 Sebastian Berg, <sebastian@sipsolutions.net> wrote:
> TL;DR: NumPy scalars representation is e.g. `34.3` instead of `float32(34.3)`. So the representation is missing the type information. What are your thoughts on changing that?
From the Python documentation on repr:
"this should look like a valid Python expression that could be used to recreate an object with the same value"
I think we should definitely have:
    repr(np.float32(34.3)) == 'float32(34.3)'
And
    str(np.float32(34.3)) == '34.3'
It seems buglike not to have that.
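The split proposed here (a `repr` that names the type, a `str` that shows only the value) can be sketched without NumPy. The `Float32` class below is a hypothetical stand-in, not NumPy's implementation: it rounds through `struct` to mimic single precision and searches for the shortest decimal string that survives that rounding.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


class Float32:
    """Hypothetical stand-in for a typed scalar; not NumPy's implementation."""

    def __init__(self, value):
        self.value = to_float32(float(value))

    def __eq__(self, other):
        return isinstance(other, Float32) and self.value == other.value

    def __str__(self):
        # Shortest decimal string that rounds back to the same float32.
        for digits in range(1, 10):
            text = f"{self.value:.{digits}g}"
            if to_float32(float(text)) == self.value:
                return text
        return repr(self.value)

    def __repr__(self):
        # Type name plus value: a valid expression that recreates the object.
        return f"Float32({self})"


x = Float32(34.3)
print(str(x))   # value only
print(repr(x))  # type and value
assert eval(repr(x)) == x  # the repr round-trips through eval()
```

Under these assumptions the `str` stays short for display, while the `repr` is the form that can be pasted back into a session.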
On 9/8/22, Andrew Nelson <andyfaff@gmail.com> wrote:
> On Thu, 8 Sept 2022, 19:42 Sebastian Berg, <sebastian@sipsolutions.net> wrote:
>> TL;DR: NumPy scalars representation is e.g. `34.3` instead of `float32(34.3)`. So the representation is missing the type information. What are your thoughts on changing that?
I like the idea, but as others have noted, this could result in a lot of churn in the docs of many projects.
> From the Python documentation on repr:
> "this should look like a valid Python expression that could be used to recreate an object with the same value"
To quote from https://docs.python.org/3/library/functions.html#repr:
    For many types, this function makes an attempt to return a string that would yield an object with the same value when passed to eval();
Sebastian, is this an explicit goal of the change? (Personally, I've gotten used to not taking this too seriously, but my world view is biased by the long-term use of NumPy, which has never followed this guideline.)
If that is a goal, then the floating point types with precision greater than double precision will need to display the argument of the type as a string. For example, the following is run on a platform where numpy.longdouble is extended precision (80 bits):
```
In [161]: longpi = np.longdouble('3.14159265358979323846')

In [162]: longpi
Out[162]: 3.1415926535897932385

In [163]: np.longdouble(3.1415926535897932385)  # Argument is parsed as 64 bit float
Out[163]: 3.141592653589793116

In [164]: np.longdouble('3.1415926535897932385')  # Correctly reproduces the longdouble
Out[164]: 3.1415926535897932385
```
Warren
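The effect seen in `In [163]` can be checked with the standard library alone: an unquoted literal is parsed as a 64-bit double before any wider type sees it. The sketch below uses `decimal.Decimal` to expose the exact value of that double. It does not use `np.longdouble`, whose precision is platform-dependent.

```python
from decimal import Decimal

digits = "3.1415926535897932385"  # the 20-digit string from the session above

# An unquoted literal is parsed by Python as a 64-bit double first.
as_double = float(digits)

# Decimal(float) converts exactly, exposing the digits the double really holds.
exact = Decimal(as_double)
print(exact.quantize(Decimal("1e-18")))  # the double, to 18 decimal places
print(Decimal(digits))                   # the digits that were typed

# The double differs from the typed value after about 16 significant digits,
# so only the string form can carry extended precision through a repr.
error = abs(exact - Decimal(digits))
assert Decimal(0) < error < Decimal("1e-15")
```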
On 9/9/22 04:15, Warren Weckesser wrote:
> ... To quote from https://docs.python.org/3/library/functions.html#repr:
>     For many types, this function makes an attempt to return a string that would yield an object with the same value when passed to eval();
> Sebastian, is this an explicit goal of the change? (Personally, I've gotten used to not taking this too seriously, but my world view is biased by the long-term use of NumPy, which has never followed this guideline.)
> If that is a goal, then the floating point types with precision greater than double precision will need to display the argument of the type as a string.
As others have mentioned, the change will greatly enhance UX at the cost of documentation cleanups. While the representation may not be perfectly roundtrip-able, I think it still is an improvement and worthwhile. Elsewhere I have suggested we need more documentation around array/scalar printing; perhaps that would be a place to mention the limitations of string representations.
Matti
I am in favor of such a change. It will make what is returned more transparent to users (and reduce confusion for newcomers).
With NEP50, we're already adopting a philosophy of explicit scalar usage anyway: no longer pretending or trying to make transparent that Python floats and NumPy floats are the same.
No one *actually* round-trips objects via repr, but if a user could look at a result and know how to construct the object, that is an improvement.
Stéfan
On Thu, Sep 8, 2022, at 22:26, Matti Picus wrote:
> As others have mentioned, the change will greatly enhance UX at the cost of documentation cleanups. While the representation may not be perfectly roundtrip-able, I think it still is an improvement and worthwhile. Elsewhere I have suggested we need more documentation around array/scalar printing, perhaps that would be a place to mention the limitations of string representations.
> Matti
+1 from me. They are a frequent source of confusion when starting, and there appear to be far fewer now than in earlier releases. It also might make it easier to spot any inadvertent scalars coming out if these could be Python floats.
Kevin
A naive question: what actually are the differences, and what does an end user need to worry about when mixing Python scalars and NumPy scalars? Same question about a library author.
Or is it mainly about fixed-width integers vs Python integers?
Cheers,
Evgeni
On Fri, 9 Sept 2022, 09:58 Kevin Sheppard <kevin.k.sheppard@gmail.com> wrote:
> +1 from me. They are a frequent source of confusion when starting, and there appear to be far fewer now then in earlier releases. It also might make it easier to spot any inadvertent scalars coming out if these could be Python floats.
> Kevin
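One half of Evgeni's question, fixed-width versus Python integers, can be illustrated without NumPy. The sketch below uses `ctypes.c_uint8` purely as a stand-in for an 8-bit unsigned type. How NumPy itself reports or warns about overflow is outside what this shows.

```python
import ctypes

# Python integers have arbitrary precision, so this sum is exact.
python_sum = 250 + 10

# An 8-bit unsigned integer keeps only the low 8 bits of the same sum.
# ctypes does no overflow checking, which makes it a convenient stand-in.
wrapped_sum = ctypes.c_uint8(250 + 10).value

print(python_sum, wrapped_sum)
assert wrapped_sum == python_sum % 256
```

A bare `4` printed at a prompt gives no hint of which of the two kinds of integer produced it, which is the confusion the proposed representation is meant to remove.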
On Thu, 2022-09-08 at 23:19 -0700, Stefan van der Walt wrote:
> I am in favor of such a change. It will make what is returned more transparent to users (and reduce confusion for newcomers).
> With NEP50, we're already adopting a philosophy of explicit scalar usage anyway: no longer pretending or trying to make transparent that Python floats and NumPy floats are the same.
> No one *actually* round-trips objects via repr, but if a user could look at a result and know how to construct the object, that is an improvement.
True, the only worry would be loss of a bit of precision when someone has a copy-paste workflow. But at that point, users probably don't care about the last ULP.
Even float32/float16 `repr` use is tricky in principle since
    float32("<number>")
    float32(<number>)
may not be identical, but I would have to dig into the subtleties there (and presumably NumPy gets it subtly wrong anyway right now!).
Not sure it is a big worry, but probably worthwhile to note down if we end up writing a brief NEP.
Cheers,
Sebastian
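The possible gap between `float32("<number>")` and `float32(<number>)` is a double-rounding effect: the unquoted literal is rounded to a double first and to single precision second. The sketch below models both paths in pure Python with `fractions` and `struct`. The helper `round_to_float32` is hypothetical, handles only positive values in the normal range, and says nothing about how NumPy itself parses strings.

```python
import struct
from fractions import Fraction


def round_to_float32(q: Fraction) -> Fraction:
    """Round a positive rational once, directly to a 24-bit significand.

    Hypothetical helper: ties go to even, subnormals and overflow are ignored.
    """
    exponent = 0
    while Fraction(2) ** (exponent + 1) <= q:
        exponent += 1
    while Fraction(2) ** exponent > q:
        exponent -= 1
    ulp = Fraction(2) ** (exponent - 23)
    return round(q / ulp) * ulp  # round() on a Fraction is half-to-even


def via_double(text: str) -> Fraction:
    """Round twice: decimal -> double (float) -> float32 (struct)."""
    return Fraction(struct.unpack("f", struct.pack("f", float(text)))[0])


# A value just above the midpoint between 1.0 and the next float32.
value = 1 + Fraction(1, 2**24) + Fraction(1, 2**60)
scaled = value * 10**60  # exact integer, because the denominator is 2**60
text = f"{scaled.numerator // 10**60}.{scaled.numerator % 10**60:060d}"

once = round_to_float32(Fraction(text))
twice = via_double(text)
print(once == twice)  # the two paths disagree for this input
```

The input is contrived (60 decimal places), which fits the remark that few users would care about the last ULP in a copy-paste workflow.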
On Thu, 2022-09-08 at 21:15 -0400, Warren Weckesser wrote:
> On 9/8/22, Andrew Nelson <andyfaff@gmail.com> wrote:
> <snip>
> For many types, this function makes an attempt to return a string that would yield an object with the same value when passed to eval();
> Sebastian, is this an explicit goal of the change? (Personally, I've gotten used to not taking this too seriously, but my world view is biased by the long-term use of NumPy, which has never followed this guideline.)
To me, that should be mainly a guiding principle, maybe not a strict goal. But I do wonder if there are any alternative thoughts on what the representation should be? Since I doubt Python will add infix operators that allow `123.4_f64` soon, that is probably not worth a thought :).
For booleans I think we previously had rather settled on `np.True_` and `np.False_`.
Cheers,
Sebastian
Hi all,
As mentioned earlier, I would like to propose changing the representation of scalars in NumPy. Discussion and ideas on changes are much appreciated!
The main change is to show scalars as:
* `np.float64(3.0)` instead of just `3.0`
* `np.True_` instead of `True`
* `np.void((3, 5), dtype=[('a', '<i8'), ('b', 'u1')])` instead of `(3, 5)`
* Use `np.` rather than `numpy.` for datetime/timedelta.
This way it is clear for users that they are dealing with NumPy scalars, which behave differently from Python scalars. The `str()` that is given when using `print()` and the way arrays are shown will be unchanged.
The NEP draft can be found here:
https://numpy.org/neps/nep-0051-scalar-representation.html
and it includes more details and related changes.
The implementation is largely finished and can be found here:
https://github.com/numpy/numpy/pull/22449
We are fairly late in the release cycle and the change should not block other things. So, the aim is to merge it early in the next release cycle. That way downstream has time to fix documentation if wanted.
Depending on how discussion goes, I hope to formally propose the NEP fairly soon, so that merging the implementation doesn't need to wait on NEP approval.
Cheers,
Sebastian
On Thu, 2022-09-08 at 11:38 +0200, Sebastian Berg wrote:
> TL;DR: NumPy scalars representation is e.g. `34.3` instead of `float32(34.3)`. So the representation is missing the type information. What are your thoughts on changing that?
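Because `str()` is unchanged and only the `repr` gains the `np.` wrapper, recorded doctest output in downstream documentation is where the churn would show up. The sketch below is a hypothetical helper (`legacy_scalar_text` is not a NumPy function) that strips the wrapper from the simple numeric and boolean forms in the announcement, so that old and new outputs can be compared. It deliberately leaves `np.void(...)` and other compound forms alone.

```python
import re

# Matches e.g. "np.float64(3.0)" or "np.uint8(5)": a dtype-like name wrapping
# a plain numeric literal. Compound forms such as np.void(...) do not match.
_SCALAR = re.compile(r"\bnp\.(?:float|int|uint)\d+\(([-+.\w]+)\)")
_BOOL = re.compile(r"\bnp\.(True|False)_")


def legacy_scalar_text(text: str) -> str:
    """Hypothetical helper: rewrite new-style scalar reprs to the bare value."""
    text = _SCALAR.sub(r"\1", text)
    return _BOOL.sub(r"\1", text)


print(legacy_scalar_text("np.float64(3.0)"))
print(legacy_scalar_text("[np.uint8(5), np.True_]"))
```

This is only a text-level comparison aid under the assumptions above. Updating the recorded outputs is the more direct fix once a project no longer supports the old representation.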
On Fri, Oct 28, 2022 at 10:57 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
> Hi all,
> As mentioned earlier, I would like to propose changing the representation of scalars in NumPy. Discussion and ideas on changes are much appreciated!
> The main change is to show scalars as:
> * `np.float64(3.0)` instead of just `3.0`
> * `np.True_` instead of `True`
> * `np.void((3, 5), dtype=[('a', '<i8'), ('b', 'u1')])` instead of `(3, 5)`
> * Use `np.` rather than `numpy.` for datetime/timedelta.
These all seem like good ideas to me, thanks for working on this Sebastian.
Cheers,
Ralf
On Fri, Oct 28, 2022, at 01:54, Sebastian Berg wrote:
> The main change is to show scalars as:
> * `np.float64(3.0)` instead of just `3.0`
> * `np.True_` instead of `True`
> * `np.void((3, 5), dtype=[('a', '<i8'), ('b', 'u1')])` instead of `(3, 5)`
> * Use `np.` rather than `numpy.` for datetime/timedelta.
I very much like the consistency of the `np` everywhere, compared to previous proposals. Thanks, Sebastian!
Stéfan
I like this. NumPy scalar printing is confusing to new users, who might think they are Python scalars. And even if you understand them, it's always been annoying that you have to do further introspection to see the dtype. I also like the longdouble change (the name float128 has misled me in the past), and the decision to make everything copy-paste round-trippable. Are there also plans to add "np." to array() and the string forms of other objects? Aaron Meurer On Fri, Oct 28, 2022 at 2:55 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
Hi all,
As mentioned earlier, I would like to propose changing the representation of scalars in NumPy. Discussion and ideas on changes are much appreciated!
The main change is to show scalars as:
* `np.float64(3.0)` instead of just `3.0` * `np.True_` instead of `True` * `np.void((3, 5), dtype=[('a', '<i8'), ('b', 'u1')])` instead of `(3, 5)` * Use `np.` rather than `numpy.` for datetime/timedelta.
This way it is clear for users that they are dealing with NumPy scalars which behave different from Python scalars. The `str()` that is given when using `print()` and the way arrays are shown will be unchanged.
The NEP draft can be found here:
https://numpy.org/neps/nep-0051-scalar-representation.html
and it includes more details and related changes.
The implementation is largely finished and can be found here:
https://github.com/numpy/numpy/pull/22449
W are fairly late in the release cycle and the change should not block other things. So, the aim is to merge it early in the next release cycle. That way downstream has time to fix documentation is wanted.
Depending on how discussion goes, I hope to formally propose the NEP fairly soon, so that the merging the implementation doesn't need to wait on NEP approval.
Cheers,
Sebastian
On Thu, 2022-09-08 at 11:38 +0200, Sebastian Berg wrote:
TL;DR: NumPy scalars representation is e.g. `34.3` instead of `float32(34.3)`. So the representation is missing the type information. What are your thoughts on changing that?
Hi all,
I am thinking about the next steps for NEP 50 (The NEP wants to fix the NumPy promotion rules, especially with respect to scalars):
https://numpy.org/neps/nep-0050-scalar-promotion.html
In relation to that, there was one point that Stéfan brought up previously.
The NumPy scalars (representation) currently print as numbers:
>>> np.float32(34.3) 34.3 >>> np.uint8(5) 5
That can already be confusing now. However, it gets more problematic if NEP 50 is introduced since the behavior between a Python `34.3` and `np.float32(34.3)` would differ more than it does now (please refer to the NEP).
The change would be that we should print as:
float64(34.3) (or similar?)
This Email is mainly to ask for any feedback or concern on such a change. I suspect we may have to write a very brief NEP about it.
If there is little concern, maybe we could move forward such a change promptly. Otherwise it could be moved forward together with NEP 50 and take effect in a "major" release [1].
Cheers,
Sebastian
[1] Note that for me, even a major release would hopefully not affect the majority of users or be very disruptive.
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: sebastian@sipsolutions.net
On Mon, Oct 31, 2022 at 12:51 PM Aaron Meurer <asmeurer@gmail.com> wrote:
I like this. NumPy scalar printing is confusing to new users, who might think they are Python scalars. And even if you understand them, it's always been annoying that you have to do further introspection to see the dtype. I also like the longdouble change (the name float128 has misled me in the past), and the decision to make everything copy-paste round-trippable.
Agreed, I am strongly supportive of the proposal in this NEP!
Are there also plans to add "np." to array() and the string forms of other objects?
Aaron Meurer
On Mon, 2022-10-31 at 13:49 -0600, Aaron Meurer wrote:
I like this. NumPy scalar printing is confusing to new users, who might think they are Python scalars. And even if you understand them, it's always been annoying that you have to do further introspection to see the dtype. I also like the longdouble change (the name float128 has misled me in the past), and the decision to make everything copy-paste round-trippable.
Are there also plans to add "np." to array() and the string forms of other objects?
I didn't include changing arrays here, since I thought I would focus on scalars only. It is clearly a plausible change though, and we could add it to this NEP, although I suspect it is just as well to do it separately.
Including the `np.` for scalars seemed clearer, but since it is pretty common to exclude modules in a repr, I would also be happy not to do it. Omitting it would make a bit of a presumption that NumPy is the obvious provider of `int32` (and that if others provide it, they are likely compatible).
- Sebastian
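For context on how common it is to leave the module out of a repr, the Python standard library does both. The sketch below uses only stdlib types and says nothing about what NumPy should choose; it just shows the two conventions side by side.

```python
import datetime
from decimal import Decimal
from fractions import Fraction

# These stdlib types omit the module name in their repr.
bare_names = [repr(Decimal("1.5")), repr(Fraction(1, 2))]

# datetime objects include the module name instead.
qualified = repr(datetime.date(2022, 10, 28))

# The qualified form can be pasted back after a plain `import datetime`.
pasted_back = eval(qualified, {"datetime": datetime})
```

The bare forms need `from decimal import Decimal` (or similar) before they can be pasted back, which is the same presumption about the provider discussed above.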
Hi all,
I would like to formally propose accepting NEP 51. Without any concern voiced, we will consider it accepted within 7 days.
As a reminder, this is to change the representation of NumPy scalars to be consistent and include the type name. That means the following representations:
np.float64(6.4) -> np.float64(6.4)
np.float64(np.nan) -> np.float64(nan)
rather than just `6.4` or `nan`. All scalars would follow this exact pattern of `np.<type_name>(value)`.
There are some further details, for these please check the full NEP:
https://numpy.org/neps/nep-0051-scalar-representation.html
For those interested in more details, a few notes:
* To implement the NEP, we need to update NumPy docs. I plan to automate this (mostly) and such automation should also help others. (I will make a brief note of this in the NEP.) Help with this automation would be greatly appreciated, since this is its own project.
* I am not sure that the underscored versions `np.str_` and `np.bool_` will be the correct names for long. If we adjust them, then this would propagate to the NEP.
* There are a few implementation details in the NEP, I don't mind adjusting them. But do wish to be pragmatic about progressing if there is no clearly formulated alternative.
* Clearly we can always adjust the printing conventions, e.g. whether to include the `np.` or whether NaN's should be `np.float64(nan)` or not. But bike-sheds happening now have a much better chance of being heard :).
1. The current NEP states that we use `np.str_` and `np.bytes_`. There is some chance that the top-level names could be changed, in that case the representation would change accordingly. (I consider this an adjustment we can do without the NEP.)
2. To properly implement the NEP, we need to automate some of the documentation changes necessary. This should also enable downstream to do the same or at least have a blueprint as a starting point. (Help with this work is greatly appreciated, since it is its own small project to hook into the doctest utilities.)
I plan on adding a brief note about helping with doc updates to the NEP when accepting it. Ross was planning to add a table of changed examples, although I don't think that is necessary for accepting.
Cheers,
Sebastian
On Fri, 2022-10-28 at 10:54 +0200, Sebastian Berg wrote:
<snip>
On Fri, Nov 25, 2022, at 08:33, Sebastian Berg wrote:
I would like to formally propose accepting NEP 51. Without any concern voiced, we will consider it accepted within 7 days.
+1
We should update the NEP to match any changes made later, like `np.str_` and `np.bool_`, so that we have a good reference document.
Stéfan
On Fri, Nov 25, 2022 at 9:36 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
Hi all,
I would like to formally propose accepting NEP 51. Without any concern voiced, we will consider it accepted within 7 days.
As a reminder, this is to change the representation of NumPy scalars to be consistent and include the type name. That means the following representations:
np.float64(6.4) -> np.float64(6.4)
np.float64(np.nan) -> np.float64(nan)
rather than just `6.4` or `nan`. All scalars would follow this exact pattern of `np.<type_name>(value)`.
There are some further details, for these please check the full NEP:
https://numpy.org/neps/nep-0051-scalar-representation.html
For those interested in more details, a few notes:
* To implement the NEP, we need to update NumPy docs. I plan to automate this (mostly) and such automation should also help others. (I will make a brief note of this in the NEP.) Help with this automation would be greatly appreciated, since this is its own project.
* I am not sure that the underscored versions `np.str_` and `np.bool_` will be the correct names for long. If we adjust them, then this would propagate to the NEP.
* There are a few implementation details in the NEP, I don't mind adjusting them. But do wish to be pragmatic about progressing if there is no clearly formulated alternative.
* Clearly we can always adjust the printing conventions, e.g. whether to include the `np.` or whether NaN's should be `np.float64(nan)` or not. But bike-sheds happening now have a much better chance of being heard :).
I always prefer things that have copy-pastability, so I would suggest something like np.float64('nan'). It looks like the NEP already does this for longdouble. I'm assuming for float-64 and lower one can guarantee round-tripping with the Python float literal.
The "np." also helps for copy-pastability for standard usage so that sounds fine too. It would be useful to allow it to be disabled or customized. For example, some libraries reuse NumPy dtype objects so they may want to replace "np." with their own library name, or just omit it. It wasn't clear to me if something like this is already part of the NEP or not.
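As a rough illustration of what a customizable prefix could look like, here is a small sketch. The function `format_scalar` and its `prefix` parameter are hypothetical names invented for this example; they are not part of NumPy or of the NEP, and `mylib.` is a made-up library name.

```python
def format_scalar(type_name, value, prefix="np."):
    """Hypothetical helper (not a NumPy API): build a repr-like string.

    `prefix` is the module alias to show; pass "" to omit it.
    """
    return f"{prefix}{type_name}({value!r})"


default_style = format_scalar("float64", 6.4)
other_library = format_scalar("float64", 6.4, prefix="mylib.")
no_prefix = format_scalar("float64", 6.4, prefix="")
```

A library that reuses the same type objects could then choose its own prefix or none at all, at the cost of one more knob to document.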
1. The current NEP states that we use `np.str_` and `np.bytes_`. There is some chance that the top-level names could be changed, in that case the representation would change accordingly. (I consider this an adjustment we can do without the NEP.)
2. To properly implement the NEP, we need to automate some of the documentation changes necessary. This should also enable downstream to do the same or at least have a blueprint as a starting point. (Help with this work is greatly appreciated, since it is its own small project to hook into the doctest utilities.)
A reusable script would be nice, since many projects are going to have doctests broken by this. I think there also may already be some existing tooling that just "fixes" doctests by making them match their output.
Aaron Meurer
I plan on adding a brief note on about helping with doc updates to NEP when accepting it. Ross was planning to add a table of changed examples, although I don't think that is necessary for accepting.
Cheers,
Sebastian
On Tue, 2022-11-29 at 14:51 -0700, Aaron Meurer wrote:
On Fri, Nov 25, 2022 at 9:36 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
Hi all,
I would like to formally propose accepting NEP 51. Without any concern voiced, we will consider it accepted within 7 days.
<snip>
* Clearly we can always adjust the printing conventions, e.g. whether to include the `np.` or whether NaN's should be `np.float64(nan)` or not. But bike-sheds happening now have a much better chance of being heard :).
I always prefer things that have copy-pastability, so I would suggest something like np.float64('nan'). It looks like the NEP already does this for longdouble.
It is a bit different for longdouble because for longdouble you should always put quotes anyway. Note that if we do that, we would also have to do it for:
array([1.0, nan, 2.0])
which currently doesn't print quotes. But then we need to include the dtype in the output, or that doesn't round-trip anymore (even with defining `nan`).
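The trade-off can be seen with plain Python lists and floats, used here only as a stand-in for arrays (an assumption made for illustration; no NumPy is involved). The sketch shows that a bare `nan` needs the name defined before pasting, while a quoted `'nan'` pastes fine but comes back as a string.

```python
import math

values = [1.0, float("nan"), 2.0]
text = repr(values)  # the bare word nan appears, without quotes

# Pasting the text back fails unless the name `nan` is defined first.
try:
    eval(text, {})
    pastes_without_nan = True
except NameError:
    pastes_without_nan = False

# With `nan` defined, the same text round-trips.
restored = eval(text, {"nan": float("nan")})
restored_is_nan = math.isnan(restored[1])

# Quoting makes the text valid on its own, but the entry comes back as a
# string, so explicit type information is needed to recover a float.
quoted = eval("[1.0, 'nan', 2.0]", {})
quoted_entry_type = type(quoted[1]).__name__
```

This mirrors the point above: quotes trade one round-trip problem (an undefined name) for another (a lost type).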
I'm assuming for float-64 and lower one can guarantee round-tripping with the Python float literal.
The "np." also helps for copy-pastability for standard usage so that sounds fine too. It would be useful to allow it to be disabled or customized. For example, some libraries reuse NumPy dtype objects so they may want to replace "np." with their own library name, or just omit it. It wasn't clear to me if something like this is already part of the NEP or not.
This changes printing of instances; classes always print as `numpy.uint8` and I am not planning on changing that. I added a way to format a scalar's repr when the dtype is known (i.e. the same way as it would be when calling `repr(array)`) in the NEP. I didn't attach it to the scalar though. Either way it feels a bit unwieldy, but I don't have a good idea for providing it, and `str()` basically does this also.
1. The current NEP states that we use `np.str_` and `np.bytes_`. There is some chance that the top-level names could be changed, in that case the representation would change accordingly. (I consider this an adjustment we can do without the NEP.)
2. To properly implement the NEP, we need to automate some of the documentation changes necessary. This should also enable downstream to do the same or at least have a blueprint as a starting point. (Help with this work is greatly appreciated, since it is its own small project to hook into the doctest utilities.)
A reusable script would be nice, since many projects are going to have doctests broken by this. I think there also may already be some existing tooling that just "fixes" doctests by making them match their output.
Yeah, such a tool should be good enough in practice, do you know where to find it? Otherwise hacking a doctest helper seems very possible.
- Sebastian
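A doctest helper of the kind mentioned here could be sketched with the standard library's `doctest` parser. This is only an illustration under stated assumptions: `refresh_doctest_output` and the stand-in class `Tagged` are hypothetical names, the text is assumed to have no common leading indentation, and exceptions, option flags and ellipsis handling are ignored. It is not the tooling referred to in the thread.

```python
import contextlib
import doctest
import io


def refresh_doctest_output(text, namespace=None):
    """Re-run each doctest example in `text` and rewrite its expected output."""
    namespace = dict(namespace or {})
    rebuilt = []
    for piece in doctest.DocTestParser().parse(text):
        if isinstance(piece, str):
            rebuilt.append(piece)  # prose between examples is kept verbatim
            continue
        captured = io.StringIO()
        with contextlib.redirect_stdout(captured):
            # "single" mode echoes expression values via repr(), like the REPL.
            exec(compile(piece.source, "<doctest>", "single"), namespace)
        pad = " " * piece.indent
        source_lines = piece.source.rstrip("\n").split("\n")
        prompts = [">>> "] + ["... "] * (len(source_lines) - 1)
        for prompt, line in zip(prompts, source_lines):
            rebuilt.append(f"{pad}{prompt}{line}\n")
        for line in captured.getvalue().splitlines():
            rebuilt.append(f"{pad}{line}\n")  # fresh output replaces the old
    return "".join(rebuilt)


class Tagged(float):
    """Hypothetical stand-in for a scalar whose repr gained a type name."""

    def __repr__(self):
        return f"np.float64({float.__repr__(self)})"


stale = "Old docs:\n>>> value\n3.0\n"
fresh = refresh_doctest_output(stale, {"value": Tagged(3.0)})
```

Blindly rewriting expected output also hides genuine regressions, so the resulting diff would still need a human review.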
On Fri, 2022-10-28 at 10:54 +0200, Sebastian Berg wrote:
Hi all,
As mentioned earlier, I would like to propose changing the representation of scalars in NumPy. Discussion and ideas on changes are much appreciated!
The main change is to show scalars as:
* `np.float64(3.0)` instead of just `3.0`
* `np.True_` instead of `True`
* `np.void((3, 5), dtype=[('a', '<i8'), ('b', 'u1')])` instead of `(3, 5)`
* Use `np.` rather than `numpy.` for datetime/timedelta.
It has been a while, and I think it is now time to accept the NEP. Unless there are any serious concerns, I will consider it accepted in a week.
There was always one piece missing, which I now (mostly) have: auto-fixing the documentation of NumPy and hopefully at least a good chunk of downstream, based on:
https://github.com/scientific-python/pytest-doctestplus
This might not be available quite immediately depending on timing, but it only affects doc-testing and presumably no one is doing that against the nightlies anyway.
The one small concern was about not being able to copy-paste `np.float64(nan)` without quotes around the nan. I had answered that in the last email of the thread: it is just more pragmatic, because an array representation that includes the quotes isn't ideal:
array([1.0, nan]) vs array([1.0, 'nan'])
(where the latter doesn't round-trip without `dtype=`!)
participants (11)
- Aaron Meurer
- Andrew Nelson
- Evgeni Burovski
- Kevin Sheppard
- Matti Picus
- Peter Cock
- Ralf Gommers
- Sebastian Berg
- Stefan van der Walt
- Stephan Hoyer
- Warren Weckesser