(sorry for the length, details/discussion below)
On the triage call, there seemed to be a preference to just skip the
deprecation and introduce `copy="never"`, `copy="if_needed"`, and
`copy="always"` (i.e. string options for the `copy` keyword argument).
Strictly speaking, this is against the typical policy (one year of
warnings/errors). But nobody could think of a realistic case where
anyone actually passes these strings today. (For me, "policy" alone
would be enough of an argument to take it slow.)
BUT: If nobody has *any* concerns at all, I think we may just end up
introducing the change right away.
The PR is: https://github.com/numpy/numpy/pull/19173
## The Feature
There is the idea to add `copy=never` (or similar). This would modify
the existing `copy` argument to make it a 3-way decision:
* `copy=always` or `copy=True` to force a copy
* `copy=if_needed` or `copy=False` to prefer no-copy behavior
* `copy=never` to error when no-copy behavior is not possible
(this ensures that a view is returned)
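As a rough illustration of the three-way decision, here is a minimal sketch
that emulates it on top of the existing boolean `copy` argument of
`ndarray.astype`. The helper name `astype_with_mode` and its `mode`
parameter are made up for this example and are not part of NumPy or of the
PR; it relies only on the documented behavior that `astype(..., copy=False)`
returns the input array itself when no conversion is needed.

```python
import numpy as np


def astype_with_mode(arr, dtype, mode="if_needed"):
    """Hypothetical helper emulating the proposed three-way `copy` decision."""
    if mode == "always":
        # Force a copy, like copy=True today.
        return arr.astype(dtype, copy=True)

    # With copy=False, astype returns the input array itself when the
    # requested dtype is already satisfied, and a new array otherwise.
    result = arr.astype(dtype, copy=False)

    if mode == "if_needed":
        return result
    if mode == "never":
        if result is not arr:
            raise ValueError("a copy would be required, but mode='never'")
        return result
    raise TypeError(f"unknown copy mode: {mode!r}")


a = np.arange(3, dtype=np.int64)
same = astype_with_mode(a, np.int64, mode="never")        # no copy needed
converted = astype_with_mode(a, np.float64, mode="if_needed")  # copies
```

Asking for `mode="never"` together with a dtype change raises in this
sketch, which is the "error when no-copy behavior is not possible" case.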
This would affect the functions:
* np.array(object, copy=...)
* arr.astype(new_dtype, copy=...)
* np.reshape(arr, new_shape, copy=...), and the method arr.reshape()
* np.meshgrid and possibly
Where `reshape` currently does not have the option and would benefit by
allowing for `arr.reshape(-1, copy=never)`, which would guarantee a
## The Options
We have three options that are currently being discussed:
1. We introduce a new `np.CopyMode` or `np.<something>.Copy` Enum
with values `np.CopyMode.NEVER`, `np.CopyMode.IF_NEEDED`, and
* Plus: No compatibility concerns
* Downside(?): This would be a first in NumPy, and is therefore
atypical API.
2. We introduce `copy="never"`, `copy="if_needed"` and `copy="always"`
as strings (all other strings will be a `TypeError`):
* Problem: `copy="never"` currently means `copy=True` (the opposite)
Which means new code has to take care when it may run on
older NumPy versions. And in theory could make old code
return the wrong thing.
* Plus: Strings are the typical way to pass options in NumPy currently.
3. Same as 2. But we take it very slow: Make strings an error right now
and only introduce the new options after two releases as per typical
We discussed it briefly today in the triage call and we were leaning
I was honestly expecting to converge to option 3 to avoid compatibility
issues (mainly surprises with `copy="never"` on older versions).
But considering how weird it is to currently pass `copy="never"`, the
question was whether we should not change it with a release note.
The probability of someone currently passing exactly one of those three
(and no other) strings seems exceedingly small.
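To make the compatibility concern concrete: if `copy` is interpreted as a
plain boolean, every non-empty string is truthy, so all three proposed
strings would be read as "copy". The function below is only a stand-in for
that boolean interpretation (its name is made up for this example); it does
not call NumPy at all.

```python
def legacy_copy_flag(copy):
    """Stand-in for a release that treats `copy` as a plain boolean."""
    return bool(copy)


for value in ("never", "if_needed", "always"):
    # Every non-empty string is truthy, including "never".
    print(f"copy={value!r} is read as copy={legacy_copy_flag(value)}")
```

So code written for the new string options would silently copy when run
against such a release, rather than raising.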
Personally, I don't have much of an opinion. But if *nobody* voices
any concern about just changing the meaning of the string inputs, I
think the current default may be to just do it.
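For comparison, option 1 could look roughly like the sketch below, written
with the standard library `enum` module. The member values, the `ALWAYS`
member (mirroring the `copy="always"` string option), and the
`normalize_copy` helper are assumptions for illustration only and are not
taken from the PR.

```python
import enum


class CopyMode(enum.Enum):
    """Hypothetical enum; the member values here are an assumption."""
    ALWAYS = "always"
    IF_NEEDED = "if_needed"
    NEVER = "never"


def normalize_copy(copy):
    """Map the existing boolean values and the enum onto one type."""
    if isinstance(copy, CopyMode):
        return copy
    if isinstance(copy, str):
        # Strings are rejected, so there is no clash with old truthiness.
        raise TypeError("strings are not accepted for `copy`")
    return CopyMode.ALWAYS if copy else CopyMode.IF_NEEDED


print(normalize_copy(True), normalize_copy(False), normalize_copy(CopyMode.NEVER))
```

Because the enum is a new object, older releases would fail on the missing
attribute instead of silently misreading the argument.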
NumPy v1.21.2 <https://github.com/numpy/numpy/releases/tag/v1.21.2> added
support for windows/arm64 platforms but we still don't have any systems in
place to produce binary wheels or test win/arm64 packages. I think it will
be good to start looking into this. CPython has an official buildbot worker
running for win/arm64, and official Python support for the platform will be
available from the 3.11 release.
It is not yet clear to me how the build and CI system for numpy is deployed
and how to enable support for a new platform like win/arm64.
One of the main obstacles to supporting win/arm64 builds is the lack of
win/arm64 VMs available in the cloud. But I see we have been producing
binary wheels for Apple M1 platforms on PyPI and in conda repositories for
some time, and that platform also lacks cloud VM support. I think we could
take some lessons from the Apple M1 support and look at how a similar
strategy can be used for win/arm64.
I would like to hear if anyone has any thoughts on this topic. Also, any
pointers to understand numpy wheel generation and CI flow for similar
platforms would be helpful as well.
In regard to Feature Request: https://github.com/numpy/numpy/issues/16469
It was suggested that I send this to the mailing list. I think I can make a
strong case for why supporting this naming convention makes sense: it would
follow other frameworks that are often used alongside NumPy, such as
TensorFlow. For backward compatibility, it can simply be an alias
I often convert portions of code between tf and np; it is usually as simple
as changing the base module, e.g. np.expand_dims -> tf.expand_dims. This is
done either when debugging (e.g. converting tf to np to debug a portion of
the code without eager execution) or when prototyping (e.g. developing in
numpy and converting to tf).
More than once I have run into errors because of this particular function,
np.concatenate. The name is unnecessarily long. I imagine other people run
into the same problem. Pandas uses concat (torch, at the other extreme,
uses simply cat, which I don't think is as descriptive).
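Until such an alias exists everywhere, a user-side shim gives the same
spelling. This is only a sketch of the idea: it uses `np.concat` when the
installed NumPy provides that attribute and otherwise falls back to
`np.concatenate`; the module-level name `concat` is chosen for this example.

```python
import numpy as np

# Use the short name if this NumPy provides it, else alias the long one.
concat = getattr(np, "concat", np.concatenate)

joined = concat([np.array([1, 2]), np.array([3, 4])])
print(joined)
```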
FYI, I noticed this package that claimed to be maintained by us:
https://pypi.org/project/numpy-aarch64/. That's not ours, so I tried to
contact the author (no email provided, but I guessed the same username on
GitHub) and asked them to remove it:
There are a very large number of packages with "numpy" in the name on PyPI,
and there's no way we can audit/police that effectively, but if it's a
rebuild that pretends like it's official then I think it's worth doing
something about. It could contain malicious code for all we know.
As of today, our participation in the Google Season of Docs program for
2021 has ended. You can see the case study detailing the work done and some
key results in the following link:
There is one final tutorial in review, but overall the project has been
I want to personally thank Mukulika Pahari for her hard work and excellent
contributions. She was able to quickly produce relevant documentation on
subjects that are not easy or simple. Well done! We hope you stick around
and continue working with us :)
I also want to thank Ross Barnowski for co-mentoring and all the other
maintainers who helped with ideas and reviews, and I hope we can
participate again next year.
It would be nice to be able to use the Python syntax we already use to
format the precision of floating numbers in numpy:
>>> a = np.array([-np.pi, np.pi])
This is particularly useful when you have large arrays. The problem is
that this is not implemented today:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported format string passed to numpy.ndarray.__format__
In this PR (https://github.com/numpy/numpy/pull/19550) I propose a very
basic formatting implementation for numeric types that uses
`array2string`, just like `str` currently does.
At first, since we are only considering formatting numeric types,
floating-point numbers specifically, we are only interested in being able
to change the precision, the sign, and possibly the rounding or truncation.
Since the `array2string` function already does everything we need, we only
need to implement the `__format__` method of the `ndarray` class, which
parses a predefined format (similar to the one already used by Python for
built-in data types) to specify the parameters mentioned above.
I propose a mini format specification inspired by the [Format Specification
format_spec ::= [sign][.precision][type]
sign ::= "+" | "-" | " "
precision ::= [0-9]+
type ::= "f" | "e"
We are going to consider only 3 arguments of the `array2string` function:
`precision`, `suppress_small`, `sign`. In particular, the `type` token sets
the `suppress_small` argument to True when the type is `f` and False when
it is `e`. This is in order to mimic Python's behavior in truncating
decimals when using the fixed-point notation.
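The mapping described above can be sketched as a standalone function. This
is not the code from the PR: the name `format_array`, the regular
expression, and the choice of `ValueError` for an unparsable spec are
assumptions for illustration. It parses `[sign][.precision][type]` and
forwards the three values to `np.array2string`.

```python
import re

import numpy as np

# [sign][.precision][type], every part optional, as in the grammar above.
_SPEC = re.compile(r"(?P<sign>[+\- ])?(?:\.(?P<precision>[0-9]+))?(?P<type>[fe])?")


def format_array(a, spec=""):
    """Hypothetical stand-in for an ndarray.__format__ implementation."""
    if spec == "":
        # An empty spec behaves like str(a).
        return str(a)
    match = _SPEC.fullmatch(spec)
    if match is None:
        raise ValueError(f"invalid format specifier {spec!r}")
    precision = match["precision"]
    kind = match["type"]
    return np.array2string(
        a,
        precision=None if precision is None else int(precision),
        # "f" turns suppress_small on, "e" turns it off, no type keeps the default.
        suppress_small=None if kind is None else (kind == "f"),
        sign=match["sign"],
    )


a = np.array([-np.pi, np.pi])
print(format_array(a, "+.3f"))
```

Passing `None` for a missing part leaves the corresponding `array2string`
option at its current print-options default.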
As @brandon-rhodes said in gh-5543, when you try to format an array
containing Python objects, the behavior should be the same as what Python
implements by default in the `object` class: `format(a, "")` should be
equivalent to `str(a)` and `format(a, "not empty")` should raise an
What remains to be defined is the behavior when trying to format an array
with a non-numeric data type (`np.numeric`) other than `np.object_`. Should
we raise an exception? In my opinion yes, since if formatting is extended
in the future -- for example, for dates -- people are aware that before that was
I'm open to suggestions.
`np.ndenumerate` does not work well for masked arrays (like many
main namespace functions, it simply ignores/drops the mask).
There is a PR (https://github.com/numpy/numpy/pull/20020) to add a
version of it to `np.ma` (masked array specific). We thought it seemed
reasonable and were planning on merging it.
This version skips all masked elements. An alternative could be to
return `np.ma.masked` for masked elements?
So if anyone thinks that may be the better solution, please send a
(Personally, I don't have opinions on masked arrays for the most part.)
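To compare the two behaviors side by side, here is a small sketch written
as a plain generator. It is not the code from the PR; the function name
`masked_ndenumerate` and its `skip_masked` flag are invented for this
example. It uses only `np.ma.getdata`, `np.ma.getmaskarray`, and
`np.ndenumerate`.

```python
import numpy as np


def masked_ndenumerate(a, skip_masked=True):
    """Hypothetical helper: enumerate a masked array with either behavior."""
    data = np.ma.getdata(a)
    mask = np.ma.getmaskarray(a)  # always a full boolean array
    for index, value in np.ndenumerate(data):
        if not mask[index]:
            yield index, value
        elif not skip_masked:
            # Alternative behavior: keep the position, report it as masked.
            yield index, np.ma.masked


arr = np.ma.array([1, 2, 3], mask=[False, True, False])
skipped = list(masked_ndenumerate(arr))
kept = list(masked_ndenumerate(arr, skip_masked=False))
```

With skipping, the caller sees two items; with the alternative, all three
positions are visited and the middle value is `np.ma.masked`.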
If you have an array built up out of method chaining, sometimes you need to filter it at the very end. This can be annoying because it means you have to create a temporary variable just so you can refer to it in the indexing square brackets:
_temp = long_and_complicated_expression()
result = _temp[_temp >= 0]
You could also use the walrus operator but this is odd looking and it still pollutes the namespace:
result = (_temp := long_and_complicated_expression())[_temp >= 0]
What I would like is to be able to use a lambda inside the indexing square brackets, which would take the whole array as an argument and give a boolean array:
result = long_and_complicated_expression()[lambda arr: arr >= 0]
I should emphasize that the lambda gets the entire array as its argument and returns an entire mask array of bools. It isn't like the `map` and `filter` builtins, where the Python function would be called once for each element and thus be slow.
Pandas already has something similar; you can pass a lambda into `.loc` that takes a Series and returns a boolean indexer.
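The requested behavior can be tried out today with a small `ndarray` subclass. This is only a sketch under the assumption that a callable key should be called once with the whole array; the class name `CallableIndexArray` is invented for this example and nothing like it exists in NumPy.

```python
import numpy as np


class CallableIndexArray(np.ndarray):
    """Hypothetical subclass that accepts a callable inside the brackets."""

    def __getitem__(self, key):
        if callable(key):
            # Call the function once with the whole array to build the index.
            key = key(self)
        return super().__getitem__(key)


values = np.arange(-3, 4).view(CallableIndexArray)
result = values[lambda arr: arr >= 0]
print(result)
```

The lambda is evaluated a single time on the full array, so the filtering itself stays vectorized.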