Hello all,
I'm only two weeks late with this message about my pull request
<https://github.com/numpy/numpy/pull/18707> that adds functions that allow
the user to pad to a target shape, instead of adding a certain amount to
each axis.
For example:
x = np.ones((3, 3))
# We want an output shape of (10, 10)
padded = np.pad_to_shape(x, (10, 10))
print(padded.shape)
prints: (10, 10)
Whereas with the current implementation of np.pad, you would first need to
figure out the difference in axis sizes yourself; the proposed function
performs this step for you. Since the function delegates to np.pad
internally, I made sure its signature includes the arguments np.pad uses
for padding modes.
Finally, I've added a logical extension of this function: pad_to_match,
which takes another array and pads the input array to match its shape.
These calls would be equivalent:
x = np.ones((3, 3))
y = np.zeros((10, 10))
padded_to_shape = np.pad_to_shape(x, y.shape)
padded_to_match = np.pad_to_match(x, y)
For a description of additional functionality, I refer you to the pull request.
I'm not too familiar with mailing lists, so I hope this is how things work.
Kind Regards,
Mathijs
I have submitted NEP 49 to enable user-defined allocation strategies for
the ndarray.data homogeneous memory area. The implementation is in PR 17582
<https://github.com/numpy/numpy/pull/17582>. Here is the text of the NEP:
Abstract
--------

The ``numpy.ndarray`` requires additional memory allocations
to hold ``numpy.ndarray.strides``, ``numpy.ndarray.shape`` and
``numpy.ndarray.data`` attributes. These attributes are specially allocated
after creating the Python object in the ``__new__`` method. The ``strides``
and ``shape`` are stored in a piece of memory allocated internally.
This NEP proposes a mechanism to override the memory management strategy
used for ``ndarray->data`` with user-provided alternatives. This allocation
holds the array's data and can be very large. As accessing this data often
becomes a performance bottleneck, custom allocation strategies that
guarantee data alignment or pin allocations to specialized memory hardware
can enable hardware-specific optimizations.
Motivation and Scope
--------------------

Users may wish to override the internal data memory routines with ones of
their own. Two such use-cases are to ensure data alignment and to pin
certain allocations to certain NUMA cores.
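As context for the alignment use-case, here is a common present-day workaround (my own sketch, not part of the NEP): over-allocate a byte buffer and slice it at an aligned offset. A custom allocator under this NEP could hand out aligned memory directly instead. ``aligned_zeros`` is a hypothetical helper, not a NumPy function.

```python
import numpy as np

def aligned_zeros(n, dtype=np.float64, align=64):
    # Over-allocate a raw byte buffer, then slice it so the view's data
    # pointer lands on an `align`-byte boundary.
    itemsize = np.dtype(dtype).itemsize
    buf = np.zeros(n * itemsize + align, dtype=np.uint8)
    offset = (-buf.ctypes.data) % align
    return buf[offset:offset + n * itemsize].view(dtype)

a = aligned_zeros(1000)
assert a.ctypes.data % 64 == 0  # data pointer is 64-byte aligned
```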
Users who wish to change the NumPy data memory management routines will use
:c:func:`PyDataMem_SetHandler`, which uses a :c:type:`PyDataMem_Handler`
structure to hold pointers to the functions used to manage the data memory.
The calls are wrapped by internal routines that call
:c:func:`PyTraceMalloc_Track` and :c:func:`PyTraceMalloc_Untrack`, and will
use the :c:func:`PyDataMem_EventHookFunc` mechanism already present in
NumPy for auditing purposes.
Since a call to ``PyDataMem_SetHandler`` will change the default functions,
but that function may be called during the lifetime of an ``ndarray``
object, each ``ndarray`` will carry with it the ``PyDataMem_Handler``
struct used at the time of its instantiation, and this will be used to
reallocate or free the data memory of the instance. Internally, NumPy may
use ``memcpy`` or ``memset`` on the data ``ptr``.
Usage and Impact
----------------

The new functions can only be accessed via the NumPy C-API. An example is
included later in the NEP. The added ``struct`` will increase the size of
the ``ndarray`` object; this is one of the major drawbacks of this
approach. We can be reasonably sure that the change in size will have a
minimal impact on end-user code, because NumPy version 1.20 already changed
the object size.
Backward compatibility
----------------------

The design will not break backward compatibility. Projects that were
assigning to the ``ndarray->data`` pointer were already breaking the
current memory management strategy (backed by ``npy_alloc_cache``) and
should restore ``ndarray->data`` before calling ``Py_DECREF``. As mentioned
above, the change in size should not impact end-users.
Matti
All,
I am excited to announce the release of MyGrad 2.0.
MyGrad's primary goal is to make automatic differentiation accessible and
easy to use across the NumPy ecosystem (see [1] for more detailed comments).
Source: https://github.com/rsokl/MyGrad
Docs: https://mygrad.readthedocs.io/en/latest/
MyGrad's only dependency is NumPy, and (as of version 2.0) it makes keen
use of NumPy's excellent protocols for overriding functions and ufuncs.
Thus you can "drop in" a mygrad tensor into your pure NumPy code and
compute derivatives through it.
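This "drop in" behavior rests on NumPy's dispatch protocols. As a minimal illustration of the mechanism only (this is not MyGrad's actual code, and the ``Tagged`` class is hypothetical), any class defining ``__array_function__`` can intercept NumPy functions called on its instances:

```python
import numpy as np

class Tagged:
    # Minimal array-like wrapper that intercepts NumPy functions.
    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_function__(self, func, types, args, kwargs):
        # Unwrap Tagged arguments, call the real NumPy function, re-wrap.
        unwrapped = [a.data if isinstance(a, Tagged) else a for a in args]
        return Tagged(func(*unwrapped, **kwargs))

t = Tagged([1.0, 2.0, 3.0])
out = np.sum(t)  # dispatches to Tagged.__array_function__
print(out.data)  # 6.0
```

An autodiff library can use the same hook to record the operations it sees and later walk them backwards to compute derivatives.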
Ultimately, MyGrad could be extended to bring autodiff to other array-based
libraries like CuPy, Sparse, and Dask.
For full release notes see [2]. Feedback, critiques, and ideas are welcome!
Cheers,
Ryan Soklaski
[1] MyGrad is not meant to "compete" with the likes of PyTorch and JAX,
which are fantastically fast and powerful autodiff libraries. Rather, its
emphasis is on being lightweight and seamless to use in NumPy-centric
workflows.
[2] https://mygrad.readthedocs.io/en/latest/changes.html#v200
Hi all,
Our bi-weekly triage-focused NumPy development meeting is on Wednesday,
April 21st at 11 am Pacific Time (18:00 UTC).
Everyone is invited to join in and edit the work-in-progress meeting
topics and notes:
https://hackmd.io/68i_JvOYQfy9ERiHgXMPvg
I encourage everyone to notify us of issues or PRs that you feel should
be prioritized, discussed, or reviewed.
Best regards
Sebastian
Hi,
I am new to contributing to open source projects and am not sure where to
begin (maybe by emailing this list?). In any case, the main improvement I
would like to work on is adding multivariate polynomials/differentials to
NumPy. I would love any insight into the current status and intentions of
NumPy's polynomial implementations.
1) Why do we have np.polynomial and np.lib.polynomial? Which one is older?
Is there a desire to see these merged?
2) Why does np.polynomial.Polynomial have a domain and a window? As far as
I can tell they have no implementation (except for the Chebyshev
interpolation); what is their intended use? And why is there no variable
representation implemented like in np.lib.polynomial, and why is
np.polynomial not installed into the numpy namespace directly via
@set_module('numpy')?
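On the domain/window question, one concrete place where they do come into play is Polynomial.fit: the data's x-range (the domain) is linearly mapped onto the window before fitting, which keeps the numbers fed to the least-squares solver well conditioned. A small sketch:

```python
import numpy as np
from numpy.polynomial import Polynomial

# Fit a line over x in [0, 100]; internally the fit maps this range onto
# the default window [-1, 1] for numerical stability.
x = np.linspace(0.0, 100.0, 50)
y = 2.0 * x + 1.0
p = Polynomial.fit(x, y, deg=1)

print(p.domain)  # [  0. 100.]  -- taken from the data
print(p.window)  # [-1.  1.]    -- the default fitting window
assert np.allclose(p(x), y)  # evaluation maps x through domain -> window
```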
3) How many improvements are allowed to be done all at once? Is there an
expectation of waiting before improving implementations or code?
Thinking further down the road, how much is too much to add to NumPy? I
would like to see multivariate polynomials implemented, but after that it
would be nice to have a collection of known polynomials like the standard,
Pochhammer, cyclotomic, Hermite, Laguerre, Legendre, Chebyshev, and Jacobi
polynomials. After that, it would be of interest to extend .diff() to
really be a differential operator (implemented as sequences of
polynomials) and to allow differential constructions. Then things like
this would be possible:
x = np.poly1d([1, 0])
D = np.diff1d([1, 0])
assert Hermite(78) == (2*x - D)(Hermite(77))
Or if series were implemented then we would see things like:
assert Hermite(189) == (-1)**189 * Exp(x**2)*(D**189)(Exp(-x**2))
assert Hermite(189) == Exp(-D**2/4)((2*x)**189)
4) At what point does NumPy polynomial material belong in SciPy, and vice
versa? Is NumPy considered the backend to SciPy, or vice versa? At times
it seems like things are implemented twice, in both NumPy and SciPy; but
in general, should we avoid implementing things in NumPy that already
exist in SciPy?
Thanks,
Robert

Sent from: http://numpydiscussion.10968.n7.nabble.com/
1. Is there a technical reason for `choose` not to accept a `dtype` argument?
2. Separately, mypy is unhappy with my 2nd argument to `choose`:
Argument 2 to "choose" has incompatible type "Tuple[int, Sequence[float]]";
expected "Union[Union[int, float, complex, str, bytes, generic], Sequence[Union[int, float, complex, str, bytes, generic]], Sequence[Sequence[Any]],_SupportsArray]"
However, `choose` is happy to have e.g. `choices=(0,seq)` (and I hope it will remain so!).
E.g.,
a = (0, 1, 1, 0, 0, 0, 1, 1)  # binary array
np.choose(a, (0, range(8)))  # array([0, 1, 2, 0, 0, 0, 6, 7])
Thanks, Alan Isaac
Hello everyone!
I am Mukulika Pahari, a Computer Engineering student from India. I wanted
to implement the following ideas for Google Season of Docs 2021 if given
the opportunity. I have referred to both the GSoD proposal
<https://github.com/numpy/numpy/wiki/GoogleSeasonofDocs2021ProjectIdea…>
and the NEP 44
<https://numpy.org/neps/nep-0044-restructuring-numpy-docs.html> proposal.
Please give me feedback about the ideas!
1. Reorganising the contents of the documentation into Reference Guide,
   How-Tos, Tutorials and Explanations, as per the structure proposed in
   NEP 44 <https://numpy.org/neps/nep-0044-restructuring-numpy-docs.html>:
   - Auditing the existing documentation
   - Clearing misplaced content, for example Explanations in the Reference
     Guide, How-Tos in Explanations, etc.
   - Establishing distinct Reference, How-Tos, Tutorials and Explanations
     sections, with cross-linking where required
2. Reorganising the landing page of the NumPy docs
   <https://numpy.org/devdocs/>:
   - Moving the documentation structure to the left sidebar
   - Making the NumPy Quickstart the first thing people see when they land
     on the documentation
3. Writing new must-have tutorials (based on the most-searched tutorials
   on Google):
   - A "How to write a tutorial" guide
   - 3 beginner tutorials
   - 3 intermediate tutorials
   - 3 advanced tutorials
4. Writing How-Tos based on the most-used functions, the most-asked
   questions on StackOverflow, etc.
5. Revamping the User Guide:
   - Updating out-of-date references and refactoring content to follow the
     latest best practices
   - Adding images and graphics to enhance the textual explanations
   - Removing duplication to improve searchability
I have proposed this work keeping in mind 30-ish weeks of work, including
a few weeks for becoming familiar with the organisation and for ironing
out details like exactly which tutorials and how-tos to write. Please let
me know if I can aim to achieve more in this timeframe.
My experience with NumPy is currently limited to one data analysis
project, but I would love to learn more about its applications while
restructuring and developing its docs!
Thank you for your time.
Hi all,
I recently noticed that taking the absolute value of complex arrays is
very slow (about a factor of five slower than some straightforward
implementations). I would like to understand why, but could not trace the
code.
Please, consider these timings:
In [67]: z = np.random.random(10000) + 1j*np.random.random(10000)
In [68]: %timeit np.abs(z)**2
215 µs ± 2.09 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [69]: %timeit ((np.sqrt((z.real**2 + z.imag**2)))**2)
78.2 µs ± 193 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [70]: %timeit (z.real**2 + z.imag**2)
40.1 µs ± 196 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [71]: %timeit (z.conjugate()*z).real
43.7 µs ± 230 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Even accounting for the square root and/or the additional square does not
explain the time spent:
In [72]: %timeit np.abs(z)
206 µs ± 970 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [86]: %timeit np.sqrt(z.real**2 + z.imag**2)
70.1 µs ± 303 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [73]: %timeit np.sqrt((z.conjugate()*z).real)
105 µs ± 2.23 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
I could not follow the code to understand what calculations are taking
place for `abs()`. Any ideas?
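One guess (an assumption on my part, not traced through the source): complex abs typically guards against intermediate overflow/underflow, in the style of C's cabs()/hypot(), and that extra care costs time. A small sketch contrasting the naive formula with np.hypot:

```python
import numpy as np

z = np.random.random(10000) + 1j * np.random.random(10000)

# Naive formula: the squares can overflow/underflow for extreme components.
naive = np.sqrt(z.real**2 + z.imag**2)

# hypot rescales internally to avoid that, like a careful cabs().
safe = np.hypot(z.real, z.imag)

assert np.allclose(naive, np.abs(z))
assert np.allclose(safe, np.abs(z))

# Where the extra care matters: the naive form overflows, abs() does not.
big = np.array(1e200 + 1e200j)
print(np.sqrt(big.real**2 + big.imag**2))  # inf
print(np.abs(big))                         # ~1.41e+200
```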
Regards, Juan
Hi All,
https://numpy.org/devdocs/reference/random/c-api.html has an inaccurate
description of the API that goes as follows:
"zig in the name are based on a ziggurat lookup algorithm is used instead
of calculating the log, which is significantly faster. The non-ziggurat
variants are used in corner cases and for legacy compatibility."
It appears this is no longer the case; instead there are functions that
have `inv` in the name to signal the use of the inverse method for
sampling. I submitted a PR to reflect this, and also added the missing
function signatures, at https://github.com/numpy/numpy/pull/18797/files
Regards
