
On Thu, 2022-01-20 at 14:41 +0100, Francesc Alted wrote:
On Wed, Jan 19, 2022 at 7:48 PM Francesc Alted <faltet@gmail.com> wrote:
On Wed, Jan 19, 2022 at 6:58 PM Stanley Seibert <sseibert@anaconda.com> wrote:
Given that this seems to be Linux only, is this related to how glibc does large allocations (>128kB) using mmap()?
That's a good point. As MMAP_THRESHOLD is 128 KB, and the size of `z` is almost 4 MB, the mmap machinery is probably getting involved here. Also, as pages acquired via anonymous mmap are not actually allocated until you access them for the first time, that would explain why the first access is slow. What puzzles me is that the timeit loops access the `z` data 3*10000 times, which is plenty of time for doing the allocation (it should require just a single iteration).
I think I have more evidence that what is happening here has to do with how the malloc mechanism works in Linux. I find the following explanation to be really good:
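The first-touch behaviour described above can be observed directly from Python. This is only a sketch using the standard library, not the code from the benchmark: it maps an anonymous private region of roughly the size of `z` and counts the minor page faults reported by `resource.getrusage` while touching one byte per page, twice. The helper names (`minor_faults`, `touch_every_page`, `first_touch_faults`) are made up for illustration, and the exact counts depend on the kernel (for example on transparent huge pages), so none are shown here.

```python
import mmap
import resource


def minor_faults():
    """Minor page faults (no disk I/O) incurred by this process so far."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def touch_every_page(buf, page_size):
    """Write one byte in every page of buf."""
    for offset in range(0, len(buf), page_size):
        buf[offset] = 1


def first_touch_faults(size):
    """Return (faults on first pass, faults on second pass) over a fresh
    anonymous private mapping of `size` bytes."""
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        start = minor_faults()
        touch_every_page(buf, mmap.PAGESIZE)  # pages get backed by memory here
        middle = minor_faults()
        touch_every_page(buf, mmap.PAGESIZE)  # pages are already resident
        end = minor_faults()
    finally:
        buf.close()
    return middle - start, end - middle


# About the size of `z` in the thread: 250000 complex128 values.
first, second = first_touch_faults(4_000_000)
print(f"first pass: {first} minor faults, second pass: {second} minor faults")
```

If the explanation holds, the first pass should account for most of the faults and the second pass for close to none.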
Thanks for figuring this out! It has been bugging me a lot before. So it rather depends on how `malloc` works, and not the kernel. It is surprising how "random" this can look, but I suppose some examples just happen to sit almost exactly at the threshold. It might be interesting if we could tweak `mallopt` parameters for typical NumPy usage. But unless it is very clear, maybe a standalone module is better?
In addition, this excerpt of the mallopt manpage (https://man7.org/linux/man-pages/man3/mallopt.3.html) is very significant:
<snip>
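As a rough illustration of what tweaking `mallopt` parameters from Python might look like, here is a sketch that calls glibc's `mallopt` through `ctypes`. The constants are the values from glibc's `<malloc.h>`; the helper name `set_malloc_thresholds` and the 8 MiB value are assumptions picked for the example, not a recommendation from this thread, and whether this helps the benchmark at all is untested here.

```python
import ctypes

# Parameter numbers from glibc's <malloc.h>; they are glibc-specific.
M_TRIM_THRESHOLD = -1
M_MMAP_THRESHOLD = -3


def set_malloc_thresholds(nbytes):
    """Ask glibc to use nbytes as both the mmap and the trim threshold.

    Returns True if both calls succeeded, False if glibc rejected a value,
    and None if no mallopt symbol is available (non-glibc platforms).
    """
    try:
        libc = ctypes.CDLL(None)  # symbols already loaded into this process
        mallopt = libc.mallopt
    except (OSError, AttributeError, TypeError):
        return None
    mallopt.argtypes = [ctypes.c_int, ctypes.c_int]
    mallopt.restype = ctypes.c_int  # 1 on success, 0 on error
    ok_mmap = mallopt(M_MMAP_THRESHOLD, nbytes)
    ok_trim = mallopt(M_TRIM_THRESHOLD, nbytes)
    return bool(ok_mmap and ok_trim)


# Hypothetical setting: keep allocations below 8 MiB on the heap, so that
# a ~4 MB array would no longer be served by a fresh mmap each time.
status = set_malloc_thresholds(8 * 1024 * 1024)
print(f"mallopt status: {status}")
```

Such a call would presumably have to happen early, before the arrays of interest are allocated, since it only affects later requests. Per the manpage, setting these parameters explicitly also disables glibc's dynamic adjustment of the mmap threshold.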
All in all, this is a testament to how much memory handling can affect performance in modern computers. Perhaps it is time for testing different memory allocation strategies in NumPy and coming up with suggestions for users.
You are probably aware, but Matti and Elias have now added the ability to customize array data allocation in NumPy, so it should be straightforward to write a small package/module that tweaks the allocation strategy here.
Cheers,
Sebastian
Francesc
On Wed, Jan 19, 2022 at 9:06 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Wed, 2022-01-19 at 11:49 +0100, Francesc Alted wrote:
On Wed, Jan 19, 2022 at 7:33 AM Stefan van der Walt <stefanv@berkeley.edu> wrote:
On Tue, Jan 18, 2022, at 21:55, Warren Weckesser wrote:
> expr = 'z.real**2 + z.imag**2'
>
> z = generate_sample(n, rng)
🤔 If I duplicate the `z = ...` line, I get the fast result throughout. If, however, I use `generate_sample(1, rng)` (or any other value than `n`), it does not improve matters.
Could this be a memory caching issue?
Yes, it is a caching issue for sure. We have seen similar random fluctuations before. You can prove that it is a cache page-fault issue by running it with `perf stat`. I did this twice, once with the second loop removed (page-faults only):
28333629 page-faults # 936.234 K/sec
28362718 page-faults # 1.147 M/sec
The number of page faults is low. Running only the second one (or running the first one only once, rather), gave me:
15024 page-faults # 1.837 K/sec
So that is the *reason*. I had tried before to figure out why the page faults differ so much, or whether we can do something about it. But I never had any reasonable lead on it.
In general, these fluctuations are pretty random, in the sense that unrelated code changes and recompilation can easily swap the behaviour. Andras, for instance, noted that he sees the opposite.
I would love to have an idea if there is a way to figure out why the page-faults are so imbalanced between the two.
(I have not looked at CPU cache misses this time, but since page-faults happen, I assume that should not matter?)
Cheers,
Sebastian
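For anyone who wants to count page faults from inside the process rather than with `perf`, a small helper along these lines could be wrapped around each timing loop. It is a sketch using only the standard library: `count_minor_faults` and `churn` are hypothetical names, and `churn` is just a stand-in workload (allocate, touch, drop a 4 MB buffer), not the NumPy expression from the thread.

```python
import resource


def count_minor_faults(func, *args, **kwargs):
    """Call func and return (its result, minor page faults during the call)."""
    before = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
    result = func(*args, **kwargs)
    after = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
    return result, after - before


def churn(size, reps, page_size=4096):
    """Stand-in workload: allocate a buffer, touch every page, drop it."""
    total = 0
    for _ in range(reps):
        buf = bytearray(size)  # fresh large allocation on each iteration
        for offset in range(0, size, page_size):
            buf[offset] = 1
        total += len(buf)
    return total


total_bytes, faults = count_minor_faults(churn, 4_000_000, 20)
print(f"{total_bytes} bytes allocated in total, {faults} minor page faults")
```

With the benchmark script, the same helper could wrap the `timeit.timeit(...)` call of each loop, which would give one fault count per timing and make the imbalance between the two loops visible without leaving Python.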
I can also reproduce that, but only on my Linux boxes. My MacMini does not notice the difference.
Interestingly enough, you don't even need an additional call to `generate_sample(n, rng)`. If one uses `z = np.empty(...)` and then does an assignment, like:
z = np.empty(n, dtype=np.complex128)
z[:] = generate_sample(n, rng)
then everything runs at the same speed:
numpy version 1.20.3
142.3667 microseconds
142.3717 microseconds
142.3781 microseconds

142.7593 microseconds
142.3579 microseconds
142.3231 microseconds
As another data point, by doing the same operation but using numexpr I am not seeing any difference either, not even on Linux:
numpy version 1.20.3
numexpr version 2.8.1

95.6513 microseconds
88.1804 microseconds
97.1322 microseconds

105.0833 microseconds
100.5555 microseconds
100.5654 microseconds
[if anything, it is a bit the other way around: the second iteration seems a hair faster] See the numexpr script below.
I am totally puzzled here.
"""
import timeit
import numpy as np
import numexpr as ne


def generate_sample(n, rng):
    return rng.normal(scale=1000, size=2*n).view(np.complex128)


print(f'numpy version {np.__version__}')
print(f'numexpr version {ne.__version__}')
print()

rng = np.random.default_rng()
n = 250000
timeit_reps = 10000

expr = 'ne.evaluate("zreal**2 + zimag**2")'

z = generate_sample(n, rng)
zreal = z.real
zimag = z.imag
for _ in range(3):
    t = timeit.timeit(expr, globals=globals(), number=timeit_reps)
    print(f"{1e6*t/timeit_reps:9.4f} microseconds")
print()

z = generate_sample(n, rng)
zreal = z.real
zimag = z.imag
for _ in range(3):
    t = timeit.timeit(expr, globals=globals(), number=timeit_reps)
    print(f"{1e6*t/timeit_reps:9.4f} microseconds")
print()
"""
_______________________________________________ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-leave@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: sebastian@sipsolutions.net
-- Francesc Alted