[Numpy-discussion] The mu.py script will keep running and never end.

Evgeni Burovski evgeny.burovskiy at gmail.com
Sun Oct 11 02:55:26 EDT 2020


The script seems to be computing the particle numbers for an array of
chemical potentials.

Two ways of speeding it up, both likely simpler than using dask:

First: use numpy

1. Move the construction of mu_all out of the loop (np.linspace)
2. Arrange the integrands into a 2d array, one integrand per chemical potential
3. Call np.trapz once, along the axis which corresponds to a single integrand array
(or avoid the overhead of trapz by just implementing the trapezoid formula
manually)
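
A minimal sketch of the three numpy steps above, not the actual mu.py. The
names particle_numbers, energy, dos, mu_all and kT, and all the numbers in the
sample data, are assumptions made up for illustration. The trapezoid formula
is written out by hand, so no call to np.trapz is needed.

```python
import numpy as np


def fermi(E, mu, kT):
    # Fermi-Dirac occupation; plain arithmetic, so it broadcasts over arrays
    # without np.vectorize.
    return 1.0 / (np.exp((E - mu) / kT) + 1.0)


def particle_numbers(energy, dos, mu_all, kT):
    # 2d array of integrands: one row per mu, one column per energy point.
    integrands = dos[np.newaxis, :] * fermi(
        energy[np.newaxis, :], mu_all[:, np.newaxis], kT
    )
    # Trapezoid formula along the energy axis, written out manually.
    dE = np.diff(energy)
    return 0.5 * np.sum((integrands[:, 1:] + integrands[:, :-1]) * dE, axis=1)


# Made-up sample data (assumption: the real script reads energy and DOS
# from its own input).
energy = np.linspace(-11.0, 20.0, 2001)
dos = np.exp(-0.5 * energy**2)
mu_all = np.linspace(-5.0, 5.0, 50)  # built once, outside any loop
kT = 0.5

n = particle_numbers(energy, dos, mu_all, kT)
print(n.shape)
```

The 2d array holds len(mu_all) * len(energy) floats, so for large inputs it
may be necessary to process mu_all in chunks to keep memory use reasonable.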

Second: use Cython

Move the loop into Cython.




Sun, 11 Oct 2020, 9:32 Hongyi Zhao <hongyi.zhao at gmail.com>:

> On Sun, Oct 11, 2020 at 2:02 PM Andrea Gavana <andrea.gavana at gmail.com>
> wrote:
> >
> >
> >
> > On Sun, 11 Oct 2020 at 07.52, Hongyi Zhao <hongyi.zhao at gmail.com> wrote:
> >>
> >> On Sun, Oct 11, 2020 at 1:33 PM Andrea Gavana <andrea.gavana at gmail.com>
> wrote:
> >> >
> >> >
> >> >
> >> > On Sun, 11 Oct 2020 at 07.14, Andrea Gavana <andrea.gavana at gmail.com>
> wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> On Sun, 11 Oct 2020 at 00.27, Hongyi Zhao <hongyi.zhao at gmail.com>
> wrote:
> >> >>>
> >> >>> On Sun, Oct 11, 2020 at 1:48 AM Robert Kern <robert.kern at gmail.com>
> wrote:
> >> >>> >
> >> >>> > You don't need to use vectorize() on fermi(). fermi() will work
> just fine on arrays and should be much faster.
> >> >>>
> >> >>> Yes, it really does the trick. See the following for the benchmark
> >> >>> based on your suggestion:
> >> >>>
> >> >>> $ time python mu.py
> >> >>> [-10.999 -10.999 -10.999 ...  20.     20.     20.   ] [4.973e-84
> >> >>> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
> >> >>>
> >> >>> real    0m41.056s
> >> >>> user    0m43.970s
> >> >>> sys    0m3.813s
> >> >>>
> >> >>>
> >> >>> But are there any ways to further improve/increase efficiency?
> >> >>
> >> >>
> >> >>
> >> >> I believe it will get a bit better if you don’t column_stack an
> array 6000 times - maybe pre-allocate your output first?
> >> >>
> >> >> Andrea.
> >> >
> >> >
> >> >
> >> > I’m sorry, scratch that: I saw a ghost white space in front of
> >> > your column_stack call, which made me think you were stacking your results
> >> > very many times, which is not the case.
> >>
> >> Still not so clear on your solutions for this problem. Could you
> >> please post here the corresponding snippet of your enhancement?
> >
> >
> > I have no solution, I originally thought you were calling “column_stack”
> 6000 times in the loop, but that is not the case, I was mistaken. My
> apologies for that.
> >
> > The timing of your approach is highly dependent on the size of your
> > “energy” and “DOS” arrays -
>
> The sizes of the “energy” and “DOS” arrays are problem-related and
> shouldn't be reduced arbitrarily.
>
> > not to mention calling trapz 6000 times in a loop.
>
> I'm currently thinking about parallelizing the execution of the for
> loop, say, with joblib <https://github.com/joblib/joblib>, but I still
> haven't figured out the corresponding code. If you have some
> experience with this type of solution, could you please give me some
> more hints?
>
> >  Maybe there’s a better way to do it with another approach, but at the
> moment I can’t think of one...
> >
> >>
> >>
> >> Regards,
> >> HY
> >> >
> >> >>
> >> >>
> >> >>>
> >> >>>
> >> >>> Regards,
> >> >>> HY
> >> >>>
> >> >>> >
> >> >>> > On Sat, Oct 10, 2020, 8:23 AM Hongyi Zhao <hongyi.zhao at gmail.com>
> wrote:
> >> >>> >>
> >> >>> >> Hi,
> >> >>> >>
> >> >>> >> My environment is Ubuntu 20.04 and python 3.8.3 managed by
> pyenv. I
> >> >>> >> try to run the script
> >> >>> >> <
> https://notebook.rcc.uchicago.edu/files/acs.chemmater.9b05047/Data/bulk/dft/mu.py
> >,
> >> >>> >> but it will keep running and never end. When I use 'Ctrl + c' to
> >> >>> >> terminate it, it will give the following output:
> >> >>> >>
> >> >>> >> $ python mu.py
> >> >>> >> [-10.999 -10.999 -10.999 ...  20.     20.     20.   ] [4.973e-84
> >> >>> >> 4.973e-84 4.973e-84 ... 4.973e-84 4.973e-84 4.973e-84]
> >> >>> >>
> >> >>> >> I have to terminate it and obtained the following information:
> >> >>> >>
> >> >>> >> ^CTraceback (most recent call last):
> >> >>> >>   File "mu.py", line 38, in <module>
> >> >>> >>     integrand=DOS*fermi_array(energy,mu,kT)
> >> >>> >>   File
> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py",
> >> >>> >> line 2108, in __call__
> >> >>> >>     return self._vectorize_call(func=func, args=vargs)
> >> >>> >>   File
> "/home/werner/.pyenv/versions/datasci/lib/python3.8/site-packages/numpy/lib/function_base.py",
> >> >>> >> line 2192, in _vectorize_call
> >> >>> >>     outputs = ufunc(*inputs)
> >> >>> >>   File "mu.py", line 8, in fermi
> >> >>> >>     return 1./(exp((E-mu)/kT)+1)
> >> >>> >> KeyboardInterrupt
> >> >>> >>
> >> >>> >>
> >> >>> >> Any help or hints for this problem will be highly appreciated.
> >> >>> >>
> >> >>> >> Regards,
> >> >>> >> --
> >> >>> >> Hongyi Zhao <hongyi.zhao at gmail.com>
> >> >>> >> _______________________________________________
> >> >>> >> NumPy-Discussion mailing list
> >> >>> >> NumPy-Discussion at python.org
> >> >>> >> https://mail.python.org/mailman/listinfo/numpy-discussion
> >> >>> >
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Hongyi Zhao <hongyi.zhao at gmail.com>
> >> >
> >>
> >>
> >>
> >> --
> >> Hongyi Zhao <hongyi.zhao at gmail.com>
> >
>
>
>
> --
> Hongyi Zhao <hongyi.zhao at gmail.com>
>