The same idea was published two years ago:

http://hiperfit.dk/pdf/Doubling.pdf


On Mon, Oct 3, 2016 at 8:53 AM, <numpy-discussion-request@scipy.org> wrote:
Send NumPy-Discussion mailing list submissions to
        numpy-discussion@scipy.org

To subscribe or unsubscribe via the World Wide Web, visit
        https://mail.scipy.org/mailman/listinfo/numpy-discussion
or, via email, send a message with subject or body 'help' to
        numpy-discussion-request@scipy.org

You can reach the person managing the list at
        numpy-discussion-owner@scipy.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of NumPy-Discussion digest..."


Today's Topics:

   1. Re: automatically avoiding temporary arrays (Benjamin Root)
   2. Re: Dropping sourceforge for releases. (David Cournapeau)
   3. Re: Dropping sourceforge for releases. (Vincent Davis)


----------------------------------------------------------------------

Message: 1
Date: Sun, 2 Oct 2016 09:10:45 -0400
From: Benjamin Root <ben.v.root@gmail.com>
To: Discussion of Numerical Python <numpy-discussion@scipy.org>
Subject: Re: [Numpy-discussion] automatically avoiding temporary
        arrays
Message-ID:
        <CANNq6Fn9eGOnrSGz8Duo8J9oTe3N6xTCSqDDP7nRyNqjFKpAjQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Just thinking aloud, an idea I had recently takes a different approach. The
problem with temporaries isn't so much that they exist, but rather that they
keep getting malloc'ed and freed. What if numpy kept a small LRU cache of
weakref'ed temporaries? Whenever a new numpy array is requested, numpy could
check whether there is already one of matching size in its cache and reuse
it. If you think about it, expressions that result in many temporaries would
quite likely have many of them be the same size in memory.

Don't know how feasible it would be to implement though.
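
Something along these lines is what I have in mind -- purely a toy sketch to
show the shape of the idea, not anything numpy actually does. The names are
made up, and it ignores thread safety and the question of what keeps the
weakly referenced buffers alive long enough to be reused:

import weakref
from collections import OrderedDict

import numpy as np


class TempCache:
    """Toy LRU cache of recently seen temporaries, keyed by (shape, dtype)."""

    def __init__(self, maxsize=8):
        self.maxsize = maxsize
        self._cache = OrderedDict()

    def put(self, arr):
        key = (arr.shape, arr.dtype)
        # Hold only a weak reference so the cache never keeps memory alive.
        self._cache[key] = weakref.ref(arr)
        self._cache.move_to_end(key)
        while len(self._cache) > self.maxsize:
            self._cache.popitem(last=False)   # evict the least recently used

    def get(self, shape, dtype):
        ref = self._cache.pop((shape, dtype), None)
        arr = ref() if ref is not None else None
        # On a miss (or a dead weak reference) fall back to a fresh array.
        return arr if arr is not None else np.empty(shape, dtype=dtype)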

Cheers!
Ben Root


On Sat, Oct 1, 2016 at 2:38 PM, Chris Barker <chris.barker@noaa.gov> wrote:

> Julian,
>
> This is really, really cool!
>
> I have been wanting something like this for years (over a decade? wow!),
> but always thought it would require hacking the interpreter to intercept
> operations. This is a really inspired idea, and could buy numpy a lot of
> performance.
>
> I'm afraid I can't say much about the implementation details -- but great
> work!
>
> -Chris
>
>
>
>
> On Fri, Sep 30, 2016 at 2:50 PM, Julian Taylor <
> jtaylor.debian@googlemail.com> wrote:
>
>> On 30.09.2016 23:09, josef.pktd@gmail.com wrote:
>> > On Fri, Sep 30, 2016 at 9:38 AM, Julian Taylor
>> > <jtaylor.debian@googlemail.com> wrote:
>> >> hi,
>> >> Temporary arrays generated in expressions are expensive, as they imply
>> >> extra memory bandwidth, which is the bottleneck in most numpy
>> >> operations. For example:
>> >>
>> >> r = a + b + c
>> >>
>> >> creates the b + c temporary and then adds a to it.
>> >> This can be rewritten to be more efficient using inplace operations:
>> >>
>> >> r = b + c
>> >> r += a
>> >
>> > general question (I wouldn't understand the details even if I looked.)
>> >
>> > how is this affected by broadcasting and type promotion?
>> >
>> > One of the main reasons that I don't like to use inplace operations in
>> > general is that I'm often not sure when type promotion occurs and when
>> > arrays expand during broadcasting.
>> >
>> > For example, b + c is 1-D, a is 2-D, and r has the broadcasted shape.
>> > Another case where I switch away from inplace operations is when b + c
>> > is int or bool and a is float. Thankfully, we get error messages for
>> > casting now.
>>
>> the temporary is only avoided when the casting follows the safe rule, so
>> it should be the same as what you get without inplace operations. E.g. a
>> float32 temporary + float64 will not be converted to the unsafe float32
>> += float64, which a normal inplace operation would allow. But a float64
>> temporary + float32 is transformed.
>>
>> Currently the only broadcasting that will be transformed is temporary +
>> scalar value; otherwise it will only work on matching array sizes. There
>> is not really anything that prevents full broadcasting, but it's not
>> implemented yet in the PR.
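>>
>> (Purely for illustration, not code from the PR: the safe rule here is
>> presumably the same one numpy exposes via np.can_cast.)
>>
>> import numpy as np
>>
>> # float32 casts safely to float64, so a float64 temporary can absorb a
>> # float32 operand and be reused in place:
>> np.can_cast(np.float32, np.float64, casting='safe')   # True
>> # float64 does not cast safely to float32, so a float32 temporary is
>> # left alone:
>> np.can_cast(np.float64, np.float32, casting='safe')   # False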
>>
>> >
>> >>
>> >> This saves some memory bandwidth and can speed up the operation by 50%
>> >> for very large arrays, or even more if the inplace operation allows it
>> >> to be completed entirely in the CPU cache.
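>> >>
>> >> (A rough way to see the effect yourself; just a sketch, and the exact
>> >> numbers depend heavily on array size and hardware:)
>> >>
>> >> import timeit
>> >> import numpy as np
>> >>
>> >> a, b, c = (np.random.rand(10 ** 7) for _ in range(3))
>> >>
>> >> def with_temporary():
>> >>     return a + b + c          # allocates an intermediate array
>> >>
>> >> def with_inplace():
>> >>     r = b + c                 # one allocation
>> >>     r += a                    # reused as the destination
>> >>     return r
>> >>
>> >> print(timeit.timeit(with_temporary, number=50))
>> >> print(timeit.timeit(with_inplace, number=50))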
>> >
>> > I didn't realize the difference can be so large. That would make
>> > streamlining some code worth the effort.
>> >
>> > Josef
>> >
>> >
>> >>
>> >> The problem is that inplace operations are a lot less readable, so they
>> >> are often only used in well-optimized code. But due to Python's
>> >> refcounting semantics we can actually do some inplace conversions
>> >> transparently.
>> >> If an operand in Python has a reference count of one, it must be a
>> >> temporary, so we can use it as the destination array. CPython itself
>> >> does this optimization for string concatenations.
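>> >>
>> >> (A tiny illustration of the refcount idea from Python; the real check
>> >> in the PR happens at the C level on the operand's refcount:)
>> >>
>> >> import sys
>> >> import numpy as np
>> >>
>> >> a = np.ones(3)
>> >> # A bound name plus getrefcount()'s own argument: prints at least 2.
>> >> print(sys.getrefcount(a))
>> >> # An expression like (a + a) produces an array nothing else references;
>> >> # at the C level its refcount is 1, which is what makes it safe to
>> >> # reuse as the output buffer.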
>> >>
>> >> In numpy we have the issue that we can be called from the C-API
>> >> directly, where the reference count may be one for other reasons.
>> >> To solve this we can check the backtrace up to the Python frame
>> >> evaluation function. If there are only numpy and Python functions in
>> >> between that and our entry point, we should be able to elide the
>> >> temporary.
>> >>
>> >> This PR implements this:
>> >> https://github.com/numpy/numpy/pull/7997
>> >>
>> >> It currently only supports Linux with glibc (which has reliable
>> >> backtraces via unwinding) and maybe macOS, depending on how good its
>> >> backtrace is. On Windows the backtrace APIs are different and I don't
>> >> know them, but in theory it could also be done there.
>> >>
>> >> A problem is that checking the backtrace is quite expensive, so it
>> >> should only be enabled when the involved arrays are large enough for it
>> >> to be worthwhile. In my testing this seems to be around 180-300KiB sized
>> >> arrays, basically where they start spilling out of the CPU L2 cache.
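>> >>
>> >> (Back-of-the-envelope, just to put those numbers in context; the cache
>> >> sizes are only typical figures, not measurements:)
>> >>
>> >> # 180-300 KiB of float64 is roughly 23,000-38,000 elements, about
>> >> # where one operand stops fitting in a typical per-core L2 cache
>> >> # (on the order of 256 KiB - 1 MiB on current CPUs).
>> >> 180 * 1024 // 8   # 23040
>> >> 300 * 1024 // 8   # 38400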
>> >>
>> >> I made a little crappy benchmark script to test this cutoff in this
>> >> branch:
>> >> https://github.com/juliantaylor/numpy/tree/elide-bench
>> >>
>> >> If you are interested you can run it with:
>> >> python setup.py build_ext -j 4 --inplace
>> >> ipython --profile=null check.ipy
>> >>
>> >> At the end it will plot the ratio between elided and non-elided
>> >> runtime. It should get larger than one around 180KiB on most CPUs.
>> >>
>> >> If no one points out some flaw in the approach, I'm hoping to get this
>> >> into the next numpy version.
>> >>
>> >> cheers,
>> >> Julian
>> >>
>> >>
>>
>
>
> --
>
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker@noaa.gov
>
>

------------------------------

Message: 2
Date: Sun, 2 Oct 2016 22:26:28 +0100
From: David Cournapeau <cournape@gmail.com>
To: Discussion of Numerical Python <numpy-discussion@scipy.org>
Subject: Re: [Numpy-discussion] Dropping sourceforge for releases.
Message-ID:
        <CAGY4rcUzVtnbhbJ542Vjrx3T8-ffO-jhiw06YrMqhk3ezae6XA@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

+1 from me.

If we really need some distribution on top of GitHub/PyPI, note that
Bintray (https://bintray.com/) is free for OSS projects and is a much
better experience than SourceForge.

David

On Sun, Oct 2, 2016 at 12:02 AM, Charles R Harris <charlesr.harris@gmail.com
> wrote:

> Hi All,
>
> Ralf has suggested dropping SourceForge as a NumPy release site. There was
> discussion of doing that some time back, but we have not yet done it. Now
> that we put wheels up on PyPI for all supported architectures, SourceForge
> is not needed. I note that there are still some 15,000 downloads a week
> from the site, so it is still used.
>
> Thoughts?
>
> Chuck
>
>

------------------------------

Message: 3
Date: Sun, 2 Oct 2016 17:53:32 -0600
From: Vincent Davis <vincent@vincentdavis.net>
To: Discussion of Numerical Python <numpy-discussion@scipy.org>
Subject: Re: [Numpy-discussion] Dropping sourceforge for releases.
Message-ID:
        <CALyJZZX=KfKrsOh2QHZahwxW_p0sYvFCPpBVFXU0RNt7s8J4XQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

+1. I am very skeptical of anything on SourceForge; it negatively impacts
my opinion of any project that requires me to download from it.

On Saturday, October 1, 2016, Charles R Harris <charlesr.harris@gmail.com>
wrote:

> Hi All,
>
> Ralf has suggested dropping SourceForge as a NumPy release site. There was
> discussion of doing that some time back, but we have not yet done it. Now
> that we put wheels up on PyPI for all supported architectures, SourceForge
> is not needed. I note that there are still some 15,000 downloads a week
> from the site, so it is still used.
>
> Thoughts?
>
> Chuck
>


--
Sent from mobile app.
Vincent Davis
720-301-3003

------------------------------

Subject: Digest Footer

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


------------------------------

End of NumPy-Discussion Digest, Vol 121, Issue 3
************************************************