[Numpy-discussion] NumPy re-factoring project

Sat Jun 12 16:01:30 EDT 2010

2010/6/12 Charles R Harris <charlesr.harris at gmail.com>

>
> This is more the way I see things, except I would divide the bottom layer
> into two parts, views and memory. The memory can come from many places --
> memmaps, user supplied buffers, etc. -- but we should provide a simple
> reference counted allocator for the default. The views correspond more to
> PEP 3118 and simply provide data types, dimensions, and strides, much as
> arrays do now. However, I would confine the data types to those available in
> C with a bit extra information as to precision, because.  Object arrays
> would be a special case of pointer arrays (void pointer arrays?) and
> structured arrays/Unicode might be a special case of char arrays. The more
> complicated dtypes would then be built on top of those. Some things just
> won't be portable, pointers in particular, but such is life.
>
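
The view/memory split described above can be sketched with Python's own
PEP 3118 object, memoryview. This is only an illustration under stated
assumptions, not a proposal for the actual NumPy layout: array.array plays
the role of a user-supplied buffer, and the names buf, flat and view are
made up for the example.

```python
import array

# The "memory" layer: a user-supplied buffer holding six C doubles.
buf = array.array('d', [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])

# The "view" layer: no copy is made, only a data type, dimensions and
# strides are attached to the same memory.  memoryview.cast() needs a
# byte format on one side, hence the intermediate cast to 'B'.
flat = memoryview(buf)
view = flat.cast('B').cast('d', shape=[2, 3])

print(view.format)   # 'd' (C double)
print(view.shape)    # (2, 3)
print(view.strides)  # (24, 8) where a C double is 8 bytes

# Writing to the underlying memory is visible through the view.
buf[5] = 42.0
print(view.tolist())  # [[0.0, 1.0, 2.0], [3.0, 4.0, 42.0]]
```

The view only carries metadata, so the same idea would apply to memory
coming from a memmap or any other buffer provider.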

Well, if the "data block" of ndarray is going to be refactored that much, it
might be interesting for it to support transparent compression.
As I see things, it is clear that the gap between CPU power and memory
bandwidth is widening and that this trend will continue for a long time.
This means that being able to deal with compressed data transparently would
not only reduce memory consumption, but also improve memory access.

Getting improved memory access by using compression may sound a bit crazy,
but it is not: the idea is that moving compressed data from memory and
decompressing it in the CPU cache can take less time than moving the
uncompressed data.  For example, the Blosc [1] project (which is undergoing
the last testing steps before becoming stable) is showing signs that, by
using multimedia extensions in CPUs and multi-threading techniques, this is
becoming possible (as shown in [2]).
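
A minimal sketch of what "transparent" could mean for a data block, under
loud assumptions: zlib from the standard library stands in for Blosc, the
CompressedBlock class and its methods are hypothetical, and nothing here
demonstrates a speed gain. It only shows that the caller indexes the block
as usual while the storage stays compressed in chunks.

```python
import array
import zlib


class CompressedBlock:
    """Hypothetical data block that keeps C doubles in compressed chunks."""

    def __init__(self, values, chunk_len=1024):
        self.chunk_len = chunk_len
        self.length = len(values)
        self.chunks = []
        # Compress each fixed-size chunk independently, so that reading one
        # element only requires decompressing the chunk that contains it.
        for start in range(0, len(values), chunk_len):
            raw = array.array('d', values[start:start + chunk_len]).tobytes()
            self.chunks.append(zlib.compress(raw))

    def __len__(self):
        return self.length

    def __getitem__(self, index):
        if not 0 <= index < self.length:
            raise IndexError(index)
        chunk_no, offset = divmod(index, self.chunk_len)
        raw = zlib.decompress(self.chunks[chunk_no])
        # Reinterpret the decompressed bytes as doubles without copying.
        return memoryview(raw).cast('d')[offset]

    def nbytes_compressed(self):
        return sum(len(chunk) for chunk in self.chunks)


# Usage: the caller never sees the compression.
block = CompressedBlock([float(i % 10) for i in range(10000)])
print(block[1234])                          # 4.0
print(block.nbytes_compressed() < 10000 * 8)  # True for this repetitive data
```

Compressing per chunk rather than the whole block is the design choice that
keeps random access affordable; a chunk small enough to fit in cache is
what would make the decompression cost worth paying.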

Of course, this optimization may not make much sense for accelerating
computations in NumPy as long as the temporary variables problem remains
(which I hope can be resolved some day), but still, it could be useful for
improving the performance of other features (like copying, broadcasting or
performing selections more efficiently).

[1] http://blosc.pytables.org/
[2] http://blosc.pytables.org/trac/wiki/SyntheticBenchmarks

Just another wish into the bag ;-)

-- 
Francesc Alted