[Numpy-discussion] copy on demand

Alexander Schmolck a.schmolck at gmx.net
Wed Jun 12 15:51:04 EDT 2002

"Perry Greenfield" <perry at stsci.edu> writes:

> <Rick White writes> :
> > This kind of state information with side effects leads to a system that
> > is hard to develop, hard to debug, and really messes up the behavior of
> > the program (IMHO).  It is *highly* desirable to avoid it if possible.
> > 
> Rick beat me to the punch. The requirement for copy-on-demand
> definitely leads to a far more complex implementation with 
> much more potential for misunderstood memory usage. You could
> do one small thing and suddenly force a spate of copies (perhaps
> cascading). There is no way we would take on a redesign of

Yes, but I would suspect that cases where a little innocuous ``a[0] = 3`` triggers
excessive processing should be rather unusual (Matlab or Octave users will know).
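
To make the concern concrete, here is a minimal sketch of copy-on-demand in plain Python. It is not how Numeric or numarray work; ``CowArray`` and ``_Buffer`` are hypothetical names invented for the illustration, and the owner count is never decremented when an array is garbage collected, which a real implementation would have to handle. Slices share storage until somebody writes, and the first write to shared storage is where the hidden copy happens.

```python
class _Buffer:
    """Storage that remembers how many arrays currently share it."""

    def __init__(self, items):
        self.items = list(items)
        self.owners = 1


class CowArray:
    """Toy 1-D array with copy-on-demand slices (hypothetical, not Numeric)."""

    copies_made = 0  # class-wide counter, only here to make hidden copies visible

    def __init__(self, data):
        self._buf = _Buffer(data)
        self._start, self._stop = 0, len(self._buf.items)

    def __len__(self):
        return self._stop - self._start

    def __getitem__(self, key):
        if isinstance(key, slice):
            start, stop, step = key.indices(len(self))
            if step != 1:
                raise NotImplementedError("sketch handles unit steps only")
            # Slicing is cheap: the new array points at the same buffer.
            piece = CowArray.__new__(CowArray)
            piece._buf = self._buf
            piece._start = self._start + start
            piece._stop = self._start + max(start, stop)
            self._buf.owners += 1
            return piece
        return self._buf.items[self._start + range(len(self))[key]]

    def __setitem__(self, index, value):
        if self._buf.owners > 1:
            # First write to shared storage: detach by copying our own part.
            self._buf.owners -= 1
            self._buf = _Buffer(self._buf.items[self._start:self._stop])
            self._start, self._stop = 0, len(self._buf.items)
            CowArray.copies_made += 1
        self._buf.items[self._start + range(len(self))[index]] = value

    def tolist(self):
        return self._buf.items[self._start:self._stop]


a = CowArray(range(5))
b = a[1:4]   # no copy yet, a and b share one buffer
a[0] = 3     # the innocuous assignment: a silently copies all of its data
```

After the last line ``b`` still holds the old values, and ``CowArray.copies_made`` records the one copy that the assignment forced. The cost is paid by whichever array writes first, which is exactly the state-dependent behaviour being objected to.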

> Numeric with this requirement with the resources we have available.

Fair enough -- if implementing copy-on-demand is too much work then we'll have
to live without it (especially if view-slicing doesn't stand in the way of a
future inclusion into the python core).

I guess the best reason to bite the bullet and carry around state information
would be if there were significant other cases where one also would want to
optimize operations under the hood. If there isn't much else in this direction
then the effort involved might not be justified. One thing that bugs me in
Numeric (and that might already have been solved in numarray) is that
e.g. ``ravel`` (and I think also ``transpose``) creates unnecessary copies,
whereas ``.flat`` doesn't, but won't work in all cases (viz. when the array is
non-contiguous), so I can either have ugly or inefficient code.
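
A rough sketch of the behaviour wished for here, written as a toy in plain Python rather than against Numeric: ``Strided2D`` and its methods are hypothetical names, and the contiguity test is deliberately conservative. The idea is that ``transpose`` only swaps shape and strides, and ``ravel`` hands back the existing buffer when the layout allows it and copies only when it has to.

```python
class Strided2D:
    """Toy 2-D array over a flat list (hypothetical, not the Numeric API)."""

    def __init__(self, buf, shape, strides=None):
        self.buf = buf
        self.shape = shape
        # Default to row-major layout: one row is shape[1] elements long.
        self.strides = strides if strides is not None else (shape[1], 1)

    def transpose(self):
        # Same buffer, swapped shape and strides: no data is moved.
        return Strided2D(self.buf, self.shape[::-1], self.strides[::-1])

    def is_contiguous(self):
        # Conservative check: may say False for some layouts that are
        # in fact contiguous, which only costs an extra copy.
        return self.strides == (self.shape[1], 1)

    def ravel(self):
        """Return (flat_list, copied); copy only for non-contiguous layouts."""
        if self.is_contiguous():
            return self.buf, False
        rows, cols = self.shape
        s0, s1 = self.strides
        flat = [self.buf[i * s0 + j * s1]
                for i in range(rows) for j in range(cols)]
        return flat, True


a = Strided2D(list(range(6)), (2, 3))
flat_a, copied_a = a.ravel()    # contiguous: the buffer itself comes back
t = a.transpose()               # shares a.buf
flat_t, copied_t = t.ravel()    # non-contiguous: this one has to copy
```

With something along these lines the caller writes ``ravel`` everywhere and the library decides whether a copy is needed, instead of the caller choosing between ``.flat`` and ``ravel`` by hand.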

> > This is not to deny that copy-on-demand (with explicit views available
> > on request) would have some desirable advantages for the behavior of
> > the system.  But we've worried these issues to death, and in the end
> > were convinced that slices == views provided the best compromise
> > between the desired behavior and a clean implementation.
> > 
> Rick's explanation doesn't really address the other position which
> is slices should force immediate copies. This isn't a difficult
> implementation issue by itself. But it does raise some related
> implementation questions. Supposing one does feel that views are
> a feature one wants even though they are not the default, it turns
> out that it isn't all that simple to obtain views without sacrificing
> ordinary slicing syntax to obtain a view. It is simple to obtain
> copies of view slices though.

I'm not sure I understand the above.  What is the problem with ``a.view[1:3]``?
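
For illustration, a minimal sketch of what such a spelling could look like, in plain Python. ``Arr`` and ``_ViewIndexer`` are hypothetical names, and this is an assumption about one possible design, not an existing Numeric or numarray interface. Ordinary slicing copies, while the ``view`` attribute returns a small helper object whose indexing returns an aliased slice.

```python
class Arr:
    """Toy 1-D array: a[i:j] copies, a.view[i:j] aliases (hypothetical)."""

    def __init__(self, items, _start=0, _stop=None, _share=False):
        # _share=True means "reuse this list as storage" (used for views only).
        self._items = items if _share else list(items)
        self._start = _start
        self._stop = len(self._items) if _stop is None else _stop

    def __len__(self):
        return self._stop - self._start

    def _bounds(self, key):
        if not isinstance(key, slice):
            raise TypeError("expected a slice")
        start, stop, step = key.indices(len(self))
        if step != 1:
            raise NotImplementedError("sketch handles unit steps only")
        return self._start + start, self._start + max(start, stop)

    def __getitem__(self, key):
        if isinstance(key, slice):
            lo, hi = self._bounds(key)
            return Arr(self._items[lo:hi])      # ordinary slice: a copy
        return self._items[self._start + range(len(self))[key]]

    def __setitem__(self, index, value):
        self._items[self._start + range(len(self))[index]] = value

    @property
    def view(self):
        return _ViewIndexer(self)

    def tolist(self):
        return self._items[self._start:self._stop]


class _ViewIndexer:
    """Helper returned by Arr.view; slicing it yields an aliased Arr."""

    def __init__(self, arr):
        self._arr = arr

    def __getitem__(self, key):
        lo, hi = self._arr._bounds(key)
        return Arr(self._arr._items, lo, hi, _share=True)


a = Arr([0, 1, 2, 3, 4])
c = a[1:3]
c[0] = 10        # c is a copy, a is untouched
v = a.view[1:3]
v[0] = 99        # v aliases a, so a[1] changes too
```

Slicing syntax is kept in both cases; only the attribute access distinguishes the aliased form from the copying one.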

> Slicing views may not be important to everyone. It is important
> to us (and others) and we do see a number of situations where
> forcing copies to operate on array subsets would be a serious
> performance problem. We did discuss this issue with Guido and

Sure, no one denies that even with copy-on-demand, (explicitly) aliased
views would still be useful.

> he did not indicate that having different behavior on slicing
> with arrays would be a show stopper for acceptance into the
> Standard Library. We are also aware that there is no great
> consensus on this issue (even internally at STScI :-).

Yep, I just saw Paul Barrett's post :)

> Perry Greenfield

Alexander Schmolck     Postgraduate Research Student
                       Department of Computer Science
                       University of Exeter
A.Schmolck at gmx.net     http://www.dcs.ex.ac.uk/people/aschmolc/
