[Tutor] Python oddity

jim stockford jim at well.com
Thu Feb 28 20:07:29 CET 2008


i'd like to know, too. my take so far is

* don't make any copies if you can avoid doing so,
* make shallow copies if need be,
* make deep copies only if you can't think of any
other way to accomplish what you're up to.

what's the truth? I'm hoping there's an OTW answer
(OTW ~> "One True Way").


On Feb 28, 2008, at 10:03 AM, Keith Suda-Cederquist wrote:

> Hey all, thanks for all the responses.  I think I get the gist of just 
> about everything.  One thing to mention though is that I'm mainly 
> working with scipy so most of what I'm working with is 
> numpy.ndarrays.  I think most of what was mentioned with respect to 
> lists applies equally to numpy.ndarrays.  The id() 
> function/sub-routine is something that'll be handy in the future as is 
> knowing the difference between 'is' and '=='.
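
A small sketch of the 'is' / '==' / id() distinction mentioned above, using
plain lists. The variable names are only illustrative. One caveat for
ndarrays: '==' there compares element by element and returns an array, so
'is' remains the check for "same object".

```python
# Identity vs. equality for plain lists.
a = [1, 2, 3]
b = a          # a second name for the same list object
c = list(a)    # a new list with equal contents

same_object = b is a          # identity: both names refer to one object
equal_copy = c == a           # equality: contents compare equal
distinct_copy = c is not a    # ...but c is a different object
ids_match = id(a) == id(b)    # id() agrees with 'is' for live objects

b.append(4)                   # visible through 'a' as well; 'c' is unchanged
print(a, c)
```
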
>
> I have a few follow-up questions:
>
> 1)  A lot of what I'm doing is image processing that involves slicing 
> arrays and performing standard arithmetic and logic on them.  So I think 
> I've managed to avoid this trap most of the time since at least shallow 
> copies are made. Are there any clear rules for when a shallow copy is 
> made versus a deep copy?
> 2)  Luke gave a tip on using the append rather than the += when 
> creating strings in a for-loop to be more memory-efficient.  Does this 
> only apply to immutable objects such as strings??  I'm creating a lot 
> of arrays by doing something like this:
>
> A=scipy.zeros((100,100))  #initialize array (shape must be a tuple)
> for ii in xrange(0,shape(A)[0]):
>     for jj in xrange(0,shape(A)[1]):
>        A[ii,jj]=3*ii+jj+sqrt(ii*jj)
>
> I would think this is okay since the array is mutable, am I correct or 
> is there a better technique?     
> 3)  Now you all have me a little worried that I'm not being very 
> memory efficient with my programming style.  A lot of the time, I'll 
> perform a lot of operations in sequence on an array.  For example:
>
> #B is a 1000x1000 pixel image that I've imported as an array.  We'll call this B1
> B=(B-B.min())/(B.max()-B.min())  #image array scaled from 0 to 1.  We'll call this B2
> B=B[B>0.5]  #image with values less than 0.5 masked (set to zero).  We'll call this B3
>
> So in this example, is memory allocated for B1, B2 and B3?  Does every 
> operation force a new copy?   Will this memory ever be 'recovered' by 
> the python memory manager (garbage collector?)?
> 4)  The typical knee-jerk reaction to this 'oddity' is "what a pain, 
> how stupid" etc, but I'm sure there is a good explanation.  Can 
> someone explain why python acts this way?  faster processing?  
> preserve memory? etc?
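
A rough sketch touching questions 1 to 3, not a definitive answer. It
assumes plain numpy imported as np (the thread uses the scipy names) and a
small 4x4 array standing in for the image. As far as numpy's documented
behaviour goes: basic slicing returns a view that shares memory with the
original, while arithmetic and boolean-mask indexing allocate new arrays.
Rebinding B leaves the previous array without references, so CPython can
release it. Note that B[B>0.5] returns a 1-D array of the selected values;
np.where is one way to keep the image shape and zero the rest.

```python
import numpy as np

# Question 2: build the array without an explicit double loop.
# np.fromfunction calls the function once with whole index arrays.
A = np.fromfunction(lambda ii, jj: 3 * ii + jj + np.sqrt(ii * jj), (100, 100))

# Question 1: basic slicing gives a view, not a copy.
row = A[0, :]                                # view onto A's memory
row_shares = np.shares_memory(row, A)        # True
row_copy = A[0, :].copy()                    # explicit, independent copy
copy_shares = np.shares_memory(row_copy, A)  # False

# Question 3: each arithmetic step allocates a new array; rebinding B
# leaves the previous array unreferenced, so it can be freed.
B = np.arange(16, dtype=float).reshape(4, 4)  # stand-in for the image
B = (B - B.min()) / (B.max() - B.min())       # scaled to [0, 1]

selected = B[B > 0.5]                # 1-D array of the values above 0.5
masked = np.where(B > 0.5, B, 0.0)   # same shape, other pixels set to zero
```

If an intermediate result has to stay independent of later in-place
changes, calling .copy() on the slice is the explicit way to get that.
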
>
> Thanks for all your help.
>
> -Keith
> ----- Original Message ----
> From: Luke Paireepinart <rabidpoobear at gmail.com>
> To: Brett Wilkins <lupin at orcon.net.nz>; Tutor <tutor at python.org>
> Sent: Thursday, February 28, 2008 2:49:11 AM
> Subject: Re: [Tutor] Python oddity
>
>  Brett Wilkins wrote:
> > As everybody else has told you, assigning bb = aa just gives bb the
> > reference to the same object that aa has. Unless I missed something,
> > then nobody's actually mentioned how to make this not happen... and
> > it's actually rather easy... instead of bb = aa, do this:
> > bb = aa[:]
> > Looks like a slice, and indeed it is a slice, without bounds. When a
> > slice is used like this, I believe it is known as the copy directive.
> Yes, Terry explained other ways as well.
> The problem with using slices is the following:
>
>  >>> a = [['a'],'b']
>  >>> b = a
>  >>> b is a
> True
>  >>> b = a[:]
>  >>> b is a
> False
> #okay, we're fine so far.
>  >>> b[0]
> ['a']
>  >>> b[0].append('c')
>  >>> b[0]
> ['a', 'c']
>  >>> a[0]
> ['a', 'c']
>
>
> and here's where we run into problems.
> Turns out the slice notation just makes a shallow copy, so if you have
> any lists in your list, they will still refer to the same object.  (I'm
> guessing that any mutable objects will still refer to the same object)
> copy.copy() has this same problem.
> The way to avoid this, as I mentioned before, is to use a deepcopy(),
> but as I said, you usually don't need to make a full copy of a list.
> Oh, and if you're curious:
>  >>> b = list(a)
>  >>> b is a
> False
>  >>> b[0].append('d')
>  >>> a[0]
> ['a', 'c', 'd']
>
> So a list() does a shallow copy as well.
>
> Here's a continuation, with deepcopy():
>  >>> import copy
>  >>> b = copy.deepcopy(a)
>  >>> b is a
> False
>  >>> b[0] is a[0]
> False
>  >>> b[0].append('e')
>  >>> a[0]
> ['a', 'c', 'd']
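
For completeness, a short sketch of the copy.copy() case mentioned above,
next to deepcopy(). The names shallow and deep are only illustrative. A
shallow copy shares every element with the original, mutable or not;
the sharing only becomes visible when a shared element is mutated in place.

```python
import copy

a = [['a'], 'b']
shallow = copy.copy(a)      # new outer list, same inner objects
deep = copy.deepcopy(a)     # new outer list and a new inner list

shares_inner = shallow[0] is a[0]      # True: the inner list is shared
deep_shares_inner = deep[0] is a[0]    # False: the inner list was duplicated
shares_string = shallow[1] is a[1]     # True, but harmless: str is immutable

shallow[0].append('c')      # shows up in a[0], not in deep[0]
shallow[1] = 'z'            # rebinding a slot affects only the shallow copy
print(a, shallow, deep)
```
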
>
>
> HTH,
> -Luke
> _______________________________________________
> Tutor maillist  -  Tutor at python.org
> http://mail.python.org/mailman/listinfo/tutor
>
>


