What about another method, clone() (or copy())? I think this may be useful too.
------------------ Original ------------------
From: "average" dreamingforward@gmail.com
Date: Mon, Feb 8, 2010 09:14 AM
To: "Gerald Britton" gerald.britton@gmail.com
Cc: "Python-Ideas" python-ideas@python.org
Subject: Re: [Python-ideas] clear() method for lists
On Fri, Feb 5, 2010 at 1:39 PM, Gerald Britton gerald.britton@gmail.com wrote:
In the list archives, this thread discusses adding a clear() method to list objects, to complement those available for sets and dictionaries. Later in the thread: Christian Heimes provided a patch to do it and R. H. commented that all it would take is Guido's blessing.
So, I'm wondering, can we do this? What are the steps needed to ask this work to be blessed?
In the abstract it seems like such a method should be part of the Container ABC. Since the idea of a container would imply a method to clear its contents.
mark _______________________________________________ Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas
On Wed, Feb 10, 2010 at 9:12 AM, Mathias Panzenböck grosser.meister.morti@gmx.net wrote:
On 02/10/2010 11:39 AM, wxyarv wrote:
What about another method, clone() (or copy())? I think this may be useful too.
l1 = [1, 2, 3]
l2 = l1[:]
l3 = list(l1)
Yes, I plan to ask for copy() as well, when the bug tracker opens up for 3.3, 3.4, etc. The issue is not, "Is there already a way to do this?" but rather, "Can we have consistent interfaces in the sequence types and collections where possible and appropriate?"
dict() and set() already support both clear() and copy() methods. Previous posters have pointed to the disconnect and showed the problem of having to test if a given iterable supports the clear() method before calling it, in functions that take any iterable.
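The situation described above, a function that wants to empty whatever container it is handed, might look like the following sketch. The helper name `clear_container` is hypothetical, and the fallback assumes that a container without clear() supports slice deletion, as lists did at the time of this thread.

```python
def clear_container(container):
    """Empty a mutable container in place (hypothetical helper)."""
    clear = getattr(container, "clear", None)
    if callable(clear):
        clear()              # dict, set, deque, and any type that offers clear()
    else:
        del container[:]     # fallback for sequences that lack clear()
    return container
```

With a clear() method on every built-in container, the hasattr-style test in the helper would be unnecessary, which is the consistency argument being made.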
Also, for what it's worth:
s1 = set()
s2 = s1.copy()

is faster than

s1 = set()
s2 = set(s1)
(and also for dict()) probably because the first is specifically-written for the copy operation whereas the second actually iterates over s1, one item at a time. (At least I think that's what's going on). I suppose that a list().copy() method might also be faster than the other two approaches to copy a list.
Lastly, for completeness, I suppose copy() might be appropriate for both tuple and deque as well.
On Feb 10, 2010, at 6:54 AM, Gerald Britton wrote:
Yes, I plan to ask for copy() as well, when the bug tracker opens up for 3.3, 3.4, etc. The issue is not, "Is there already a way to do this?" but rather, "Can we have consistent interfaces in the sequence types and collections where possible and appropriate?"
Use the copy module.
dict() and set() already support both clear() and copy() methods. Previous posters have pointed to the disconnect and showed the problem of having to test if a given iterable supports the clear() method before calling it, in functions that take any iterable.
Also, for what it's worth:
s1 = set()
s2 = s1.copy()

is faster than

s1 = set()
s2 = set(s1)
I question your timing skills. Both call the same routine to do the work of copying entries:
set_copy() calls make_new_set() which calls set_update_internal()
set_init() calls set_update_internal()
If there is any difference at all, it is the constant overhead of passing an argument to set(), not the implementation itself. The actual set building work is the same.
(and also for dict()) probably because the first is specifically-written for the copy operation whereas the second actually iterates over s1, one item at a time. (At least I think that's what's going on). I suppose that a list().copy() method might also be faster than the other two approaches to copy a list.
You need to read some code, learn about ref counts, etc. There's more to list copying than a memcpy(). If list.copy() were added, it would use that same underlying code as list(s) and s[:]. There would be no speed-up.
Lastly, for completeness, I suppose copy() might be appropriate for both tuple and deque as well.
Tuples? Really? An immutable collection is its own copy.
Raymond
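The timing question raised above can be checked directly rather than argued. The following is a minimal sketch using the standard timeit module; the function name `time_set_copies` and the chosen sizes are assumptions for illustration, and the numbers it returns depend entirely on the machine and interpreter.

```python
import timeit

def time_set_copies(size=1000, number=2000):
    """Return (seconds for s.copy(), seconds for set(s)) over `number` runs."""
    s = set(range(size))
    namespace = {"s": s}
    # Both statements are timed the same way so that call overhead is comparable.
    t_method = timeit.timeit("s.copy()", globals=namespace, number=number)
    t_ctor = timeit.timeit("set(s)", globals=namespace, number=number)
    return t_method, t_ctor

if __name__ == "__main__":
    t_method, t_ctor = time_set_copies()
    print("s.copy(): %.4fs   set(s): %.4fs" % (t_method, t_ctor))
```

Timing an empty set, as in the original example, mostly measures call overhead, so a non-trivial size is used here.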
On Wed, Feb 10, 2010 at 12:10 PM, Raymond Hettinger raymond.hettinger@gmail.com wrote:
On Feb 10, 2010, at 6:54 AM, Gerald Britton wrote:
Yes, I plan to ask for copy() as well, when the bug tracker opens up for 3.3, 3.4, etc. The issue is not, "Is there already a way to do this?" but rather, "Can we have consistent interfaces in the sequence types and collections where possible and appropriate?"
Use the copy module.
dict() and set() already support both clear() and copy() methods. Previous posters have pointed to the disconnect and showed the problem of having to test if a given iterable supports the clear() method before calling it, in functions that take any iterable.
Also, for what it's worth:
s1 = set()
s2 = s1.copy()

is faster than

s1 = set()
s2 = set(s1)
I question your timing skills. Both call the same routine to do the work of copying entries:
set_copy() calls make_new_set() which calls set_update_internal()
set_init() calls set_update_internal()
If there is any difference at all, it is the constant overhead of passing an argument to set(), not the implementation itself. The actual set building work is the same.
(and also for dict()) probably because the first is specifically-written for the copy operation whereas the second actually iterates over s1, one item at a time. (At least I think that's what's going on). I suppose that a list().copy() method might also be faster than the other two approaches to copy a list.
You need to read some code, learn about ref counts, etc. There's more to list copying than a memcpy(). If list.copy() were added, it would use that same underlying code as list(s) and s[:]. There would be no speed-up.
Lastly, for completeness, I suppose copy() might be appropriate for both tuple and deque as well.
Tuples? Really? An immutable collection is its own copy.
Raymond
Thanks for the feedback. I also question my timing skills (and most other skills that I think I have). That's what's good about bouncing ideas around here. Silly ones get shot down, and rightly so!
2010/2/10 Gerald Britton gerald.britton@gmail.com:
Lastly, for completeness, I suppose copy() might be appropriate for both tuple and deque as well.
Why would you want to copy a tuple?
On Thu, Feb 11, 2010 at 3:08 AM, Simon Brunning simon@brunningonline.net wrote:
2010/2/10 Gerald Britton gerald.britton@gmail.com:
Lastly, for completeness, I suppose copy() might be appropriate for both tuple and deque as well.
Why would you want to copy a tuple?
Say you had a problem where you started with a basic tuple, then needed to add items to it to produce some result. Now suppose you want to do that repeatedly. You don't want to disturb the basic tuple, so you make a copy of it before extending it.
e.g.
>>> country = ("US",)
>>> country_state = tuple(country)+("NY",)
>>> country_state_city = tuple(country_state) + ("NY",)
>>> country
('US',)
>>> country_state
('US', 'NY')
>>> country_state_city
('US', 'NY', 'NY')
if tuple() had a copy() method, I could write:
country_state = country.copy() + ("NY",)
etc.
Not that this is necessarily "better" in some way. I'm just thinking about consistency across the built-in types. If dict() and set() have copy(), why not list() and tuple()?
On the other hand, if the consensus is _not_ to add the copy() method to lists and tuples, why not deprecate the method in sets and dicts and encourage folks to use the copy module or just use "newdict = dict(olddict)" and "newset = set(oldset)" to build a new dictionary or set from an existing one?
-- Cheers, Simon B.
On Thu, Feb 11, 2010 at 09:51:42AM -0500, Gerald Britton wrote:
Say you had a problem where you started with a basic tuple, then needed to add items to it to produce some result. Now suppose you want to do that repeatedly. You don't want to disturb the basic tuple
You can never "disturb" a tuple - it's a read-only object.
Oleg.
this seems to work in python 2.x and python3.1, although I suspect it's a bug.
>>> t = (1, 2)
>>> t += (3,)
>>> t
(1, 2, 3)
On 11 February 2010 14:57, Oleg Broytman phd@phd.pp.ru wrote:
On Thu, Feb 11, 2010 at 09:51:42AM -0500, Gerald Britton wrote:
Say you had a problem where you started with a basic tuple, then needed to add items to it to produce some result. Now suppose you want to do that repeatedly. You don't want to disturb the basic tuple
You can never "disturb" a tuple - it's a read-only object.
Oleg.
Oleg Broytman http://phd.pp.ru/ phd@phd.pp.ru Programmers don't die, they just GOSUB without RETURN.
On 02/11/2010 04:21 PM, Matthew Russell wrote:
this seems to work in python 2.x and python3.1, although I suspect it's a bug.
>>> t = (1, 2)
>>> t += (3,)
>>> t
(1, 2, 3)
I don't see any bug. "x += y" is equal to "x = x + y".
>>> t=(1,2)
>>> t2=t
>>> t+=(3,)
>>> t,t2
((1, 2, 3), (1, 2))
On Thu, Feb 11, 2010 at 03:21:27PM +0000, Matthew Russell wrote:
this seems to work in python 2.x and python3.1, although I suspect it's a bug.
>>> t = (1, 2)
>>> t += (3,)
>>> t
(1, 2, 3)
It's not a bug. += is not obliged to increase (extend) objects in place. In case of read-only objects += creates a new extended object and returns it:
>>> a = 2
>>> a += 1
>>> print a
3
You don't suppose that 2 magically became 3, do you? Instead += replaces an integer object pointed to by a with a different integer object. The same is true for tuples. The original tuple of len 2 was replaced by a completely new tuple of len 3. If you hold a reference to the original tuple you can find it's still intact:
>>> a = (1, 2)
>>> b = a  # b *is not* a copy, b holds a reference *to the same tuple*
>>> a += (3,)  # Now points to a different tuple
>>> a
(1, 2, 3)
>>> b  # But the original tuple is still the same
(1, 2)
Oleg.
On Thu, Feb 11, 2010 at 10:21, Matthew Russell matt.horizon5@gmail.com wrote:
this seems to work in python 2.x and python3.1, although I suspect it's a bug.
>>> t = (1, 2)
>>> t += (3,)
>>> t
(1, 2, 3)

The object "t" references at the end isn't the same one that it references at the beginning. Note the difference between lists and tuples here:

>>> a = [1,2]
>>> id(a)
11274840
>>> a += [3,]
>>> id(a)
11274840

a is a list; augmented assignment mutates it, but it's still the same object.

>>> b = (1,2)
>>> id(b)
13902872
>>> b += (3,)
>>> id(b)
13915800

b is a tuple; augmented assignment creates a new object and re-binds "b" to it.
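The same distinction can be observed without comparing id() values by keeping a second reference and testing identity with `is`. This is a small sketch; the function name `extend_with_augmented_assignment` is made up for illustration.

```python
def extend_with_augmented_assignment(obj, extra):
    """Apply `obj += extra` and report whether the name still refers to the same object."""
    original = obj           # second reference to whatever was passed in
    obj += extra             # mutates a list in place; rebinds the name for a tuple
    return obj, obj is original

new_list, same_list = extend_with_augmented_assignment([1, 2], [3])
new_tuple, same_tuple = extend_with_augmented_assignment((1, 2), (3,))
print(new_list, same_list)    # [1, 2, 3] True
print(new_tuple, same_tuple)  # (1, 2, 3) False
```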
On Thu, Feb 11, 2010 at 10:38 AM, Tim Lesher tlesher@gmail.com wrote:
On Thu, Feb 11, 2010 at 10:21, Matthew Russell matt.horizon5@gmail.com wrote:
this seems to work in python 2.x and python3.1, although I suspect it's a bug.
>>> t = (1, 2)
>>> t += (3,)
>>> t
(1, 2, 3)

The object "t" references at the end isn't the same one that it references at the beginning. Note the difference between lists and tuples here:

>>> a = [1,2]
>>> id(a)
11274840
>>> a += [3,]
>>> id(a)
11274840

a is a list; augmented assignment mutates it, but it's still the same object.

>>> b = (1,2)
>>> id(b)
13902872
>>> b += (3,)
>>> id(b)
13915800

b is a tuple; augmented assignment creates a new object and re-binds "b" to it.

-- Tim Lesher tlesher@gmail.com
Thanks all for helping me understand this better. The subtlety above is something I missed. I searched the docs for a description of it but couldn't readily find one. Tim's simple one-line statement and the example above explain it very nicely.
Switching gears for a moment, what is the feeling regarding the copy() methods for dictionaries and sets? Are they truly redundant? Should they be deprecated? Should users be encouraged to use the copy module or just use "newdict = dict(olddict)" and "newset = set(oldset)" to build a new dictionary or set from an existing one?
On 02/11/2010 06:35 PM, Gerald Britton wrote:
Switching gears for a moment, what is the feeling regarding the copy() methods for dictionaries and sets? Are they truly redundant? Should they be deprecated? Should users be encouraged to use the copy module or just use "newdict = dict(olddict)" and "newset = set(oldset)" to build a new dictionary or set from an existing one?
I don't know what one *should* do but I never used .copy() but always dict(olddict) or set(oldset).
-panzi
On 2/11/2010 12:35 PM, Gerald Britton wrote:
On Thu, Feb 11, 2010 at 10:38 AM, Tim Lesher tlesher@gmail.com wrote:
Switching gears for a moment, what is the feeling regarding the copy() methods for dictionaries and sets? Are they truly redundant? Should they be deprecated? Should users be encouraged to use the copy module or just use "newdict = dict(olddict)" and "newset = set(oldset)" to build a new dictionary or set from an existing one?
I did not even know that they exist and do not know why they exist. In my opinion, set(x) should special-case x being a set/frozenset, and maybe even a dict, and do whatever set.copy does now. Ditto for dict.
Terry Jan Reedy
Gerald Britton writes:
Thanks all for helping me understand this better. The subtlety above is something I missed. I searched the docs for a description of it but couldn't readily find one. Tim's simple one-line statement and the example above explain it very nicely.
It's in the language reference. It is only two lines (the definition of "immutable" and the description of assignment semantics), so easy to miss. :-) There probably is some discussion in the tutorial.
Switching gears for a moment, what is the feeling regarding the copy() methods for dictionaries and sets? Are they truly redundant? Should they be deprecated? Should users be encouraged to use the copy module or just use "newdict = dict(olddict)" and "newset = set(oldset)" to build a new dictionary or set from an existing one?
I think they are redundant. new = type(old) should be the standard idiom for an efficient shallow copy. If that doesn't serve your application's needs, use the copy module. The responsibility for discrimination is the application programmer's. Superficially this might seem to violate TOOWTDI, but actually, not. Shallow copies and deep copies are two very different "Its", and have to be decided by the app author in any case.
I don't see what .copy can add.
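The shallow/deep distinction drawn above can be made concrete with a nested structure. This sketch reads "new = type(old)" as "call the concrete constructor on the old object" (for example dict(old)); that reading is an assumption, and `type(old)(old)` is shown only as one generic spelling of it.

```python
import copy

old = {"letters": ["a", "b"]}

shallow_ctor = dict(old)            # constructor idiom: new dict, same nested list
shallow_generic = type(old)(old)    # same thing without naming the type
shallow_module = copy.copy(old)     # copy module, shallow
deep = copy.deepcopy(old)           # copy module, deep: nested list is copied too

old["letters"].append("c")          # mutate the nested list through the original

print(shallow_ctor["letters"])      # ['a', 'b', 'c']  (shared with old)
print(deep["letters"])              # ['a', 'b']       (independent)
```

All three shallow spellings behave the same here; only the deep copy is insulated from the later mutation, which is why the choice has to be made by the application author.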
.clear is another matter, in terms of semantics. However, the same effect can be achieved at the cost of indirection and extra garbage:

class DictWithClear(object):
    def __init__(self):
        self.clear()

    def clear(self):
        # Rebind the attribute to a fresh dict; the old one becomes garbage.
        self.d = {}

    # Implement other dict methods here.
This is obviously wasteful if all you want to do is add .clear to a "bare" dictionary. However, in many cases the dictionary is an attribute of a larger structure already and the only direct reference to the dictionary is from that structure. Then clearing by replacing the obsolete dictionary with a fresh empty one is hardly less efficient than clearing the obsolete contents.
There are other arguments *for* the .clear method (eg, it would be a possibly useful optimization if instead of a class with a dictionary attribute, the class inherited from the dictionary).
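The difference between clearing by replacement and clearing in place only becomes visible when something else holds a reference to the dictionary. The following sketch uses a hypothetical `Cache` class with a `data` attribute to show both behaviours.

```python
class Cache:
    """Hypothetical owner of a dictionary attribute."""

    def __init__(self):
        self.data = {}

    def reset_by_rebinding(self):
        # The old dict is untouched; it becomes garbage only if nothing else refers to it.
        self.data = {}

    def reset_in_place(self):
        # The same dict object is emptied; every reference sees the change.
        self.data.clear()


cache = Cache()
cache.data["k"] = 1
outside = cache.data            # a reference held outside the owning object

cache.reset_by_rebinding()
print(outside, cache.data)      # {'k': 1} {}

cache.data["k"] = 2
outside = cache.data
cache.reset_in_place()
print(outside, cache.data)      # {} {}
```

When the owning object holds the only reference, the two resets are indistinguishable from the outside, which is the case described in the message above.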
Speaking of new potential list methods, how about list.get(index, default=None) à la dict.get? I'm sure this must have come up at some point but can't find it ATM.
George
George Sakkis wrote:
Speaking of new potential list methods, how about list.get(index, default=None) à la dict.get? I'm sure this must have come up at some point but can't find it ATM.
I believe it runs afoul of the moratorium, but a getitem() builtin might be a better idea (since it would then work for any class that implements __getitem__).
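A getitem() helper along these lines can be written in a few lines of pure Python. No such builtin exists; the name, signature, and sentinel below are assumptions made for illustration only.

```python
_MISSING = object()   # sentinel so that None remains usable as a default


def getitem(container, key, default=_MISSING):
    """Hypothetical getitem(): return container[key], or default if the lookup fails."""
    try:
        return container[key]
    except LookupError:          # base class of both IndexError and KeyError
        if default is _MISSING:
            raise
        return default


fields = "alice 42".split()
print(getitem(fields, 1, ""))    # 42
print(getitem(fields, 2, ""))    # prints an empty line: index 2 is out of range
```

Because it goes through `__getitem__`, the same helper works for lists, tuples, dicts, and user-defined classes.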
Cheers, Nick.
On 12 Feb 2010, at 10:58 , Nick Coghlan wrote:
George Sakkis wrote:
Speaking of new potential list methods, how about list.get(index, default=None) à la dict.get? I'm sure this must have come up at some point but can't find it ATM.
I believe it runs afoul of the moratorium, but a getitem() builtin might be a better idea (since it would then work for any class that implements __getitem__).
Maybe just extending operator.itemgetter with a "default" kwarg? Wouldn't run afoul of the moratorium, and would be quite a nice extension to itemgetter.
Though I'm not sure it's a very good idea for lists. Semantically, lists are to be iterated, not really to be indexed.
Masklinn wrote:
On 12 Feb 2010, at 10:58 , Nick Coghlan wrote:
George Sakkis wrote:
Speaking of new potential list methods, how about list.get(index, default=None) à la dict.get? I'm sure this must have come up at some point but can't find it ATM.
I believe it runs afoul of the moratorium, but a getitem() builtin might be a better idea (since it would then work for any class that implements __getitem__).
Maybe just extending operator.itemgetter with a "default" kwarg? Wouldn't run afoul of the moratorium, and would be quite a nice extension to itemgetter.
Yeah, a kw-only arg for itemgetter and attrgetter could definitely work. It would be somewhat clunky to use though.
Though I'm not sure it's a very good idea for lists. Semantically, lists are to be iterated, not really to be indexed.
Using short lists as record sets happens all the time (especially with things like str.split and other parsing operations that build up their results incrementally).
Cheers, Nick.
Masklinn wrote:
Though I'm not sure it's a very good idea for lists. Semantically, lists are to be iterated, not really to be indexed.
I don't think I agree that lists are meant primarily for iteration. Indexing them is a perfectly legitimate and useful thing to do.
However, I would agree that a list.get() operation with a default seems to be a rather rare requirement. Usually when you index a list, the index is generated by some algorithm that guarantees it's within range.
I can't remember ever wanting a list.get() myself, and if I ever did, I would be quite happy to write my own.
On Feb 12, 2010, at 2:33 PM, Greg Ewing wrote:
Masklinn wrote:
Though I'm not sure it's a very good idea for lists. Semantically, lists are to be iterated, not really to be indexed.
I don't think I agree that lists are meant primarily for iteration. Indexing them is a perfectly legitimate and useful thing to do.
However, I would agree that a list.get() operation with a default seems to be a rather rare requirement. Usually when you index a list, the index is generated by some algorithm that guarantees it's within range.
I can't remember ever wanting a list.get() myself, and if I ever did, I would be quite happy to write my own.
I concur. Put me down for a -1.
Also, I think this idea was discussed once or twice here before and it got shot down then too.
Raymond
Nick Coghlan wrote:
I believe it runs afoul of the moratorium, but a getitem() builtin might be a better idea (since it would then work for any class that implements __getitem__).
Not quite the same way, though. The dict get() method knows about the internals of the object, so it can work very efficiently and without danger of masking bugs by catching the wrong exception. A generic getitem() wouldn't be able to do that.
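The risk of masking bugs can be illustrated with a container whose `__getitem__` fails for an unrelated reason. Everything in this sketch is hypothetical: `getitem_or_default` stands in for the proposed generic helper, and `Scaled` is a deliberately buggy class.

```python
def getitem_or_default(container, key, default=None):
    """Hypothetical generic helper: it can only observe exceptions from the outside."""
    try:
        return container[key]
    except LookupError:
        return default


class Scaled:
    """Hypothetical sequence whose __getitem__ contains a bug."""

    def __init__(self, values):
        self._values = list(values)
        self._config = {}                 # bug: "factor" was never stored

    def __getitem__(self, index):
        # Raises KeyError from the config lookup, not from a missing index.
        return self._values[index] * self._config["factor"]


s = Scaled([1, 2, 3])
result = getitem_or_default(s, 0, default=0)
print(result)   # 0: the internal KeyError was mistaken for "index 0 is missing"
```

dict.get() avoids this because it inspects the hash table directly instead of catching an exception raised by arbitrary code.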
Thanks Tim. The dict and set types _do_ have clear() methods, but not the list() type. I first ran into this some time ago when a question was posted about it. It intrigued me because I saw what I thought was a gap. Basically I like things to be consistent. I was also wondering about garbage collection. If I have a humongous list, for example, and "clear" it with:
mylist = []
does the old content not need to be garbage collected? Might it not continue to occupy its memory for a while? OTOH do dict.clear() and set.clear() immediately free their memory or does it just get queued for garbage collection?
On Thu, Feb 11, 2010 at 9:54 PM, Stephen J. Turnbull turnbull@sk.tsukuba.ac.jp wrote:
Gerald Britton writes:
Thanks all for helping me understand this better. The subtlety above is something I missed. I searched the docs for a description of it but couldn't readily find one. Tim's simple one-line statement and the example above explain it very nicely.
It's in the language reference. It is only two lines (the definition of "immutable" and the description of assignment semantics), so easy to miss. :-) There probably is some discussion in the tutorial.
Switching gears for a moment, what is the feeling regarding the copy() methods for dictionaries and sets? Are they truly redundant? Should they be deprecated? Should users be encouraged to use the copy module or just use "newdict = dict(olddict)" and "newset = set(oldset)" to build a new dictionary or set from an existing one?
I think they are redundant. new = type(old) should be the standard idiom for an efficient shallow copy. If that doesn't serve your application's needs, use the copy module. The responsibility for discrimination is the application programmer's. Superficially this might seem to violate TOOWTDI, but actually, not. Shallow copies and deep copies are two very different "Its", and have to be decided by the app author in any case.
I don't see what .copy can add.
.clear is another matter, in terms of semantics. However, the same effect can be achieved at the cost of indirection and extra garbage:

class DictWithClear(object):
    def __init__(self):
        self.clear()

    def clear(self):
        # Rebind the attribute to a fresh dict; the old one becomes garbage.
        self.d = {}

    # Implement other dict methods here.
This is obviously wasteful if all you want to do is add .clear to a "bare" dictionary. However, in many cases the dictionary is an attribute of a larger structure already and the only direct reference to the dictionary is from that structure. Then clearing by replacing the obsolete dictionary with a fresh empty one is hardly less efficient than clearing the obsolete contents.
There are other arguments *for* the .clear method (eg, it would be a possibly useful optimization if instead of a class with a dictionary attribute, the class inherited from the dictionary).
Gerald Britton writes:
I was also wondering about garbage collection. If I have a humongous list, e.g. and "clear" it with:
mylist = []
does the old content not need to be garbage collected?
thislist = generate_me_a_humongous_list()
thatlist = thislist
thatlist = []
Definitely no garbage collection.
The starting point of using garbage collection is that in general you don't know *locally* whether something is reachable or not. So you need to do a global analysis.
OTOH do dict.clear() and set.clear() immediately free their memory or does it just get queued for garbage collection?
This is covered in the manuals, but the gist is that every Python object knows how many other objects are pointing to it (called a refcount). When an object's refcount drops to zero, it gets collected (immediately, IIRC). However ...
thislist = []
thatlist = [thislist]
thislist.append(thatlist)
and you have a reference cycle. These cycles are also collected, but this requires more effort, and so it is done only occasionally.
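Both behaviours can be observed with weak references. This sketch assumes CPython, where reference counting frees an object as soon as its last reference goes away; the `Node` class is a hypothetical stand-in, used because plain lists cannot be weakly referenced.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can refer to another Node."""

    def __init__(self):
        self.other = None


def refcount_frees_immediately():
    """True if an unreferenced, acyclic object disappears at once (CPython behaviour)."""
    n = Node()
    ref = weakref.ref(n)
    del n                       # the refcount drops to zero here
    return ref() is None


def cycle_needs_collector():
    """Return (alive before gc.collect(), alive after) for a two-object cycle."""
    was_enabled = gc.isenabled()
    gc.disable()                # keep the automatic collector out of the experiment
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a # reference cycle: neither refcount can reach zero
        ref = weakref.ref(a)
        del a, b
        alive_before = ref() is not None
        gc.collect()            # the cyclic collector finds and frees the pair
        alive_after = ref() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, alive_after


print(refcount_frees_immediately())   # True on CPython
print(cycle_needs_collector())        # (True, False) on CPython
```

On implementations without reference counting, the first function may well return False, since the object is only reclaimed when the collector gets around to it.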
Stephen J. Turnbull wrote:
This is covered in the manuals, but the gist is that every Python object knows how many other objects are pointing to it (called a refcount). When an object's refcount drops to zero, it gets collected (immediately, IIRC).
This description applies for CPython (the one from python.org), since that uses refcounting with cyclic garbage collection. Other Python implementations work differently (e.g. Jython and IronPython rely on the garbage collector in their underlying VMs)
Cheers, Nick.
Thanks everyone. Good to know!
On Fri, Feb 12, 2010 at 10:45 AM, Nick Coghlan ncoghlan@gmail.com wrote:
Stephen J. Turnbull wrote:
This is covered in the manuals, but the gist is that every Python object knows how many other objects are pointing to it (called a refcount). When an object's refcount drops to zero, it gets collected (immediately, IIRC).
This description applies for CPython (the one from python.org), since that uses refcounting with cyclic garbage collection. Other Python implementations work differently (e.g. Jython and IronPython rely on the garbage collector in their underlying VMs)
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 11 February 2010 14:51, Gerald Britton gerald.britton@gmail.com wrote:
Say you had a problem where you started with a basic tuple, then needed to add items to it to produce some result.
Bzzzz! Tuples are immutable - you can't add items to them.
On 11 February 2010 14:51, Gerald Britton gerald.britton@gmail.com wrote:
Say you had a problem where you started with a basic tuple, then needed to add items to it to produce some result. Now suppose you want to do that repeatedly. You don't want to disturb the basic tuple, so you make a copy of it before extending it.
e.g.
>>> country = ("US",)
>>> country_state = tuple(country)+("NY",)
>>> country_state_city = tuple(country_state) + ("NY",)
>>> country
('US',)
>>> country_state
('US', 'NY')
>>> country_state_city
('US', 'NY', 'NY')
if tuple() had a copy() method, I could write:
country_state = country.copy() + ("NY",)
etc.
You do know that tuples are immutable, don't you?
Python 2.6.4 (r264:75708, Oct 26 2009, 08:23:19) [MSC v.1500 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information.
>>> country = ("US",)
>>> country_state = country+("NY",)
>>> country_state_city = country_state + ("NY",)
>>> country
('US',)
>>> country_state
('US', 'NY')
>>> country_state_city
('US', 'NY', 'NY')
It sounds like you could do with reading the Python documentation a bit more closely before proposing changes...
Paul
On Thu, Feb 11, 2010 at 3:51 PM, Gerald Britton gerald.britton@gmail.com wrote:
>>> country = ("US",)
>>> country_state = tuple(country)+("NY",)
>>> country_state_city = tuple(country_state) + ("NY",)
>>> country
('US',)
>>> country_state
('US', 'NY')
>>> country_state_city
('US', 'NY', 'NY')
if tuple() had a copy() method, I could write:
country_state = country.copy() + ("NY",)
Note that for a tuple T
tuple(T) == T
So you can already write:
country_state = country + ("NY",)
and it will already have exactly the same effect that tuple(country) or your proposed country.copy() would have.
On Thu, Feb 11, 2010 at 4:11 PM, Andre Engels andreengels@gmail.com wrote:
Note that for a tuple T
tuple(T) == T
Of course what I actually meant was:
tuple(T) is T
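The identity claim is easy to check. Note that returning the very same object is a CPython implementation detail for exact tuple instances rather than a language guarantee; the variable names below are illustrative.

```python
import copy

t = (1, 2, 3)
same_via_constructor = tuple(t) is t    # CPython hands back the argument itself
same_via_slice = t[:] is t              # a full slice of an exact tuple does the same
same_via_copy = copy.copy(t) is t       # the copy module treats tuples as immutable

lst = [1, 2, 3]
list_is_copied = list(lst) is not lst   # a mutable list really is duplicated

print(same_via_constructor, same_via_slice, same_via_copy, list_is_copied)
```

Since no new object is created, a tuple.copy() method would have nothing to do beyond returning self.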
Mathias Panzenböck wrote:
On 02/10/2010 11:39 AM, wxyarv wrote:
what about another method clone() (or copy())?
Last time copying lists was discussed, I seem to remember there were considered to be too many ways of doing it already, so I can't see another one being added.
Greg Ewing wrote:
Mathias Panzenböck wrote:
On 02/10/2010 11:39 AM, wxyarv wrote:
what about another method clone() (or copy())?
Last time copying lists was discussed, I seem to remember there were considered to be too many ways of doing it already, so I can't see another one being added.
With the obvious way of getting a consistent interface already being to use copy.copy.
Cheers, Nick.
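As a sketch of that consistent interface, copy.copy() accepts every built-in container discussed in this thread, whether or not the type has its own copy() method. The helper name `shallow_copies` is made up for the example.

```python
import copy
from collections import deque


def shallow_copies(containers):
    """Return a shallow copy of each container using one uniform spelling."""
    return [copy.copy(c) for c in containers]


originals = [[1, 2], {"a": 1}, {1, 2}, deque([1, 2])]
copies = shallow_copies(originals)

for original, duplicate in zip(originals, copies):
    # Same type and equal contents, but a distinct object each time.
    print(type(duplicate).__name__, duplicate == original, duplicate is not original)
```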