Hello there. Please see my feature request: http://bugs.python.org/issue6326 The idea is to speed up the swapping of list elemenst so that a.swap(b) is equivalent to a[:], b[:] = b[:], a[:] but without all the overhead of creating slices, assigning them and so forth. In particular, it can help reduce the cost of creating instances of list subclasses, from raw lists: class mylist(list): def first(self): return self[0] m = mylist(source_list) This certainly creates m, but does so by copying source_list. If all we want to do is turn source_list into a mylist instance, it is much quicker to: m = mylist() m.swap(source_list) See the above issue for initial comments, especially concerns on how this can bypass all kind of insertion semantics of list subclasses. Cheers, Kristján
[Kristján Valur Jónsson]
The idea is to speed up the swapping of list elemenst so that a.swap(b) is equivalent to a[:], b[:] = b[:], a[:] but without all the overhead of creating slices, assigning them and so forth.
The technique is useful and very fast. I used something similar in setobject.c in the set_swap_bodies() function, but it was sufficiently dangerous to subclass invariants that it was not exposed to the end-user. Subclasses written as C extensions are the most in danger because a swap could break their internal invariants and possibly crash the code. Pure python list subclasses are only in danger of breaking without crashing. While Python is a consenting adults language, I think the basic objects like lists should be kept free of dangerous or hard-to-use constructs. The problem with the OP's example is that it only makes sense in interactions between a list and a list subclass that won't be broken by it. For straight list-to-list interactions, it is better to write "a,b = b,a" (though this is not exactly the same thing since the identity of a and b will change, not just their contents). For list subclass uses (a more advanced topic), some example will work and some won't (I gave two failing examples in the tracker discussion). When the list subclass is a C extension written by a third-party, it may not be possible to know whether or not a swap will break invariants. That's a killer. There are a number of places in the language where swapping could be used (frozenset to set conversions for example) but the technique is fragile and probably should not become part of the language as distributed. In the OP's use case, it is not hard to build a C extension for his subclasses and include a swap() method there. That is a much less fragile design because the subclass already knows about its own internal invariants and it can verify that its input source is an exact list. An OOP design principle is that a method should be added to the class that has the most knowledge about all of the inputs (in this case, the C extension subclass knows more than general purpose lists). A side benefit of this design (putting swap() in the list subclass instead of standard list objects) is that this avoids putting dangerous optimizations in the hands of beginning users (i.e. it keeps the list API simple and clean). Raymond
On Tue, Jun 23, 2009, Raymond Hettinger wrote:
[Kristj?n Valur J?nsson]
The idea is to speed up the swapping of list elemenst so that a.swap(b) is equivalent to a[:], b[:] = b[:], a[:] but without all the overhead of creating slices, assigning them and so forth.
The problem with the OP's example is that it only makes sense in interactions between a list and a list subclass that won't be broken by it. For straight list-to-list interactions, it is better to write "a,b = b,a" (though this is not exactly the same thing since the identity of a and b will change, not just their contents). For list subclass uses (a more advanced topic), some example will work and some won't (I gave two failing examples in the tracker discussion). When the list subclass is a C extension written by a third-party, it may not be possible to know whether or not a swap will break invariants. That's a killer.
Thanks for the detailed response and explanation of why Kristjan wants the feature and why it isn't the same thing as the standard tuple swap. Your assessment makes sense, and I agree that this doesn't belong as a standard list feature. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "as long as we like the same operating system, things are cool." --piranha
I agree that as presented, it is a bit radical due to the concerns raised by Raymond and others. But I still think it is a useful recipe for those that know what they are doing, and then perhaps as the standalone function, SwapLists(a,b) like we implemented in-house for our purposes. I wonder, where do such recipes belong? K
-----Original Message----- From: python-ideas-bounces+kristjan=ccpgames.com@python.org [mailto:python-ideas-bounces+kristjan=ccpgames.com@python.org] On Behalf Of Aahz Sent: 23. júní 2009 23:15 To: python-ideas@python.org Subject: Re: [Python-ideas] add a list.swap() method
On Tue, Jun 23, 2009, Raymond Hettinger wrote:
[Kristj?n Valur J?nsson]
The idea is to speed up the swapping of list elemenst so that a.swap(b) is equivalent to a[:], b[:] = b[:], a[:] but without all the overhead of creating slices, assigning them and
so forth.
The problem with the OP's example is that it only makes sense in interactions between a list and a list subclass that won't be broken by it. For straight list-to-list interactions, it is better to write "a,b = b,a" (though this is not exactly the same thing since the identity of a and b will change, not just their contents). For list subclass uses (a more advanced topic), some example will work and
some
won't (I gave two failing examples in the tracker discussion). When the list subclass is a C extension written by a third-party, it may not be possible to know whether or not a swap will break invariants. That's a killer.
Thanks for the detailed response and explanation of why Kristjan wants the feature and why it isn't the same thing as the standard tuple swap. Your assessment makes sense, and I agree that this doesn't belong as a standard list feature.
On Wed, Jun 24, 2009, Kristj?n Valur J?nsson wrote:
I agree that as presented, it is a bit radical due to the concerns raised by Raymond and others. But I still think it is a useful recipe for those that know what they are doing, and then perhaps as the standalone function, SwapLists(a,b) like we implemented in-house for our purposes. I wonder, where do such recipes belong?
Python Cookbook, for starters: http://www.activestate.com/ASPN/Python/Cookbook/ You might also upload a small module to PyPI. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "as long as we like the same operating system, things are cool." --piranha
Raymond Hettinger wrote:
[Kristján Valur Jónsson]
The idea is to speed up the swapping of list elemenst so that a.swap(b) is equivalent to a[:], b[:] = b[:], a[:] but without all the overhead of creating slices, assigning them and so forth.
The technique is useful and very fast. I used something similar in setobject.c in the set_swap_bodies() function, but it was sufficiently dangerous to subclass invariants that it was not exposed to the end-user.
Subclasses written as C extensions are the most in danger because a swap could break their internal invariants and possibly crash the code. Pure python list subclasses are only in danger of breaking without crashing.
While Python is a consenting adults language, I think the basic objects like lists should be kept free of dangerous or hard-to-use constructs.
The problem with the OP's example is that it only makes sense in interactions between a list and a list subclass that won't be broken by it. For straight list-to-list interactions, it is better to write "a,b = b,a" (though this is not exactly the same thing since the identity of a and b will change, not just their contents). For list subclass uses (a more advanced topic), some example will work and some won't (I gave two failing examples in the tracker discussion). When the list subclass is a C extension written by a third-party, it may not be possible to know whether or not a swap will break invariants. That's a killer.
There are a number of places in the language where swapping could be used (frozenset to set conversions for example) but the technique is fragile and probably should not become part of the language as distributed.
In the OP's use case, it is not hard to build a C extension for his subclasses and include a swap() method there. That is a much less fragile design because the subclass already knows about its own internal invariants and it can verify that its input source is an exact list. An OOP design principle is that a method should be added to the class that has the most knowledge about all of the inputs (in this case, the C extension subclass knows more than general purpose lists). A side benefit of this design (putting swap() in the list subclass instead of standard list objects) is that this avoids putting dangerous optimizations in the hands of beginning users (i.e. it keeps the list API simple and clean).
I learned from this. This seems like an appropriate wiki topic or recipe for advanced extension writers.
On Wed, 24 Jun 2009 06:23:31 am Kristján Valur Jónsson wrote:
Hello there. Please see my feature request: http://bugs.python.org/issue6326
The idea is to speed up the swapping of list elemenst so that a.swap(b) is equivalent to a[:], b[:] = b[:], a[:] but without all the overhead of creating slices, assigning them and so forth. In particular, it can help reduce the cost of creating instances of list subclasses, from raw lists:
class mylist(list): def first(self): return self[0]
m = mylist(source_list)
Is that overhead really significant enough that it needs a special operation to avoid it? In the tracker, you claim a significant performance gain, but I'd like to hear more details.
This certainly creates m, but does so by copying source_list.
Only the pointers, not the objects pointed to. But I'm sure you already know this :)
If all we want to do is turn source_list into a mylist instance, it is much quicker to: m = mylist() m.swap(source_list)
This has a side-effect of turning source_list into an empty mylist. I don't like that one bit.
See the above issue for initial comments, especially concerns on how this can bypass all kind of insertion semantics of list subclasses.
Can you address those concerns? One problem with swap() being a list method is that it allows people to arbitrarily swap the contents of lists around, breaking subclass invariants. I have an alternative suggestion: put the swap into list.__new__, so it can only be done at list initialisation time. list.__new__(cls, source) currently returns an empty list of type cls, which is then initialised by list.__init__(self, source), presumably by copying source. Perhaps list.__new__ could take an additional argument that says "swap the contents of the new list to use source without copying". That means, instead of writing: a = mylist() a.swap([1, 2, 3]) you would put the fast initialisation logic into the class: class mylist(list): def __new__(cls, source): return list.__new__(cls, source, swap=True) def __init__(self, source): pass # avoid calling list.__init__ Existing code should work unchanged if the default for swap is False. -- Steven D'Aprano
Kristján Valur Jónsson <krist...@ccpgames.com> wrote:
The idea is to speed up the swapping of list elemenst so that a.swap(b) is equivalent to a[:], b[:] = b[:], a[:] but without all the overhead of creating slices, assigning them and so forth.
I think I'd use a mapping to hold the lists so I could do a straight re-assignment: d['a'], d['b'] = d['b'], d['a'] Of course, all references to the lists would have to be brokered through the mapping object, but I tend to find it easier to do so anyway.
-----Original Message----- From: python-ideas-bounces+kristjan=ccpgames.com@python.org [mailto:python-ideas-bounces+kristjan=ccpgames.com@python.org] On Behalf Of alex23 Sent: 25. júní 2009 07:01 To: python-ideas@python.org Subject: Re: [Python-ideas] add a list.swap() method
Kristján Valur Jónsson <krist...@ccpgames.com> wrote:
The idea is to speed up the swapping of list elemenst so that a.swap(b) is equivalent to a[:], b[:] = b[:], a[:] but without all the overhead of creating slices, assigning them and so forth.
I think I'd use a mapping to hold the lists so I could do a straight re-assignment:
d['a'], d['b'] = d['b'], d['a']
This is roughly equivalent to a, b = b, a right? That's a completely different thing. Also, the whole point of this excersise is performance. This is why in our case, for example, we were using a list subclass (CRowset) in stead of wrapping the list, to shave off cpu cycles when accessing rows and columns in a rowset. A CRowset is essentially a list, with an extra attribute (header). So, maybe I should rephrase the "idea." The idea is for example to speed up the initialization of lists (and subclasses of lists) from other lists when the source list's contents is to be discarded. The semantics of a.swap(b) are the same as a[:], b[:] = b[:], a[:] but it a) is orders of magnitude faster b) skips any special methods (such as __setslice__) and so may break class invariants of list subclasses, as pointed out on this list. K
Kristján Valur Jónsson wrote:
So, maybe I should rephrase the "idea." The idea is for example to speed up the initialization of lists (and subclasses of lists) from other lists when the source list's contents is to be discarded. The semantics of a.swap(b) are the same as a[:], b[:] = b[:], a[:] but it a) is orders of magnitude faster b) skips any special methods (such as __setslice__) and so may break class invariants of list subclasses, as pointed out on this list.
As Raymond posted, the potential breakage is a major downside (and almost certainly a deal breaker without some safety rails, and possibly still a deal breaker even with them). The name also concerns me, since I think it is misleading as to the intended usage of the method. My suggestions: 1. Make it a class method "from_list". Document its semantics as something like: @classmethod def from_list(cls, other): """Fast but destructive conversion of a list instance to a different type of list instance""" self = cls() self[:] = other[:] other[:] = [] return self 2. Put in some safety rails, similar to what we do to prevent hashing of classes that override __eq__ without overriding __hash__. If __setitem__ and __setslice__ on cls or other aren't the same as the base class definitions, don't allow the operation to proceed (i.e. throw a TypeError saying that the operation needs list instances that use the standard value setting semantics). 3. Possibly provide some C API level tools to make it easier for a list subclass to write their own fast constructor based on this idea I'm not sure it is actually practical to make munging about with list internals like this "safe" enough to be comfortably exposed at the Python level, but ideas like the above would at least be a step in that direction. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
Nick Coghlan wrote:
Kristján Valur Jónsson wrote:
So, maybe I should rephrase the "idea." The idea is for example to speed up the initialization of lists (and subclasses of lists) from other lists when the source list's contents is to be discarded. The semantics of a.swap(b) are the same as a[:], b[:] = b[:], a[:] but it a) is orders of magnitude faster b) skips any special methods (such as __setslice__) and so may break class invariants of list subclasses, as pointed out on this list.
I'm not sure it is actually practical to make munging about with list internals like this "safe" enough to be comfortably exposed at the Python level, but ideas like the above would at least be a step in that direction.
It sounds to me like what is really desired in this case is not a copy/swap operation but a type mutate operation on the data. So instead of copying the data, the data's type is mutated to another type providing that the data format is compatible. So the above becomes something like... a_type = type(a) b_type = type(b) a.mutate_type_to(b_type) b.mutate_type_to(a_type) But the more common use would be just to convert a single data object to another similar data object in a much more efficient way. Would that even be possible? A class would need to know what other classes it is data compatible with somehow. Possibly that could be done by registering that before use. Then it would need the machinery to do that, which I have no idea about. The advantage is it's a more general way to approach the problem that could save a lot of data copying in more situations. Ok, it's a wild idea of sorts, but this is the idea list. ;-) Cheers, Ron
Ron Adam wrote:
Nick Coghlan wrote:
Kristján Valur Jónsson wrote:
So, maybe I should rephrase the "idea." The idea is for example to speed up the initialization of lists (and subclasses of lists) from other lists when the source list's contents is to be discarded. The semantics of a.swap(b) are the same as a[:], b[:] = b[:], a[:] but it a) is orders of magnitude faster b) skips any special methods (such as __setslice__) and so may break class invariants of list subclasses, as pointed out on this list.
It sounds to me like what is really desired in this case is not a copy/swap operation but a type mutate operation on the data. So instead of copying the data, the data's type is mutated to another type providing that the data format is compatible.
'Compatible' is the keyword. As I understand him, Kristján's problem is initializating list subclasses with additional fields, which are *not* compatible. Class mutation is already possible for user-defined classes.
Nick Coghlan wrote:
My suggestions:
1. Make it a class method "from_list". Document its semantics as something like:
@classmethod def from_list(cls, other): """Fast but destructive conversion of a list instance to a different type of list instance""" self = cls() self[:] = other[:] other[:] = [] return self
2. Put in some safety rails, similar to what we do to prevent hashing of classes that override __eq__ without overriding __hash__. If __setitem__ and __setslice__ on cls or other aren't the same as the base class definitions, don't allow the operation to proceed (i.e. throw a TypeError saying that the operation needs list instances that use the standard value setting semantics).
Suppose 'fromlist' literally meant from 'from a list object' and not from list subclasses. Would not that be clearly safe? Kristján, would that restriction be okay for your purposes, or do you really need from subclass to subclass?
3. Possibly provide some C API level tools to make it easier for a list subclass to write their own fast constructor based on this idea
I'm not sure it is actually practical to make munging about with list internals like this "safe" enough to be comfortably exposed at the Python level, but ideas like the above would at least be a step in that direction.
Cheers, Nick.
On Thu, Jun 25, 2009, Kristj?n Valur J?nsson wrote:
From: alex23
Kristj?n Valur J?nsson <krist...@ccpgames.com> wrote:
The idea is to speed up the swapping of list elemenst so that a.swap(b) is equivalent to a[:], b[:] = b[:], a[:] but without all the overhead of creating slices, assigning them and so forth.
I think I'd use a mapping to hold the lists so I could do a straight re-assignment:
d['a'], d['b'] = d['b'], d['a']
This is roughly equivalent to a, b = b, a right?
Yes and no. The point is that anyone holding a reference to ``d`` will see the changes to d['a'] and d['b'] (this would also apply to using an instance to proxy the changes through attributes). It's the usual trick for handling wholesale updates to mutable objects. Therefore it doesn't have the problem of the plain tuple-swap of losing the references. For any use-case that really is swapping as opposed to the fast initialization that you want, alex23's idea is a good kludge, which limits the scope of usefulness for what you want. -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ "as long as we like the same operating system, things are cool." --piranha
Kristján Valur Jónsson wrote:
-----Original Message-----
The idea is for example to speed up the initialization of lists (and subclasses of lists) from other lists when the source list's contents is to be discarded.
Which is why 'swap' is confusing. You want to swap list-body pointers, which is meaningless in Python itself. Nick's 'fromlist' is better. It describes the effect without reference to the backstage magic.
-----Original Message----- From: python-ideas-bounces+kristjan=ccpgames.com@python.org [mailto:python-ideas-bounces+kristjan=ccpgames.com@python.org] On Behalf Of Terry Reedy Sent: 25. júní 2009 17:58 To: python-ideas@python.org Subject: Re: [Python-ideas] add a list.swap() method
Kristján Valur Jónsson wrote:
-----Original Message-----
The idea is for example to speed up the initialization of lists (and subclasses of lists) from other lists when the source list's contents is to be discarded.
Which is why 'swap' is confusing. You want to swap list-body pointers, which is meaningless in Python itself. Nick's 'fromlist' is better. It describes the effect without reference to the backstage magic.
It is only one application, the other is a fast implementation of a[:], b[:] = b[:], a[:] There is precedence, I think. list.reverse() is semantically identical to list[:] = reversed(list) but it doesn't invoke any magic methods, and it relies on "backstage magic" to do its work in the most efficient way. Same goes for sort(), really. The difference with swap(), is that it will also magically affect an operand list. But I think that both reverse() and sort() can be dangerous to list subclasses with class invariants. I think there are other cases. list.pop() doesn't appear to invoke any magic methods either, and so must be a killer for list invariants. K
-1 I've done some pretty shady things with Python over the years, but swapping the contents of two lists (or list subclasses) isn't one. That this can be implemented in a properly formatted 2-line function (modulo your function naming style): def SwapLists(a, b): a[:], b[:] = b[:], a[:] ... necessarily requires me to invoke "not all x line functions should be built-in (or methods)". As for using list.sort(), list.reverse(), or even list.pop() as examples of why list.swap() should exist, that doesn't fly. Sorting, reversing, and popping are *very* common operations in Python code. Swapping list contents? ...not so common. If you're really swapping contents that often, you may want to consider a different design with a better algorithm (swap 2 pointers rather than n+m), rather than speeding up the slow algorithm. - Josiah 2009/6/25 Kristján Valur Jónsson <kristjan@ccpgames.com>:
-----Original Message----- From: python-ideas-bounces+kristjan=ccpgames.com@python.org [mailto:python-ideas-bounces+kristjan=ccpgames.com@python.org] On Behalf Of Terry Reedy Sent: 25. júní 2009 17:58 To: python-ideas@python.org Subject: Re: [Python-ideas] add a list.swap() method
Kristján Valur Jónsson wrote:
-----Original Message-----
The idea is for example to speed up the initialization of lists (and subclasses of lists) from other lists when the source list's contents is to be discarded.
Which is why 'swap' is confusing. You want to swap list-body pointers, which is meaningless in Python itself. Nick's 'fromlist' is better. It describes the effect without reference to the backstage magic.
It is only one application, the other is a fast implementation of a[:], b[:] = b[:], a[:]
There is precedence, I think. list.reverse() is semantically identical to list[:] = reversed(list)
but it doesn't invoke any magic methods, and it relies on "backstage magic" to do its work in the most efficient way. Same goes for sort(), really.
The difference with swap(), is that it will also magically affect an operand list. But I think that both reverse() and sort() can be dangerous to list subclasses with class invariants.
I think there are other cases. list.pop() doesn't appear to invoke any magic methods either, and so must be a killer for list invariants.
K
_______________________________________________ Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas
[Josiah Carlson]
If you're really swapping contents that often, you may want to consider a different design with a better algorithm (swap 2 pointers rather than n+m), rather than speeding up the slow algorithm.
That is what the OP was requesting. He needs a pointer swap for speed. His problem is that his input is a regular list and the target is a custom list subclass. Python currently provides no way for the subclass to steal the pointer from a regular list. My recommended approach for him is to write a simple, reusable list subclass in C, one that knows how to steal a pointer from a regular list. Then he could subclass from that C extension: class MyFastListBasedThingie(myextensions.SwappableList): def __init__(self, some_regular_list): if type(some_regular_list) == list: # an exact match self.swap_list_pointers(some_regular_list) # the fast way else: self.extend(some_regular_list) # the slow way del some_regular_list[:] The advantage of this design is that it puts responsibility for swapping in the hands of the class that knows all of the relevant information (MyFastListBasedThingie knows its own invariants and it knows that the regular list input is an exact list with no special invariants). See http://en.wikipedia.org/wiki/GRASP_(Object_Oriented_Design) for the InformationExpert pattern. The other advantage is that it leaves the API for general purpose lists completely unmolested. That API needs to be as friendly as possible. [Kristján Valur Jónsson]
There is precedence, I think. list.reverse() is semantically identical to list[:] = reversed(list)
It's not identical. There is no list copy or mutation. Instead it returns either a custom list iterator provided by the listobject itself (which knows its own invariants) or it returns an iterator that calls the list's own __getitem__ method. There is no danger to the target of reversed().
Same goes for sort(), really.
The sort() method is closer in spirit. It does mutate the list in-place, so any list subclasses must either disable sort() by overriding it or it needs to limit its invariants to ones that aren't affected by the mutation. This is unfortunate because it places a burden on every class that inherits from list. So even though you've found a precedent, I don't think it is something that we want to have more of. Also, the situation for swap() is made more difficult because there are two parties (the swapper and the swappee). With sort(), a subclass is giving permission for mutation by not overriding the sort method. With swap(), the swappee has no say in the matter. There is no way for a swappee's class to declare that its invariants do not support swapping. The situation is even more acute when the swappee is a C extension inheriting from list -- violating its invariants may crash the interpreter. Raymond
Since this is Python-ideas, I'm going to persist a bit with my argumentation, even though I concede defeat :) I'm going for a generic SwapLists(a,b) function in C, since my list subclass isn't written in C.
-----Original Message----- From: Raymond Hettinger [mailto:python@rcn.com] Sent: 25. júní 2009 21:58
The sort() method is closer in spirit. It does mutate the list in- place, so any list subclasses must either disable sort() by overriding it or it needs to limit its invariants to ones that aren't affected by the mutation. This is unfortunate because it places a burden on every class that inherits from list. So even though you've found a precedent, I don't think it is something that we want to have more of.
Since I started uncovering list bad-boys, I think perhaps list.pop() might be the evilest of the bunch in this context, since it does change the list length and so might crash unwary C extensions with some invariants.
Also, the situation for swap() is made more difficult because there are two parties (the swapper and the swappee). With sort(), a subclass is giving permission for mutation by not overriding the sort method. With swap(), the swappee has no say in the matter. There is no way for a swappee's class to declare that its invariants do not support swapping. The situation is even more acute when the swappee is a C extension inheriting from list -- violating its invariants may crash the interpreter.
I do have a suggested solution for that: Require that the swapee is not a list subclass, by using PyList_CheckExact(). That does make it safer at the cost of narrowing the application domain. Anyway, I'm going to mark the issue as "rejected" and get on with life. Thank you Raymond for your insightful comments. K
On Fri, 26 Jun 2009 08:37:36 am Kristján Valur Jónsson wrote:
Since I started uncovering list bad-boys, I think perhaps list.pop() might be the evilest of the bunch in this context, since it does change the list length and so might crash unwary C extensions with some invariants.
True, the caller can shoot himself in the foot by bypassing your class and calling list.pop(instance) instead of instance.pop(). That's a bad thing, but probably unavoidable given Python's OO design. However, it's also a fairly unusual thing to do: like calling "private" double-underscore methods, anyone calling superclass.method(instance) instead of instance.method() will know that they're doing something unsupported and potentially dangerous. Since we agree it's a bad thing, why do you want to copy it and create another method which is even more dangerous, by design? As I see it, Python already has a perfectly good solution for swapping the contents of arbitrary mutable sequences, one which doesn't break any invariants: a[:], b[:] = b[:], a[:] What you really want is a fast way of initialising a list, given another list. That was your use-case, after all. Swapping the contents is just one particular implementation of that fast-initialise. I believe that the correct place for this is inside a list constructor, not as an external method which magically transports the internal data from one list into another. I suggested adding extra functionality to list.__new__, and Nick suggested a new constructor list.from_list(). Nick suggested making from_list destructive, but I don't see the advantage of doing so. I think this is a probably a more productive direction to take than a swap() method. -- Steven D'Aprano
[Steven D'Aprano]
I suggested adding extra functionality to list.__new__, and Nick suggested a new constructor list.from_list(). Nick suggested making from_list destructive, but I don't see the advantage of doing so. I
FWIW, the constructor for list objects already has a fast path for copying an input list or tuple. So the only new thing you can add is a destructive, steal the pointer approach that breaks encapsulation and opens a can of worms. Raymond
Steven D'Aprano wrote:
On Fri, 26 Jun 2009 08:37:36 am Kristján Valur Jónsson wrote:
Since I started uncovering list bad-boys, I think perhaps list.pop() might be the evilest of the bunch in this context, since it does change the list length and so might crash unwary C extensions with some invariants.
True, the caller can shoot himself in the foot by bypassing your class and calling list.pop(instance) instead of instance.pop(). That's a bad thing, but probably unavoidable given Python's OO design.
I think Kristján's point was that list.pop() and friends don't perform a PyList_CheckExact on themselves before following their fast path implementation so they can bypass custom get/set/del methods in subclasses. So subclasses that override __delitem__ to do some additional bookkeeping without overriding pop() could get a nasty surprise when someone calls pop() on them. This would be similar to the oddities that can happen when you override dict.__setitem__ without overriding dict.update(). Then again, maybe we just need to document this behaviour as a trap of subclassing the mutable builtins: make sure to override *all* the mutating methods in subclasses, not just the special methods. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
-----Original Message----- From: python-ideas-bounces+kristjan=ccpgames.com@python.org [mailto:python-ideas-bounces+kristjan=ccpgames.com@python.org] On Behalf Of Nick Coghlan Sent: 26. júní 2009 09:06 To: Steven D'Aprano Cc: python-ideas@python.org Subject: Re: [Python-ideas] add a list.swap() method
I think Kristján's point was that list.pop() and friends don't perform a PyList_CheckExact on themselves before following their fast path implementation so they can bypass custom get/set/del methods in subclasses.
Right. I've already agreed that my suggestion idea is too radical. I'm merely observing that our discussion has pointed out that other list member functions can be dangerous for list subclasses since they too perform "magic shortcuts" to achieve their goal. So, those that write list subclasses that maintain internal state based on their members, must be careful to override all of these magic list functions. They are list.append, list.pop, list.reverse, list.sort, list.insert, that I can see with a casual perusal of the source. In particular, append(), pop() and insert() modify the list length, which might confuse some subclasses. Cheers, K
[Kristján Valur Jónsson]
In particular, append(), pop() and insert() modify the list length, which might confuse some subclasses.
These are all part of the most basic functions of list. A subclasser should know that. They are different from swap() which is an optimization hack that breaks encapsulation and is not a basic function of lists. The docs are fine as-is. Adding esoterica to the docs tends to make the docs harder to digest and less useful to people trying to learn the language. Raymond
Raymond Hettinger wrote:
[Kristján Valur Jónsson]
In particular, append(), pop() and insert() modify the list length, which might confuse some subclasses.
These are all part of the most basic functions of list.
I have to chuckle a bit at that statement. When I posted the proposal to add .pop() (before there were formal 'PEP's), a couple of people accused my of trying to ruin Python. So agree, even if they did not, and I am glad you think so too ;-)
A subclasser should know that. They are different from swap() which is an optimization hack that breaks encapsulation and is not a basic function of lists. The docs are fine as-is.
Adding esoterica to the docs tends to make the docs harder to digest and less useful to people trying to learn the language.
As someone who hopes to be introducing Python to new people, I appreciate that you keep digestibility in mind, even if I occasionally forget in the quest for technical accuracy (which is also important). Terry Jan Reedy
Raymond Hettinger wrote:
[Kristján Valur Jónsson]
In particular, append(), pop() and insert() modify the list length, which might confuse some subclasses.
These are all part of the most basic functions of list. A subclasser should know that. They are different from swap() which is an optimization hack that breaks encapsulation and is not a basic function of lists. The docs are fine as-is.
Adding esoterica to the docs tends to make the docs harder to digest and less useful to people trying to learn the language.
I was thinking more in terms of the C API docs rather than the normal list docs. While Python subclasses can run into bugs due to this they're unlikely to crash the interpreter by doing so. C subclasses on the other hand may have bigger problems. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
Nick Coghlan wrote:
Then again, maybe we just need to document this behaviour as a trap of subclassing the mutable builtins: make sure to override *all* the mutating methods in subclasses, not just the special methods.
I would never have imagined things were otherwise. This is a general principle that applies to all classes, whether builtin or not: unless explicitly documented, you can't make any assumptions about how methods call each other internally. -- Greg
On 25 Jun 2009, at 17:41, Steven D'Aprano wrote:
What you really want is a fast way of initialising a list, given another list. That was your use-case, after all. Swapping the contents is just one particular implementation of that fast-initialise.
FWIW, this use case really sounds a lot like rvalue references, a new feature being added to the new C++0x standard. The basic idea is that, for efficiency reasons, you want to be able to "sink" the contents an object and leave the object in a destructable (but not necessarily valid) state. This would be a tough (and probably inappropriate?) feature in Python, but I mention it because the concept might provide further context/ justifications/trade-offs/ideas for what is being discussed here. Jared
Jared Grubb wrote:
On 25 Jun 2009, at 17:41, Steven D'Aprano wrote:
What you really want is a fast way of initialising a list, given another list. That was your use-case, after all. Swapping the contents is just one particular implementation of that fast-initialise.
FWIW, this use case really sounds a lot like rvalue references, a new feature being added to the new C++0x standard. The basic idea is that, for efficiency reasons, you want to be able to "sink" the contents an object and leave the object in a destructable (but not necessarily valid) state.
This would be a tough (and probably inappropriate?) feature in Python, but I mention it because the concept might provide further context/justifications/trade-offs/ideas for what is being discussed here.
The OP might consider code that does the swap if the refcount is 1, otherwise does something akin to recur(arg.copy()) (which _will_ have a refcount of 0). --Scott David Daniels Scott.Daniels@Acm.Org
Kristján Valur Jónsson <krist...@ccpgames.com> wrote:
The idea is to speed up the swapping of list elemenst so that a.swap(b) is equivalent to a[:], b[:] = b[:], a[:] but without all the overhead of creating slices, assigning them and so forth.
I think I'd rather use a mapping to hold the lists so I could do a straight re-assignment: d['a'], d['b'] = d['b'], d['a'] Of course, all references to the lists would have to be brokered through the mapping object, but I tend to find it easier to do so anyway.
participants (12)
-
Aahz
-
alex23
-
Greg Ewing
-
Jared Grubb
-
Josiah Carlson
-
Kristján Valur Jónsson
-
Nick Coghlan
-
Raymond Hettinger
-
Ron Adam
-
Scott David Daniels
-
Steven D'Aprano
-
Terry Reedy