On 01.06.2016 08:38, Matthew Tanous wrote:
Maybe I'm not following, but I don't see there being anything but a potential overhead penalty from copying immutable objects. If the object is immutable, copying a reference to it and copying the object itself seem transparently "identical" in terms of future use.
I acknowledge that copying the objects is a potential issue, but I think this would be solved by making the sequence repetition operator functionally equivalent to the list comprehension, such that [x()] * 5 is the same, semantically, as [x() for i in range(5)]. Alternatively, these objects could be copied in the same manner as the deepcopy functionality, although this solution may not be the best way to do it.
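As a rough illustration of the difference between the two spellings under current semantics, here is a minimal sketch. `make_row` is a hypothetical stand-in for x() and is not part of the original proposal.

```python
def make_row():
    # Hypothetical stand-in for x(): returns a fresh mutable list on every call.
    return [True] * 5

# make_row() is called once; the outer list holds 5 references to that one list.
repeated = [make_row()] * 5

# make_row() is called 5 times; the outer list holds 5 distinct lists.
comprehended = [make_row() for i in range(5)]

# Every entry of `repeated` is the very same object.
all_same = all(row is repeated[0] for row in repeated)

# No two entries of `comprehended` are the same object.
shares_any = any(a is b
                 for i, a in enumerate(comprehended)
                 for b in comprehended[i + 1:])
```

The two results compare equal with ==, so the difference only shows up through identity checks or after one of the inner lists is mutated.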
Ostensibly, I don't see why this wouldn't apply to all collection objects that use the sequence repetition operator (lists, tuples, etc.) to create a sequence.
I agree with your description of the current behavior as "to repeat the currently existing objects". But it seems to me that except for some extremely special cases, this limits it to immutable objects, where (somewhat ironically) there is no functional difference between repeating the objects and copying them other than slight differences in memory usage.
What you are describing is a duplication mechanism, not a repeat mechanism, so you are essentially reassigning the meaning of seq * number. I don't think that's in line with the way we handle backwards compatibility in Python.
Please note that there are indeed valid use cases for repeating even mutable types, namely when you don't intend to mutate the contents of the objects, but are only interested in producing a readable data structure with repeated entries, e.g. for iteration or use as multi-dimensional constant in calculations.
The duplication mechanism you have in mind can be implemented using a list comprehension, for example:
import copy
arr = [copy.deepcopy([True] * 5) for i in range(5)]
It's probably better to add a copy.duplicate() API of sorts than to try to change the * operator on built-in sequences.
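A minimal sketch of what such a helper might look like follows. The name `duplicate` and its signature are assumptions made for illustration; no such function exists in the copy module today.

```python
import copy

def duplicate(obj, count):
    """Hypothetical helper: return a list of `count` independent deep copies of obj."""
    return [copy.deepcopy(obj) for _ in range(count)]

# Build a 5x5 grid whose rows are independent lists.
grid = duplicate([True] * 5, 5)

# Mutating one row leaves the other rows untouched.
grid[2][3] = False
false_count = sum(row.count(False) for row in grid)
```

Using copy.deepcopy here is one possible choice; a shallow copy.copy would be cheaper but would still share any objects nested more deeply inside obj.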
On 5/31/16 1:46 AM, M.-A. Lemburg wrote:
On 31.05.2016 07:27, Matthew Tanous wrote:
Currently, the use of the (*) operator on a list is to duplicate a list by creating multiple references to the same object. While this works intuitively for immutable objects (like [True] * 5) as these immutable references are replaced when the list is assigned to, it makes the operator nigh unusable for mutable objects.
The most obvious case is when the operator is duplicated in a sequence like this:
arr = [[True] * 5] * 5
This does not create a matrix-like arrangement of the immutable truth variable, but instead creates a list of 5 references to the same list, such that a following assignment like arr[2][3] = False will not change just that one index, but the 4th element of every list in the outer list.
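The aliasing described here can be reproduced with a short snippet. The indices below are chosen only for illustration.

```python
# A flat list of immutables behaves as expected: assignment rebinds one slot.
flat = [True] * 5
flat[3] = False

# The nested form holds 5 references to the same inner list.
arr = [[True] * 5] * 5
arr[2][3] = False  # assign through one "row"

# The change is visible through every row, because all rows are one object.
column = [row[3] for row in arr]
rows_are_same_object = all(row is arr[0] for row in arr)
```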
It is my opinion that the sequence repetition operator should be modified to make copies of the objects it is repeating, rather than copying references alone. I believe this would both be more intuitive from a semantic point of view and more useful for the developer.
This would change the operator in a way that is mostly unseen in current usage ([5] * 3 would still result in [5, 5, 5]) while treating mutable nesting in a way that is more understandable from the apparent intent of the syntax construction.
- How would you determine whether a list element is mutable or not?
- How would you copy the elements?
- For which object types would you want to change the behavior?
I agree that the repeat operator can sometimes create confusing and unwanted object structures if not used correctly, but its main purpose is to repeat the already existing objects, not to copy them, so the current behavior is still conceptually correct.