how to duplicate array entries

Alf P. Steinbach alfps at start.no
Mon Jan 11 04:03:04 EST 2010


* Steven D'Aprano:
> On Mon, 11 Jan 2010 08:56:36 +0100, Alf P. Steinbach wrote:
> 
>> * Paul Rudin:
>>> Sebastian <sebastian.langer at gmx.de> writes:
>>>
>>>> I have an array  x=[1,2,3]
>>> In python such an object is called a "list".
>>>
>>> (In cpython it's implemented as an automatically resizable array.)
>> I don't think the OP's terminology needs correction.
>>
>> A Python "list" is an array functionality-wise.
>>
>> If one isn't observant of that fact then one ends up with O(n^2) time
>> for the simplest things.
> 
> Well that's certainly not true. Some operations may be O(N**2), but 
> others are not: list.append() is amortized O(N) and for individual 
> appends, may be can be as fast as O(1).

The second sentence may or may not be true. I don't know of any fundamental 
'list' operations that have quadratic time. Is there?

The first sentence is just baffling  --  what on Earth is the "that" that you 
think is not true?

OK, I can guess (correct me if I'm guessing wrong, please): you think I'm 
talking about elementary operations. I'm not. I'm talking about algorithmic 
complexity for loops doing e.g. insertions.


>> Using the term "array" accentuates and clarifies this most important
>> aspect.
> 
> But Python lists are designed to behave as lists.

No, I'm sorry, they're not.

A Python 'list' has de facto constant time indexing, or "random access".

A linked list  --  what the informal "list" means in programming  --  does not 
have constant time indexing.

A linked list has constant time insertion.

A Python 'list' has de facto linear time insertion (except when used as cursor 
gap array, of course, or e.g. implementing a linked list on top, such things).

So in short, a Python 'list' has all the properties of arrays, and none of the 
properties of linked lists.


> Just because CPython 
> implements them using arrays doesn't make them arrays. Other Python 
> implementations might use other implementations... 

No, I'm sorry, not without screwing up existing Python programs. Indexing is 
assumed as constant time in every CPython program. That is, in your own words, 
but here correct, that's "certainly not true". ;-)

No (sensible) programming language allows a language implementation to change 
the running time of common loops from linear to quadratic.

It would be decidedly un-pythonic. ;-)


> If the creator of CLPython is out there, perhaps might like to comment on 
> whether he uses Lisp linked-lists for the Python list type?

If he does then you're talking about a different language than the one that 
CPython implements: constant time indexing is very different from linear time. 
It doesn't matter if some bananas are called oranges. They don't turn into 
oranges no matter what they're called.


>> Using the misleading term "list", even if that's the type's name in
>> Python, hides this most important aspect, and so is not, IMHO, a Good
>> Idea except where it really matters that it's a 'list' array as opposed
>> to, say, a 'tuple' array.
> 
> Or an "array" array.

For example, yes. These different kinds of arrays have different restrictions: 
can't be used as dictionary key, can't be modified, has fixed item type. And 
when talking about such characteristics the type name can be relevant.


>>>> from array import array
>>>> array
> <type 'array.array'>
> 
> 
> 
>>>> Is there an operator which I can use to get the result
>>>> [1,1,1,2,2,2,3,3,3] ?
>>> There's no operator that will give you that directly - but there are
>>> plenty of one-liners that will yield that list. e.g:
>>>
>>>>>> list(itertools.chain(*([x]*3 for x in [1,2,3])))
>>> [1, 1, 1, 2, 2, 2, 3, 3, 3]
>> And I think it's worth noting that, for the general case, this notation
>> is also hiding a gross inefficiency, first constructing sub-arrays and
>> then copying them and joining them.
> 
> I wouldn't call that a gross inefficiency -- that's a gross exaggeration 
> unless count is very large, and even then, possibly not that large. Only 
> one such sub-array (sub-list) exists at any time. (Unless I am grossly 
> misinformed.)

I'm sorry but to the best of my knowledge you're misinformed.

Unless there's some pretty advanced lazy evaluation involved the * operator has 
to collect the subarrays into an array formal argument for the 'chain' routine.

And at that point they all exist at the same time.


 >>> def knurre( *poff ):
...    print( type( poff ) )
...    print( poff )
...
 >>> a = [1, 2, 3]
 >>> knurre( *(3*[x] for x in a) )
<class 'tuple'>
([1, 1, 1], [2, 2, 2], [3, 3, 3])
 >>> _



>> It doesn't even buy clarity.
> 
> Not if you're unused to the functional, iterator-based idioms, no.
> 
> But if you are, it does.

He he  --  see above, with 99.x certainty you *misunderstood* the code.

That's *not* clear code.

That's, hereby (almost) proven :-), code that makes even experienced programmers 
misunderstand what's going on!


>> And if one absolutely feels like trading some efficiency and clarity for
>> some more functional-programming expression like thing (I don't
>> understand why people desire that!), 
> 
> I don't understand why you assume that functional forms are necessarily 
> less efficient and clear than non-functional. Which is easier to read?
> 
>>>> print sum([1,2,3])
> 6
> 
> versus
> 
>>>> total = 0
>>>> for i in [1, 2, 3]:
> ...     total += i
> ... 
>>>> print total
> 6

No, here I very much agree with you: the first is bestest. :-)

I like abstraction.

But not for its own sake: there has to be some increase in clarity, some details 
that the abstraction removes from consideration, instead of introducing 
additional details for consideration!


> [...]
>> Re the thing I don't understand: it's the same in C++, people using
>> hours on figuring out how to do something very simple in an ungrokkable
>> indirect and "compiled" way using template metaprogramming stuff, when
>> they could just write a simple 'for' loop and be done with in, say, 3
>> seconds, and much clearer too!
> 
> Amen to that brother!
> 
> It's the obsession with one-liners and the desire for a single built-in 
> command to do every imaginable task.

He he... :-)


Cheers,

- Alf



More information about the Python-list mailing list